landscape · patterns

Agentic engineering patterns

living document · last updated 2026-06-05 · 7d ago

2026-06-05

last updated

age

sections

tracked rows

Adopt — proven, use now

5 / 5

Pattern/Tool	Evidence
MCP as standard protocol	Every major agent supports it. Universal adapter. Codex MCP Apps P1+P2. Pinterest: 66K invocations/month in production. 8,600+ servers. SurePath AI shipping MCP-specific governance.
Spec-driven development workflow	GitHub Spec Kit (84.7K stars), AWS Kiro, 30+ frameworks mapped. Delta Airlines: 1,948% growth in AI tool adoption using specs.
Plan-before-act	All four major CLI agents have it. Table stakes. Differentiation moved to multi-agent orchestration.
Sandbox-first execution	Universal across CLI agents. Gemini adding dynamic expansion (Windows/Linux). Codex adding deny-list mode alongside allow-list.
Git worktree-based parallel agent execution	Cursor (8 parallel), Claude Code (16+), Windsurf (5), Grok Build (8), OMX wrapping Codex. Gemini v0.37.0 preview adds worktree support.

Agentic workflow tooling (mise-versions ecosystem)

6 / 6

Tool	What it does	Why agents benefit
ripgrep (rg)	Fast recursive search	Claude Code uses it internally. Faster search = faster agent turns.
fd	Fast file finder	Replaces `find`. Used by agents for file discovery.
jq	JSON processor	Agents manipulate JSON constantly — API responses, config, MCP output.
shellcheck	Shell script linter	Catches agent-generated bash errors before they execute.
hk	Git hooks manager	Structures agent commit workflows. Lighter than lefthook/husky.
fnox	Encrypted secret manager	Safe credential handling in agent environments.

Quality-of-life — improves the human-agent collaboration:

5 / 5

Tool	What it does	Why it helps
bat	Better `cat` with syntax highlighting	Agent-generated file output readable in terminal.
delta	Better `git diff`	Clearer diffs when reviewing agent code changes.
fzf	Fuzzy finder	Interactive selection in agentic workflows.
eza	Better `ls`	Tree views of agent-modified directories.
zoxide	Smart `cd`	Fast navigation between agent worktrees.

Trial — working in production, still evolving

17 / 17

Pattern/Tool	Evidence
Platform-level agent orchestration	Codex v0.119.0 ships the platform: MCP Apps, WebRTC realtime, 8+ extracted crates, remote exec-server. Gemini has GCP backend. Scion cross-vendor. Codex v0.120.0 adds background agent streaming.
Enterprise deployment as competitive axis	Every agent shipped enterprise features Apr 8-11: Claude Code (Vertex wizard, Perforce, CA trust, team onboarding), Codex (residency, approval workflows), OpenCode (OAuth MCP, fast mode multi-model). Maps to Nate's "five durable layers" thesis.
Agent governance tooling	Microsoft Governance Toolkit: OWASP MCP Top 10, SOC 2 mapping, tool injection scanning. SurePath AI: MCP-specific runtime policy engine. Pinterest at 66K/month proves the governance need.
Agent portability / BYOK	Copilot CLI: BYOK + Ollama/vLLM + air-gapped. Scion: vendor-agnostic orchestration. Dependabot: multi-vendor agent assignment. The portability sprint.
ACP — editor↔agent standard protocol (NEW Jun 5)	Zed's Agent Client Protocol (Apache, open; JetBrains × Zed collab Oct 2025; live ACP Registry). The editor-side mirror of MCP: hosts (Zed, JetBrains, Kiro, Devin Desktop) ↔ guests (Claude Agent, Codex, Gemini CLI, Pi, OpenCode, Vibe). Live-protocol signature Jun 4–5: Vibe bumped `agent-client-protocol` to 0.10.1 (+`session/delete`); OpenCode shipped ACP replay + cancel fixes. Commoditizes the agent into an interchangeable guest; the host owns the user relationship. Trial→Adopt candidate — already multi-vendor. NOT Cognition's, despite the Jun-2 Devin Desktop launch framing.
Portable/replayable session as object (NEW Jun 5)	The session becoming a movable, reconstructable, auditable artifact. OpenCode v1.16.0: move sessions between workspaces/dirs, clone keeping dirty files, full replay on load. Claude Code v2.1.163: bg sessions update in place keeping running tasks across version upgrades. Vibe v2.14.0: session deletion as first-class ACP method. Replay = auditability = the precondition for trusting unattended fleets. The layer beneath the fleet cockpit.
Composable agent SDKs	Copilot SDK v0.2.1: cross-language commands + UI elicitation (JS/TS, Python, Go, .NET). BYOK, W3C tracing.
GitHub Spec Kit	Open-source spec-driven scaffolding, 84.7K stars, supports 14+ agent platforms.
AWS Kiro	Spec-driven agentic IDE on Bedrock (Claude Sonnet 4.0/3.7). GovCloud available.
Agentic harness engineering	Anthropic: "2026 is the year of harnesses." Same model scores 17 problems apart in different agents. Claude Code's 512K+ lines prove it.
Heterogeneous model routing	Frontier for reasoning, mid-tier for standard tasks, small models for high-frequency execution. Gemini adding dynamic routing for 3.1 Pro/Flash Lite.
Hook-based automation	Claude Code's PreToolUse/PostToolUse/Stop with conditional filtering, defer/resume. Channels and Conway may supplement/replace.
Human-at-checkpoints	Agents build full systems autonomously, pausing only for strategic review. Anthropic's three-agent harness: planner/generator/evaluator.
Path-based multi-agent addressing	Codex spawn v2 dropped agent IDs for path-based addressing (`/root/agent_a`). Agent tree IS the address space. Fire-and-forget messaging + feedback cascade.
Session quality as primary battleground (promoted Apr 15, UPDATED Apr 16)	Re-entry stack built (Apr 14-15). Surface expansion follows (Apr 16): Claude Code fullscreen TUI, Codex marketplace + memory lifecycle, Zed focus-follows-mouse + dev containers. The shared plumbing is built; now each tool uses it to differentiate. The session is the unit of quality — and the session is becoming a richer application, not just a prompt.
CLI → terminal application shift (NEW Apr 16)	Claude Code `/tui fullscreen` (flicker-free rendering, scroll control, modal UI). Codex `Ctrl+R` history search. The CLI agent is becoming a terminal application you inhabit. Not an IDE — a richer terminal.
Super-app convergence (UPDATED Apr 19)	Two paths to the same destination in the same 48hr window (Apr 16-17). Anthropic vertical: Opus 4.7 → Claude Code → Claude Design → Managed Agents → Claude for Word → Conway → API. Six surfaces, one model provider. Build all the apps. Codex lateral: "for almost everything" — computer use (background agents on macOS), 90+ plugins, in-app browser, memory. Use all the apps through one interface. Neither vendor acknowledged the other. Counter-thesis: Nate's BYOC says both approaches are context traps.

Assess — investigate, understand implications

25 / 25

Pattern/Tool	Evidence
Frontier models as systemic risk	Anthropic's Mythos (93.9% SWE-bench, autonomous zero-day discovery) triggered Treasury/Fed emergency meeting with bank CEOs (Apr 8). Model capability now treated as financial-sector systemic risk. New deployment pattern: directed use-case access, not open API. Security hardening becomes regulatory, not optional.
Multi-agent orchestration in the model	Meta Muse Spark: multi-agent orchestration built into the model itself, not the harness. "Contemplating mode" runs a squad of agents in parallel. Agents-in-the-model vs agents-around-the-model.
Open-weight contraction	Meta went proprietary with Muse Spark after Llama 1-4 open-weight. Open-weight now depends on Google (Gemma), Alibaba (Qwen), Zhipu (GLM), and community. Llama's future unclear.
Self-improving review agents	Cursor Bugbot learns from PR feedback, applies learned rules to future reviews. MCP tools for context. 78% resolution rate. Cursor v3.1 (Apr 13): tiled layout for parallel agents, upgraded voice input. Multi-agent parallelism as first-class UX.
Agent-addressable slash commands (NEW Apr 15)	Claude Code v2.1.108 (Apr 14): Skill tool can invoke built-in slash commands (/init, /review, /security-review). Blurs human-UX / agent-UX line. Policy question emerging: should org settings restrict which slash commands the agent can invoke on itself? Watch for: other agents' slash/tool equivalents opening to the model.
Agent execution runtimes	Anthropic Managed Agents (April 8-9): YAML definitions, sandboxed execution, persistent sessions, $0.08/session-hour. Beta with Notion, Asana, Rakuten, Sentry. Conway CNW ZIP may be the extension format. Codex: remote exec-server. Gemini: GCP backend. Model providers becoming execution platforms.
Persistent agent platforms	Conway CNW ZIP details via Nate's analysis: standalone workspace, webhook activation, browser control, proprietary extension format. Channels (shipped). Codex: remote control + WebRTC. Gemini: GCP backend + Interactions API. Google Scion as external orchestration.
MCP Apps ecosystem	Codex MCP Apps P1+P2 (meta to tool call results). 8,600+ servers. Pinterest 66K/month in production. SurePath AI governance. MCP Server Cards (.well-known) proposed.
Agents as supply chain participants	Dependabot-to-agent assignment for security remediation. Copilot Critic agent (uses Claude to review plans). OXC copilot-swe-agent contributing fixes. Agents managing security, not just generating code.
Cross-vendor agent orchestration	Google Scion: open-source, runs Claude+Codex+Gemini in parallel with container isolation. Copilot Studio: multi-model broker (5 models). OMX: community Codex orchestration.
Multi-model broker platforms	Copilot Studio GA with Claude Opus 4.6, Sonnet 4.5, Grok 4.1, GPT-5.3/5.4. Microsoft positioning as model-agnostic orchestrator.
Vendor cost-optimization as trust risk (NEW Apr 16)	Anthropic reduced default effort to medium to save tokens. Users noticed quality decline. Fortune, Axios, VentureBeat, The Register covering it (Apr 13-16). Boris Cherny acknowledged. First time a vendor's cost-optimization decision became a public trust issue. Connects to subsidy question and Nate's trust layer.
Vendor surface control	Anthropic claiming all interaction surfaces. OpenClaw ban as enforcement. But 3 days of silence since — unclear if strategy is expanding or pausing.
Meta JiT Testing	LLM generates tests per-PR by analyzing diff. No persistent test suite. 70% reduction in human review load.
Agent-to-Agent protocols + payments	A2A v1.0 (April 9): first stable spec. Multi-protocol, enterprise multi-tenancy, 5 production SDKs (Python, JS, Java, Go, .NET). 150+ orgs, 22K+ stars. AP2: 60+ orgs. Visa ICC: neutral payment layer for 4 protocols. McKinsey: $5T agent-driven sales by 2030.
Agent supply chain attacks	OpenClaw ClawHavoc: 824+ malicious skills (growing), 135K exposed instances. CVE-2026-35669 (CVSS 8.8, Apr 10) privilege escalation. First attack targeting agent execution patterns specifically. AMOS macOS stealer via agentic workflows. Claw Code (72K stars): clean-room Claude Code clone from source map leak. Axios npm compromise (Apr 10): North Korea-linked, affected OpenAI's macOS CI/CD via floating dep tag. Antipattern: floating tags in GitHub Actions.
oh-my-codex (OMX)	2.8K stars overnight. Community multi-agent orchestration for Codex CLI.
Mamba-Transformer hybrids for agents	Nemotron 3 Nano claims 5x throughput. Linear context scaling. If verified, changes local agent architecture.
KV cache compression for local inference	Google TurboQuant: 6x KV cache compression, zero accuracy loss, no retraining. ICLR 2026. llama.cpp integration exists (`turboquant_plus`, Metal support). Changes local inference economics: existing GPUs serve 6x more context. Combined with Copilot BYOK, strengthens case for local-first agent architecture.
Devin / Cognition Labs	$10.2B valuation, $150M ARR (with Windsurf). Real capability but unclear ROI.
sauna.ai (Wordware)	Largest YC seed ($30M), Instacart/Runway customers. Nate's test: scored 1/4 on knowledge-work tasks.
Agent memory systems / context portability (UPDATED Apr 18)	Gemini Chapters, project-level memory, Nate's "Open Brain" PostgreSQL+MCP pattern. Context persistence is the bottleneck. NEW: Nate names context portability as structural problem — "memory is the moat." Proposes BYOC (Bring Your Own Context). Four loss points: tool switch, company mandate, job change, platform terms. Six months of use = qualitative output difference.
Agent infrastructure hardening (NEW Apr 18)	Epsilla survey (Apr 14): GAIA (local agent execution), Kontext CLI (JIT credential broker), SnapState (checkpoint/resume), Context Surgeon (agent self-manages context), OQP (testable assertions about agent behavior). Shift from "agents that chat" to "agents that persist, authenticate, and manage their own cognition."
Background agent swarms	Multiple small agents running continuously on tiny local models.
Two-tier plugin distribution	Codex: curated (vetted, backend-hosted) + community (non-curated). Plugin marketplace economics forming.

Watch — early signal, track for developments

16 / 16

Pattern/Tool	Evidence
Agent-native devtools (NEW Apr 30)	Vite DevTools v0.1.16 ships `devframe`: "Framework-neutral devtools foundation + agent-native MCP." First major dev tooling to ship MCP as first-class feature. Combined with antfu's Claude Opus 4.7 co-authorship pattern. Agents aren't just using tools — tools are being built for agents.
Purpose-built coding models (NEW Apr 30)	Poolside Laguna XS.2 (33B/3B MoE, Apache 2.0, 68.2% SWE-Bench) — first open-weight model architecturally designed for agentic coding. Paired with `pool` terminal agent. Mistral Medium 3.5 (128B dense, 77.6% SWE-Bench) — merged flagship for remote agents in Vibe. The model layer is no longer general-purpose models adapted for coding — dedicated coding models are arriving.
Dependency graph as agent interface (NEW Apr 30)	aube v1.5.0 `aube query` — vlt-inspired selector-based dependency graph queries with JSON output. Package managers exposing their graph for programmatic consumption, not just human-readable output. Combined with mise `aube_args`, the dev environment stack is adapting to agents as consumers.
Post-ban community migration	ZeroClaw (Rust), NullClaw, local models. OpenClaw community adopting Kimi K2.5. Credits expire April 17 (3 days).
OWASP MCP Top 10	New compliance standard from MS Governance Toolkit. Maps agentic AI risks to MCP-specific controls. May become de facto standard.
AI workspace consolidation	Sauna, Notion AI, Glean. Crowded, no winner.
NIST AI Agent Identity standards	Comment period closed April 2. IAM frameworks for autonomous agents.
EU AI Act enforcement	August 2, 2026 — first major enforcement date. High-risk AI, GPAI, foundation model requirements.
Harness economics	OpenClaw ban proved the arbitrage model is unsustainable. Credits expired April 17 — no vendor positioned. Mutual silence held 28+ days. Codex pricing restructured: $20/$100/$200 tiers, token-based. Subsidy era ending industry-wide.
Full autonomous dev without checkpoints	The "Devin promise." Evidence still mixed.
AI industry financial sustainability	Zitron: "subprime AI crisis" + "AI isn't too big to fail."
Agent-native IDEs	Is the IDE the agent, or does the agent use the IDE? Conway suggests the agent becomes the IDE.
Model-routing layers	Automatic model selection per task complexity.
Agent security monitoring	Codenotary AgentMon, Astrix Security, Black Duck Signal, Palo Alto Prisma AIRS 3.0. Security tooling wave forming.
AI agents as contributors	copilot-swe-agent contributed two OXC bug fixes (latest: node_modules config walker skip). Copilot Critic agent uses Claude to review plans. AI agents contributing to and reviewing tooling that other AI agents use.
Agent security vulnerabilities	CVE-2026-35022 (CVSS 9.8, Claude CLI/SDK command injection). Claude Code deny-rules bypass at 50+ subcommands. Security of agent tools becoming a distinct attack surface.

Key risk signal: The subsidy question

The builder community describes a genuine productivity revolution. The financial analysis shows unstable foundations:

5 / 5

Metric	Evidence
Anthropic compute vs revenue	$10B spent on compute, $5B revenue
OpenAI inference burn	$8.67B through Sept 2025 on $4.3B total revenue
Startup unit economics	$3-13 burned per $1 of subscription revenue
Data center gap	~5GW under construction vs. 12GW+ promised
Harness arbitrage	5x gap between subscription and API costs — now closed by ban

Sources

15 / 15

Source	URL	Focus
Nate's Newsletter	natesnewsletter.substack.com	AI practitioner strategy, MCP, workflow optimization
Where's Your Ed At	wheresyoured.at	AI financial sustainability critique
Anthropic 2026 Agentic Coding Trends	resources.anthropic.com	Harness patterns, industry data
Anthropic Harness Design Blog	anthropic.com/engineering	Three-agent harness architecture
Meta JiTTests	engineering.fb.com	Testing paradigm shift
GitHub Spec Kit	github.com/github/spec-kit	Spec-driven scaffolding
GitHub Copilot SDK	github.com/github/copilot-sdk	Composable agent runtime
AWS Kiro	kiro.dev	Spec-driven agentic IDE
Google ADK	google.github.io/adk-docs	Model-agnostic agent framework
Microsoft Agent Governance Toolkit	github.com/microsoft/agent-governance-toolkit	Cross-vendor agent governance
Google Scion	github.com/GoogleCloudPlatform/scion	Cross-vendor agent orchestration
SurePath AI	surepath.ai	MCP-specific runtime governance
Pinterest MCP (InfoQ)	infoq.com/news/2026/04/pinterest-mcp-ecosystem	Enterprise MCP case study
SDD Framework Map	Medium (30+ frameworks)	Landscape map
NIST AI Agent Standards	csrc.nist.gov	Regulatory direction

Full document — prose, analysis, and everything not in the tables above

Agentic engineering patterns

Living document. Rewritten as the field evolves. Last updated: 2026-06-05.

Technology radar

Adopt — proven, use now

Pattern/Tool	Evidence
MCP as standard protocol	Every major agent supports it. Universal adapter. Codex MCP Apps P1+P2. Pinterest: 66K invocations/month in production. 8,600+ servers. SurePath AI shipping MCP-specific governance.
Spec-driven development workflow	GitHub Spec Kit (84.7K stars), AWS Kiro, 30+ frameworks mapped. Delta Airlines: 1,948% growth in AI tool adoption using specs.
Plan-before-act	All four major CLI agents have it. Table stakes. Differentiation moved to multi-agent orchestration.
Sandbox-first execution	Universal across CLI agents. Gemini adding dynamic expansion (Windows/Linux). Codex adding deny-list mode alongside allow-list.
Git worktree-based parallel agent execution	Cursor (8 parallel), Claude Code (16+), Windsurf (5), Grok Build (8), OMX wrapping Codex. Gemini v0.37.0 preview adds worktree support.

Agentic workflow tooling (mise-versions ecosystem)

CLI tools that multiply agent effectiveness. Source: mise-versions.jdx.dev.

Core — agents use these directly:

Tool	What it does	Why agents benefit
ripgrep (rg)	Fast recursive search	Claude Code uses it internally. Faster search = faster agent turns.
fd	Fast file finder	Replaces `find`. Used by agents for file discovery.
jq	JSON processor	Agents manipulate JSON constantly — API responses, config, MCP output.
shellcheck	Shell script linter	Catches agent-generated bash errors before they execute.
hk	Git hooks manager	Structures agent commit workflows. Lighter than lefthook/husky.
fnox	Encrypted secret manager	Safe credential handling in agent environments.

Quality-of-life — improves the human-agent collaboration:

Tool	What it does	Why it helps
bat	Better `cat` with syntax highlighting	Agent-generated file output readable in terminal.
delta	Better `git diff`	Clearer diffs when reviewing agent code changes.
fzf	Fuzzy finder	Interactive selection in agentic workflows.
eza	Better `ls`	Tree views of agent-modified directories.
zoxide	Smart `cd`	Fast navigation between agent worktrees.

mise itself is the substrate — reproducible environments via mise.toml, task runner replacing Makefiles, per-project env vars. 400+ tools in the registry.

Trial — working in production, still evolving

Pattern/Tool	Evidence
Platform-level agent orchestration	Codex v0.119.0 ships the platform: MCP Apps, WebRTC realtime, 8+ extracted crates, remote exec-server. Gemini has GCP backend. Scion cross-vendor. Codex v0.120.0 adds background agent streaming.
Enterprise deployment as competitive axis	Every agent shipped enterprise features Apr 8-11: Claude Code (Vertex wizard, Perforce, CA trust, team onboarding), Codex (residency, approval workflows), OpenCode (OAuth MCP, fast mode multi-model). Maps to Nate’s “five durable layers” thesis.
Agent governance tooling	Microsoft Governance Toolkit: OWASP MCP Top 10, SOC 2 mapping, tool injection scanning. SurePath AI: MCP-specific runtime policy engine. Pinterest at 66K/month proves the governance need.
Agent portability / BYOK	Copilot CLI: BYOK + Ollama/vLLM + air-gapped. Scion: vendor-agnostic orchestration. Dependabot: multi-vendor agent assignment. The portability sprint.
ACP — editor↔agent standard protocol (NEW Jun 5)	Zed’s Agent Client Protocol (Apache, open; JetBrains × Zed collab Oct 2025; live ACP Registry). The editor-side mirror of MCP: hosts (Zed, JetBrains, Kiro, Devin Desktop) ↔ guests (Claude Agent, Codex, Gemini CLI, Pi, OpenCode, Vibe). Live-protocol signature Jun 4–5: Vibe bumped `agent-client-protocol` to 0.10.1 (+`session/delete`); OpenCode shipped ACP replay + cancel fixes. Commoditizes the agent into an interchangeable guest; the host owns the user relationship. Trial→Adopt candidate — already multi-vendor. NOT Cognition’s, despite the Jun-2 Devin Desktop launch framing.
Portable/replayable session as object (NEW Jun 5)	The session becoming a movable, reconstructable, auditable artifact. OpenCode v1.16.0: move sessions between workspaces/dirs, clone keeping dirty files, full replay on load. Claude Code v2.1.163: bg sessions update in place keeping running tasks across version upgrades. Vibe v2.14.0: session deletion as first-class ACP method. Replay = auditability = the precondition for trusting unattended fleets. The layer beneath the fleet cockpit.
Composable agent SDKs	Copilot SDK v0.2.1: cross-language commands + UI elicitation (JS/TS, Python, Go, .NET). BYOK, W3C tracing.
GitHub Spec Kit	Open-source spec-driven scaffolding, 84.7K stars, supports 14+ agent platforms.
AWS Kiro	Spec-driven agentic IDE on Bedrock (Claude Sonnet 4.0/3.7). GovCloud available.
Agentic harness engineering	Anthropic: “2026 is the year of harnesses.” Same model scores 17 problems apart in different agents. Claude Code’s 512K+ lines prove it.
Heterogeneous model routing	Frontier for reasoning, mid-tier for standard tasks, small models for high-frequency execution. Gemini adding dynamic routing for 3.1 Pro/Flash Lite.
Hook-based automation	Claude Code’s PreToolUse/PostToolUse/Stop with conditional filtering, defer/resume. Channels and Conway may supplement/replace.
Human-at-checkpoints	Agents build full systems autonomously, pausing only for strategic review. Anthropic’s three-agent harness: planner/generator/evaluator.
Path-based multi-agent addressing	Codex spawn v2 dropped agent IDs for path-based addressing (`/root/agent_a`). Agent tree IS the address space. Fire-and-forget messaging + feedback cascade.
Session quality as primary battleground (promoted Apr 15, UPDATED Apr 16)	Re-entry stack built (Apr 14-15). Surface expansion follows (Apr 16): Claude Code fullscreen TUI, Codex marketplace + memory lifecycle, Zed focus-follows-mouse + dev containers. The shared plumbing is built; now each tool uses it to differentiate. The session is the unit of quality — and the session is becoming a richer application, not just a prompt.
CLI → terminal application shift (NEW Apr 16)	Claude Code `/tui fullscreen` (flicker-free rendering, scroll control, modal UI). Codex `Ctrl+R` history search. The CLI agent is becoming a terminal application you inhabit. Not an IDE — a richer terminal.
Super-app convergence (UPDATED Apr 19)	Two paths to the same destination in the same 48hr window (Apr 16-17). Anthropic vertical: Opus 4.7 → Claude Code → Claude Design → Managed Agents → Claude for Word → Conway → API. Six surfaces, one model provider. Build all the apps. Codex lateral: “for almost everything” — computer use (background agents on macOS), 90+ plugins, in-app browser, memory. Use all the apps through one interface. Neither vendor acknowledged the other. Counter-thesis: Nate’s BYOC says both approaches are context traps.

Assess — investigate, understand implications

Pattern/Tool	Evidence
Frontier models as systemic risk	Anthropic’s Mythos (93.9% SWE-bench, autonomous zero-day discovery) triggered Treasury/Fed emergency meeting with bank CEOs (Apr 8). Model capability now treated as financial-sector systemic risk. New deployment pattern: directed use-case access, not open API. Security hardening becomes regulatory, not optional.
Multi-agent orchestration in the model	Meta Muse Spark: multi-agent orchestration built into the model itself, not the harness. “Contemplating mode” runs a squad of agents in parallel. Agents-in-the-model vs agents-around-the-model.
Open-weight contraction	Meta went proprietary with Muse Spark after Llama 1-4 open-weight. Open-weight now depends on Google (Gemma), Alibaba (Qwen), Zhipu (GLM), and community. Llama’s future unclear.
Self-improving review agents	Cursor Bugbot learns from PR feedback, applies learned rules to future reviews. MCP tools for context. 78% resolution rate. Cursor v3.1 (Apr 13): tiled layout for parallel agents, upgraded voice input. Multi-agent parallelism as first-class UX.
Agent-addressable slash commands (NEW Apr 15)	Claude Code v2.1.108 (Apr 14): Skill tool can invoke built-in slash commands (/init, /review, /security-review). Blurs human-UX / agent-UX line. Policy question emerging: should org settings restrict which slash commands the agent can invoke on itself? Watch for: other agents’ slash/tool equivalents opening to the model.
Agent execution runtimes	Anthropic Managed Agents (April 8-9): YAML definitions, sandboxed execution, persistent sessions, $0.08/session-hour. Beta with Notion, Asana, Rakuten, Sentry. Conway CNW ZIP may be the extension format. Codex: remote exec-server. Gemini: GCP backend. Model providers becoming execution platforms.
Persistent agent platforms	Conway CNW ZIP details via Nate’s analysis: standalone workspace, webhook activation, browser control, proprietary extension format. Channels (shipped). Codex: remote control + WebRTC. Gemini: GCP backend + Interactions API. Google Scion as external orchestration.
MCP Apps ecosystem	Codex MCP Apps P1+P2 (meta to tool call results). 8,600+ servers. Pinterest 66K/month in production. SurePath AI governance. MCP Server Cards (.well-known) proposed.
Agents as supply chain participants	Dependabot-to-agent assignment for security remediation. Copilot Critic agent (uses Claude to review plans). OXC copilot-swe-agent contributing fixes. Agents managing security, not just generating code.
Cross-vendor agent orchestration	Google Scion: open-source, runs Claude+Codex+Gemini in parallel with container isolation. Copilot Studio: multi-model broker (5 models). OMX: community Codex orchestration.
Multi-model broker platforms	Copilot Studio GA with Claude Opus 4.6, Sonnet 4.5, Grok 4.1, GPT-5.3/5.4. Microsoft positioning as model-agnostic orchestrator.
Vendor cost-optimization as trust risk (NEW Apr 16)	Anthropic reduced default effort to medium to save tokens. Users noticed quality decline. Fortune, Axios, VentureBeat, The Register covering it (Apr 13-16). Boris Cherny acknowledged. First time a vendor’s cost-optimization decision became a public trust issue. Connects to subsidy question and Nate’s trust layer.
Vendor surface control	Anthropic claiming all interaction surfaces. OpenClaw ban as enforcement. But 3 days of silence since — unclear if strategy is expanding or pausing.
Meta JiT Testing	LLM generates tests per-PR by analyzing diff. No persistent test suite. 70% reduction in human review load.
Agent-to-Agent protocols + payments	A2A v1.0 (April 9): first stable spec. Multi-protocol, enterprise multi-tenancy, 5 production SDKs (Python, JS, Java, Go, .NET). 150+ orgs, 22K+ stars. AP2: 60+ orgs. Visa ICC: neutral payment layer for 4 protocols. McKinsey: $5T agent-driven sales by 2030.
Agent supply chain attacks	OpenClaw ClawHavoc: 824+ malicious skills (growing), 135K exposed instances. CVE-2026-35669 (CVSS 8.8, Apr 10) privilege escalation. First attack targeting agent execution patterns specifically. AMOS macOS stealer via agentic workflows. Claw Code (72K stars): clean-room Claude Code clone from source map leak. Axios npm compromise (Apr 10): North Korea-linked, affected OpenAI’s macOS CI/CD via floating dep tag. Antipattern: floating tags in GitHub Actions.
oh-my-codex (OMX)	2.8K stars overnight. Community multi-agent orchestration for Codex CLI.
Mamba-Transformer hybrids for agents	Nemotron 3 Nano claims 5x throughput. Linear context scaling. If verified, changes local agent architecture.
KV cache compression for local inference	Google TurboQuant: 6x KV cache compression, zero accuracy loss, no retraining. ICLR 2026. llama.cpp integration exists (`turboquant_plus`, Metal support). Changes local inference economics: existing GPUs serve 6x more context. Combined with Copilot BYOK, strengthens case for local-first agent architecture.
Devin / Cognition Labs	$10.2B valuation, $150M ARR (with Windsurf). Real capability but unclear ROI.
sauna.ai (Wordware)	Largest YC seed ($30M), Instacart/Runway customers. Nate’s test: scored 1/4 on knowledge-work tasks.
Agent memory systems / context portability (UPDATED Apr 18)	Gemini Chapters, project-level memory, Nate’s “Open Brain” PostgreSQL+MCP pattern. Context persistence is the bottleneck. NEW: Nate names context portability as structural problem — “memory is the moat.” Proposes BYOC (Bring Your Own Context). Four loss points: tool switch, company mandate, job change, platform terms. Six months of use = qualitative output difference.
Agent infrastructure hardening (NEW Apr 18)	Epsilla survey (Apr 14): GAIA (local agent execution), Kontext CLI (JIT credential broker), SnapState (checkpoint/resume), Context Surgeon (agent self-manages context), OQP (testable assertions about agent behavior). Shift from “agents that chat” to “agents that persist, authenticate, and manage their own cognition.”
Background agent swarms	Multiple small agents running continuously on tiny local models.
Two-tier plugin distribution	Codex: curated (vetted, backend-hosted) + community (non-curated). Plugin marketplace economics forming.

Watch — early signal, track for developments

Pattern/Tool	Evidence
Agent-native devtools (NEW Apr 30)	Vite DevTools v0.1.16 ships `devframe`: “Framework-neutral devtools foundation + agent-native MCP.” First major dev tooling to ship MCP as first-class feature. Combined with antfu’s Claude Opus 4.7 co-authorship pattern. Agents aren’t just using tools — tools are being built for agents.
Purpose-built coding models (NEW Apr 30)	Poolside Laguna XS.2 (33B/3B MoE, Apache 2.0, 68.2% SWE-Bench) — first open-weight model architecturally designed for agentic coding. Paired with `pool` terminal agent. Mistral Medium 3.5 (128B dense, 77.6% SWE-Bench) — merged flagship for remote agents in Vibe. The model layer is no longer general-purpose models adapted for coding — dedicated coding models are arriving.
Dependency graph as agent interface (NEW Apr 30)	aube v1.5.0 `aube query` — vlt-inspired selector-based dependency graph queries with JSON output. Package managers exposing their graph for programmatic consumption, not just human-readable output. Combined with mise `aube_args`, the dev environment stack is adapting to agents as consumers.
Post-ban community migration	ZeroClaw (Rust), NullClaw, local models. OpenClaw community adopting Kimi K2.5. Credits expire April 17 (3 days).
OWASP MCP Top 10	New compliance standard from MS Governance Toolkit. Maps agentic AI risks to MCP-specific controls. May become de facto standard.
AI workspace consolidation	Sauna, Notion AI, Glean. Crowded, no winner.
NIST AI Agent Identity standards	Comment period closed April 2. IAM frameworks for autonomous agents.
EU AI Act enforcement	August 2, 2026 — first major enforcement date. High-risk AI, GPAI, foundation model requirements.
Harness economics	OpenClaw ban proved the arbitrage model is unsustainable. Credits expired April 17 — no vendor positioned. Mutual silence held 28+ days. Codex pricing restructured: $20/$100/$200 tiers, token-based. Subsidy era ending industry-wide.
Full autonomous dev without checkpoints	The “Devin promise.” Evidence still mixed.
AI industry financial sustainability	Zitron: “subprime AI crisis” + “AI isn’t too big to fail.”
Agent-native IDEs	Is the IDE the agent, or does the agent use the IDE? Conway suggests the agent becomes the IDE.
Model-routing layers	Automatic model selection per task complexity.
Agent security monitoring	Codenotary AgentMon, Astrix Security, Black Duck Signal, Palo Alto Prisma AIRS 3.0. Security tooling wave forming.
AI agents as contributors	copilot-swe-agent contributed two OXC bug fixes (latest: node_modules config walker skip). Copilot Critic agent uses Claude to review plans. AI agents contributing to and reviewing tooling that other AI agents use.
Agent security vulnerabilities	CVE-2026-35022 (CVSS 9.8, Claude CLI/SDK command injection). Claude Code deny-rules bypass at 50+ subcommands. Security of agent tools becoming a distinct attack surface.

Key risk signal: The subsidy question

The builder community describes a genuine productivity revolution. The financial analysis shows unstable foundations:

Metric	Evidence
Anthropic compute vs revenue	$10B spent on compute, $5B revenue
OpenAI inference burn	$8.67B through Sept 2025 on $4.3B total revenue
Startup unit economics	$3-13 burned per $1 of subscription revenue
Data center gap	~5GW under construction vs. 12GW+ promised
Harness arbitrage	5x gap between subscription and API costs — now closed by ban

The synthesis: The tools and workflows are real and productive. The pricing is subsidized and temporary. Anthropic’s OpenClaw ban was the first direct vendor margin defense. The effort-level reduction (Apr 16 backlash) is the second — quieter, applied to all users, noticed by the community. At $800B+ valuation against ~$5B revenue, the gap widens. Credits expire tomorrow. The most defensible investments are in patterns (spec-driven dev, orchestration architecture, MCP) rather than specific vendor subscriptions.

Sources: Ed Zitron, “The Subprime AI Crisis Is Here” (March 31, 2026) and “AI Isn’t Too Big To Fail” (April 3, 2026)

Dominant patterns in motion

Enterprise deployment becomes regulatory (ESCALATED — April 12)

The competition shifted from agent intelligence to organizational deployability (April 11). Now the Mythos government escalation adds regulatory pressure. Security hardening moves from competitive differentiator to compliance requirement for regulated industries. Treasury/Fed treating model capability as systemic risk means enterprise deployment features become mandatory, not optional. The “five durable layers” framework (trust, context, distribution, taste, liability) explains why: the “trust” layer is now the most critical — regulatory pressure drives it.

The portability sprint (April 8, continuing)

Everyone is decoupling agents from their native clouds. Copilot CLI: BYOK + Ollama + air-gapped. Scion: vendor-agnostic orchestration. Dependabot: multi-vendor agent assignment. Codex: WebRTC transport. The platforms are betting that lock-in loses. The most portable agent wins, not the most powerful.

The platform ships (updated April 11)

Codex v0.119.0 and v0.120.0 ended the alpha marathon. 33 alphas → two stables in 24 hours. The platform is real: MCP Apps, WebRTC realtime v2, 8+ extracted crates, remote exec-server, path-based multi-agent addressing, background agent streaming. Gemini’s GCP backend + Chapters + UCM is the same pattern. The CLI is no longer the product — it’s the thin interface to a platform.

Governance ships at platform speed

Microsoft’s governance toolkit gained OWASP MCP Top 10, SOC 2 mapping, and tool injection scanning in the same 48 hours that Codex shipped data residency and approval workflows. Governing as you ship, not after. The cross-vendor play: own the governance layer, influence every platform that needs compliance.

Spec -> Plan -> Tasks -> Code

The dominant new methodology. Write a specification -> agent decomposes into plan -> breaks into tasks -> generates implementation. Review at spec level, not code level.

Parallel agent worktrees

Infrastructure primitive. Every major tool and community wrappers ship this. Gemini adding worktree support in v0.37.0. Adopted.

The harness as key abstraction

The orchestration layer wrapping the LLM is where real engineering investment lies. Three-agent harness (planner/generator/evaluator) turns $9 broken output into $200 polished product. But: who controls the harness? Anthropic says they do. Community disagrees. Codex building two-tier plugin system may offer a middle path.

Prompting has fractured

Four distinct skills: Specification Engineering, Intent Framework Building, Evaluation Harness Design, Constraint Architecture. The “35-Minute Wall” is where 2025-era prompting collapses.

Path-based agent addressing (NEW)

Codex dropped agent IDs from spawn v2 in favor of path-based addressing. The agent tree is the address space. Fire-and-forget messaging reduces coupling. /feedback cascade enables hierarchical feedback propagation. This is a clean multi-agent communication model worth watching.

JiT testing over test suites

Meta (Feb 2026): LLM generates ephemeral tests per-change. Traditional testing cannot keep pace with agentic velocity.

Sources

Source	URL	Focus
Nate’s Newsletter	natesnewsletter.substack.com	AI practitioner strategy, MCP, workflow optimization
Where’s Your Ed At	wheresyoured.at	AI financial sustainability critique
Anthropic 2026 Agentic Coding Trends	resources.anthropic.com	Harness patterns, industry data
Anthropic Harness Design Blog	anthropic.com/engineering	Three-agent harness architecture
Meta JiTTests	engineering.fb.com	Testing paradigm shift
GitHub Spec Kit	github.com/github/spec-kit	Spec-driven scaffolding
GitHub Copilot SDK	github.com/github/copilot-sdk	Composable agent runtime
AWS Kiro	kiro.dev	Spec-driven agentic IDE
Google ADK	google.github.io/adk-docs	Model-agnostic agent framework
Microsoft Agent Governance Toolkit	github.com/microsoft/agent-governance-toolkit	Cross-vendor agent governance
Google Scion	github.com/GoogleCloudPlatform/scion	Cross-vendor agent orchestration
SurePath AI	surepath.ai	MCP-specific runtime governance
Pinterest MCP (InfoQ)	infoq.com/news/2026/04/pinterest-mcp-ecosystem	Enterprise MCP case study
SDD Framework Map	Medium (30+ frameworks)	Landscape map
NIST AI Agent Standards	csrc.nist.gov	Regulatory direction

← all landscape docs