Agentic engineering patterns

Living document. Rewritten as the field evolves. Last updated: 2026-04-12.

Technology radar

Adopt — proven, use now

Pattern/Tool	Evidence
MCP as standard protocol	Every major agent supports it. Universal adapter. Codex MCP Apps P1+P2. Pinterest: 66K invocations/month in production. 8,600+ servers. SurePath AI shipping MCP-specific governance.
Spec-driven development workflow	GitHub Spec Kit (84.7K stars), AWS Kiro, 30+ frameworks mapped. Delta Air Lines: 1,948% growth in AI tool adoption using specs.
Plan-before-act	All four major CLI agents have it. Table stakes. Differentiation moved to multi-agent orchestration.
Sandbox-first execution	Universal across CLI agents. Gemini adding dynamic expansion (Windows/Linux). Codex adding deny-list mode alongside allow-list.
Git worktree-based parallel agent execution	Cursor (8 parallel), Claude Code (16+), Windsurf (5), Grok Build (8), OMX wrapping Codex. Gemini v0.37.0 preview adds worktree support.

Trial — working in production, still evolving

Pattern/Tool	Evidence
Platform-level agent orchestration	Codex v0.119.0 ships the platform: MCP Apps, WebRTC realtime, 8+ extracted crates, remote exec-server. Gemini has GCP backend. Scion cross-vendor. Codex v0.120.0 adds background agent streaming.
Enterprise deployment as competitive axis	Every agent shipped enterprise features Apr 8-11: Claude Code (Vertex wizard, Perforce, CA trust, team onboarding), Codex (residency, approval workflows), OpenCode (OAuth MCP, fast mode multi-model). Maps to Nate’s “five durable layers” thesis.
Agent governance tooling	Microsoft Governance Toolkit: OWASP MCP Top 10, SOC 2 mapping, tool injection scanning. SurePath AI: MCP-specific runtime policy engine. Pinterest at 66K/month proves the governance need.
Agent portability / BYOK	Copilot CLI: BYOK + Ollama/vLLM + air-gapped. Scion: vendor-agnostic orchestration. Dependabot: multi-vendor agent assignment. The portability sprint.
Composable agent SDKs	Copilot SDK v0.2.1: cross-language commands + UI elicitation (JS/TS, Python, Go, .NET). BYOK, W3C tracing.
GitHub Spec Kit	Open-source spec-driven scaffolding, 84.7K stars, supports 14+ agent platforms.
AWS Kiro	Spec-driven agentic IDE on Bedrock (Claude Sonnet 4.0/3.7). GovCloud available.
Agentic harness engineering	Anthropic: “2026 is the year of harnesses.” The same model scores 17 problems apart depending on the agent harness around it. Claude Code’s 512K+ lines prove it.
Heterogeneous model routing	Frontier for reasoning, mid-tier for standard tasks, small models for high-frequency execution. Gemini adding dynamic routing for 3.1 Pro/Flash Lite.
Hook-based automation	Claude Code’s PreToolUse/PostToolUse/Stop with conditional filtering, defer/resume. Channels and Conway may supplement/replace.
Human-at-checkpoints	Agents build full systems autonomously, pausing only for strategic review. Anthropic’s three-agent harness: planner/generator/evaluator.
Path-based multi-agent addressing	Codex spawn v2 dropped agent IDs for path-based addressing (`/root/agent_a`). Agent tree IS the address space. Fire-and-forget messaging + feedback cascade.
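
The heterogeneous model routing row above describes a three-way split: frontier models for reasoning, mid-tier for standard tasks, small models for high-frequency execution. A minimal sketch of that split follows. The tier names, the `Task` fields, and the `route` function are hypothetical illustrations, not any vendor's API, and a real router would likely score task complexity rather than read two flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical task descriptor; the two flags are illustrative only."""
    description: str
    needs_reasoning: bool = False  # e.g. architecture decisions, hard debugging
    high_frequency: bool = False   # e.g. renames, lint fixes, file lookups


def route(task: Task) -> str:
    """Return a tier label following the three-way split in the radar row."""
    if task.needs_reasoning:
        return "frontier"
    if task.high_frequency:
        return "small"
    return "mid"


if __name__ == "__main__":
    for t in (
        Task("Design the migration strategy", needs_reasoning=True),
        Task("Rename a variable across files", high_frequency=True),
        Task("Add a REST endpoint"),
    ):
        print(t.description, "->", route(t))
```

Reasoning takes priority over frequency here, so a task flagged as both goes to the frontier tier. That ordering is a design assumption of this sketch.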

Assess — investigate, understand implications

Pattern/Tool	Evidence
Frontier models as systemic risk	Anthropic’s Mythos (93.9% SWE-bench, autonomous zero-day discovery) triggered Treasury/Fed emergency meeting with bank CEOs (Apr 8). Model capability now treated as financial-sector systemic risk. New deployment pattern: directed use-case access, not open API. Security hardening becomes regulatory, not optional.
Multi-agent orchestration in the model	Meta Muse Spark: multi-agent orchestration built into the model itself, not the harness. “Contemplating mode” runs a squad of agents in parallel. Agents-in-the-model vs agents-around-the-model.
Open-weight contraction	Meta went proprietary with Muse Spark after Llama 1-4 open-weight. Open-weight now depends on Google (Gemma), Alibaba (Qwen), Zhipu (GLM), and community. Llama’s future unclear.
Self-improving review agents	Cursor Bugbot learns from PR feedback, applies learned rules to future reviews. MCP tools for context. 78% resolution rate. Early signal of agents improving from task-specific data.
Session quality as primary battleground	Gemini: Unified Context Mgmt + Chapters. Claude Code: 30+ session fixes. Zed: thinking display + streaming. Cursor: multi-agent routing. The session, not the response, is the unit of quality.
Agent execution runtimes	Anthropic Managed Agents (April 8-9): YAML definitions, sandboxed execution, persistent sessions, $0.08/session-hour. Beta with Notion, Asana, Rakuten, Sentry. Conway CNW ZIP may be the extension format. Codex: remote exec-server. Gemini: GCP backend. Model providers becoming execution platforms.
Persistent agent platforms	Conway CNW ZIP details via Nate’s analysis: standalone workspace, webhook activation, browser control, proprietary extension format. Channels (shipped). Codex: remote control + WebRTC. Gemini: GCP backend + Interactions API. Google Scion as external orchestration.
MCP Apps ecosystem	Codex MCP Apps P1+P2 (meta to tool call results). 8,600+ servers. Pinterest 66K/month in production. SurePath AI governance. MCP Server Cards (.well-known) proposed.
Agents as supply chain participants	Dependabot-to-agent assignment for security remediation. Copilot Critic agent (uses Claude to review plans). OXC copilot-swe-agent contributing fixes. Agents managing security, not just generating code.
Cross-vendor agent orchestration	Google Scion: open-source, runs Claude+Codex+Gemini in parallel with container isolation. Copilot Studio: multi-model broker (5 models). OMX: community Codex orchestration.
Multi-model broker platforms	Copilot Studio GA with Claude Opus 4.6, Sonnet 4.5, Grok 4.1, GPT-5.3/5.4. Microsoft positioning as model-agnostic orchestrator.
Vendor surface control	Anthropic claiming all interaction surfaces. OpenClaw ban as enforcement. But 3 days of silence since — unclear if strategy is expanding or pausing.
Meta JiT Testing	LLM generates tests per-PR by analyzing diff. No persistent test suite. 70% reduction in human review load.
Agent-to-Agent protocols + payments	A2A v1.0 (April 9): first stable spec. Multi-protocol, enterprise multi-tenancy, 5 production SDKs (Python, JS, Java, Go, .NET). 150+ orgs, 22K+ stars. AP2: 60+ orgs. Visa ICC: neutral payment layer for 4 protocols. McKinsey: $5T agent-driven sales by 2030.
Agent supply chain attacks	OpenClaw ClawHavoc: 824+ malicious skills (growing), 135K exposed instances. CVE-2026-35669 (CVSS 8.8, Apr 10) privilege escalation. First attack targeting agent execution patterns specifically. AMOS macOS stealer via agentic workflows. Claw Code (72K stars): clean-room Claude Code clone from source map leak — proves agent architectures are now replicable open-source targets.
oh-my-codex (OMX)	2.8K stars overnight. Community multi-agent orchestration for Codex CLI.
Mamba-Transformer hybrids for agents	Nemotron 3 Nano claims 5x throughput. Linear context scaling. If verified, changes local agent architecture.
KV cache compression for local inference	Google TurboQuant: 6x KV cache compression, zero accuracy loss, no retraining. ICLR 2026. llama.cpp integration exists (`turboquant_plus`, Metal support). Changes local inference economics: existing GPUs serve 6x more context. Combined with Copilot BYOK, strengthens case for local-first agent architecture.
Devin / Cognition Labs	$10.2B valuation, $150M ARR (with Windsurf). Real capability but unclear ROI.
sauna.ai (Wordware)	Largest YC seed ($30M), Instacart/Runway customers. Nate’s test: scored 1/4 on knowledge-work tasks.
Agent memory systems	Gemini Chapters, project-level memory, Nate’s “Open Brain” PostgreSQL+MCP pattern. Context persistence is the bottleneck.
Background agent swarms	Multiple small agents running continuously on tiny local models.
Two-tier plugin distribution	Codex: curated (vetted, backend-hosted) + community (non-curated). Plugin marketplace economics forming.

Watch — early signal, track for developments

Pattern/Tool	Evidence
Post-ban community migration	ZeroClaw (Rust), NullClaw, local models. OpenClaw community adopting Kimi K2.5. Credits expire April 17 (5 days).
OWASP MCP Top 10	New compliance standard from MS Governance Toolkit. Maps agentic AI risks to MCP-specific controls. May become de facto standard.
AI workspace consolidation	Sauna, Notion AI, Glean. Crowded, no winner.
NIST AI Agent Identity standards	Comment period closed April 2. IAM frameworks for autonomous agents.
EU AI Act enforcement	August 2, 2026 — first major enforcement date. High-risk AI, GPAI, foundation model requirements.
Harness economics	OpenClaw ban proved the arbitrage model is unsustainable. Credits expire April 17 (5 days). Codex pricing restructured: $20/$100/$200 tiers, token-based.
Full autonomous dev without checkpoints	The “Devin promise.” Evidence still mixed.
AI industry financial sustainability	Zitron: “subprime AI crisis” + “AI isn’t too big to fail.”
Agent-native IDEs	Is the IDE the agent, or does the agent use the IDE? Conway suggests the agent becomes the IDE.
Model-routing layers	Automatic model selection per task complexity.
Agent security monitoring	Codenotary AgentMon, Astrix Security, Black Duck Signal, Palo Alto Prisma AIRS 3.0. Security tooling wave forming.
AI agents as contributors	copilot-swe-agent contributed two OXC bug fixes (latest: node_modules config walker skip). Copilot Critic agent uses Claude to review plans. AI agents contributing to and reviewing tooling that other AI agents use.
Agent security vulnerabilities	CVE-2026-35022 (CVSS 9.8, Claude CLI/SDK command injection). Claude Code deny-rules bypass at 50+ subcommands. Security of agent tools becoming a distinct attack surface.

Key risk signal: The subsidy question

The builder community describes a genuine productivity revolution. The financial analysis shows unstable foundations:

Metric	Evidence
Anthropic compute vs revenue	$10B spent on compute, $5B revenue
OpenAI inference burn	$8.67B through Sept 2025 on $4.3B total revenue
Startup unit economics	$3-13 burned per $1 of subscription revenue
Data center gap	~5GW under construction vs. 12GW+ promised
Harness arbitrage	5x gap between subscription and API costs — now closed by ban
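
To put the first two rows on the same scale as the startup row, the arithmetic below converts them to dollars spent per dollar of revenue. It uses only the figures in the table. The two rows measure different cost lines (compute versus inference) over periods the table does not align, so treat the result as a rough comparison under that assumption.

```python
# Figures copied from the table above, in billions of USD: (spend, revenue).
figures = {
    "Anthropic compute vs revenue": (10.0, 5.0),
    "OpenAI inference burn": (8.67, 4.3),
}


def spend_per_revenue_dollar(spend: float, revenue: float) -> float:
    """Dollars spent on the named cost line for each dollar of revenue."""
    return round(spend / revenue, 2)


ratios = {
    name: spend_per_revenue_dollar(spend, revenue)
    for name, (spend, revenue) in figures.items()
}

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{name}: ${ratio} per $1 of revenue")
```

Both come out at roughly $2 per $1 of revenue on a single cost line, below the $3-13 range the table reports for startups.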

The synthesis: The tools and workflows are real and productive. The pricing is subsidized and temporary. Anthropic’s OpenClaw ban is the first direct vendor margin defense. The most defensible investments are in patterns (spec-driven dev, orchestration architecture, MCP) rather than specific vendor subscriptions. The response layer forming this week validates this: vendor lock-in gets routed around within 72 hours.

Sources: Ed Zitron, “The Subprime AI Crisis Is Here” (March 31, 2026) and “AI Isn’t Too Big To Fail” (April 3, 2026)

Dominant patterns in motion

Enterprise deployment becomes regulatory (ESCALATED — April 12)

The competition shifted from agent intelligence to organizational deployability (April 11). Now the Mythos government escalation adds regulatory pressure. Security hardening moves from competitive differentiator to compliance requirement for regulated industries. Treasury/Fed treating model capability as systemic risk means enterprise deployment features become mandatory, not optional. The “five durable layers” framework (trust, context, distribution, taste, liability) explains why: the “trust” layer is now the most critical — regulatory pressure drives it.

The portability sprint (April 8, continuing)

Everyone is decoupling agents from their native clouds. Copilot CLI: BYOK + Ollama + air-gapped. Scion: vendor-agnostic orchestration. Dependabot: multi-vendor agent assignment. Codex: WebRTC transport. The platforms are betting that lock-in loses. The most portable agent wins, not the most powerful.

The platform ships (updated April 11)

Codex v0.119.0 and v0.120.0 ended the alpha marathon. 33 alphas → two stables in 24 hours. The platform is real: MCP Apps, WebRTC realtime v2, 8+ extracted crates, remote exec-server, path-based multi-agent addressing, background agent streaming. Gemini’s GCP backend + Chapters + UCM is the same pattern. The CLI is no longer the product — it’s the thin interface to a platform.

Governance ships at platform speed

Microsoft’s governance toolkit gained OWASP MCP Top 10, SOC 2 mapping, and tool injection scanning in the same 48 hours that Codex shipped data residency and approval workflows. Governing as you ship, not after. The cross-vendor play: own the governance layer, influence every platform that needs compliance.

Spec -> Plan -> Tasks -> Code

The dominant new methodology. Write a specification -> agent decomposes into plan -> breaks into tasks -> generates implementation. Review at spec level, not code level.
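
The four stages above can be sketched as plain data types with a review gate at the spec. All names here (`Spec`, `Plan`, `Task`, `decompose`, `to_tasks`, `implement`) are hypothetical, and the agent calls are replaced by trivial placeholders. The sketch shows only the shape of the pipeline, not how Spec Kit or Kiro implement it.

```python
from dataclasses import dataclass


@dataclass
class Spec:
    goal: str
    requirements: list[str]
    approved: bool = False  # human review happens here, at spec level


@dataclass
class Plan:
    steps: list[str]


@dataclass
class Task:
    title: str
    code: str = ""  # filled in by the implementation stage


def decompose(spec: Spec) -> Plan:
    """Spec -> Plan. Refuses to proceed until the spec has been reviewed."""
    if not spec.approved:
        raise ValueError("spec must be reviewed before planning")
    return Plan(steps=[f"Satisfy: {req}" for req in spec.requirements])


def to_tasks(plan: Plan) -> list[Task]:
    """Plan -> Tasks. One task per plan step in this simplified sketch."""
    return [Task(title=step) for step in plan.steps]


def implement(task: Task) -> Task:
    """Tasks -> Code. Placeholder for the agent's code generation call."""
    task.code = f"# TODO: {task.title}"
    return task


if __name__ == "__main__":
    spec = Spec("Add login", ["rate-limit attempts", "hash passwords"], approved=True)
    for task in (implement(t) for t in to_tasks(decompose(spec))):
        print(task.title, "=>", task.code)
```

Putting the `approved` check inside `decompose` makes the review gate structural: nothing downstream runs on an unreviewed spec.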

Parallel agent worktrees

Infrastructure primitive. Every major tool ships this, as do community wrappers. Gemini adding worktree support in v0.37.0. Adopted.
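
As a rough illustration of the primitive, the sketch below builds one `git worktree add` command per agent, so each agent gets its own checkout and branch of the same repository. The branch naming (`agent/<name>`) and the sibling-directory layout are assumptions of this example, not a convention taken from any of the tools listed.

```python
import subprocess
from pathlib import Path


def worktree_commands(repo: Path, agents: list[str], base: str = "HEAD") -> list[list[str]]:
    """Build one `git worktree add` command per agent, each on a new branch."""
    commands = []
    for name in agents:
        # Hypothetical layout: worktrees live next to the main checkout.
        path = repo.parent / f"{repo.name}-{name}"
        commands.append(
            ["git", "-C", str(repo), "worktree", "add",
             "-b", f"agent/{name}", str(path), base]
        )
    return commands


def create_worktrees(repo: Path, agents: list[str], base: str = "HEAD") -> None:
    """Run the commands; raises CalledProcessError if git rejects one."""
    for command in worktree_commands(repo, agents, base):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands without running them.
    for command in worktree_commands(Path("/tmp/repo"), ["planner", "fixer"]):
        print(" ".join(command))
```

Separating command construction from execution keeps the sketch inspectable: the commands can be printed or logged before anything touches the repository.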

The harness as key abstraction

The orchestration layer wrapping the LLM is where real engineering investment lies. Three-agent harness (planner/generator/evaluator) turns a broken $9 output into a polished $200 product. But: who controls the harness? Anthropic says they do. Community disagrees. Codex's two-tier plugin system may offer a middle path.
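
The planner/generator/evaluator split can be sketched as a loop over three callables. This is an illustration of the general pattern only, not Anthropic's implementation. `run_harness`, the callable signatures, and the retry limit are all assumptions made for the example.

```python
from typing import Callable

Planner = Callable[[str], list[str]]                 # goal -> ordered steps
Generator = Callable[[str, str], str]                # (step, feedback) -> artifact
Evaluator = Callable[[str, str], tuple[bool, str]]   # (step, artifact) -> (passed, feedback)


def run_harness(
    goal: str,
    plan: Planner,
    generate: Generator,
    evaluate: Evaluator,
    max_attempts: int = 3,
) -> dict[str, str]:
    """Plan once, then generate and evaluate each step until it passes."""
    results: dict[str, str] = {}
    for step in plan(goal):
        feedback = ""
        for _ in range(max_attempts):
            artifact = generate(step, feedback)
            passed, feedback = evaluate(step, artifact)
            if passed:
                results[step] = artifact
                break
        else:
            # No attempt passed evaluation within the budget.
            raise RuntimeError(f"step failed after {max_attempts} attempts: {step}")
    return results
```

The evaluator's feedback is fed back into the next generation attempt. The sketch models that loop as the place where the harness adds value over a single model call.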

Prompting has fractured

Four distinct skills: Specification Engineering, Intent Framework Building, Evaluation Harness Design, Constraint Architecture. The “35-Minute Wall” is where 2025-era prompting collapses.

Path-based agent addressing (NEW)

Codex dropped agent IDs from spawn v2 in favor of path-based addressing. The agent tree is the address space. Fire-and-forget messaging reduces coupling. /feedback cascade enables hierarchical feedback propagation. This is a clean multi-agent communication model worth watching.
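
A toy model of the idea, assuming nothing about Codex internals: agents form a tree, an agent's address is its path from the root, messages are appended to an inbox without waiting for a reply, and feedback propagates from an agent to its ancestors. The class and function names are hypothetical, and the upward direction of the cascade is an assumption of this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class Agent:
    name: str
    parent: Agent | None = None
    children: dict[str, Agent] = field(default_factory=dict)
    inbox: list[str] = field(default_factory=list)
    feedback: list[str] = field(default_factory=list)

    @property
    def path(self) -> str:
        """The address is the position in the tree, e.g. /root/agent_a."""
        prefix = self.parent.path if self.parent else ""
        return f"{prefix}/{self.name}"

    def spawn(self, name: str) -> Agent:
        child = Agent(name, parent=self)
        self.children[name] = child
        return child


def resolve(root: Agent, path: str) -> Agent:
    """Walk the tree from the root; raises KeyError for unknown paths."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != root.name:
        raise KeyError(path)
    node = root
    for part in parts[1:]:
        node = node.children[part]
    return node


def send(root: Agent, path: str, message: str) -> None:
    """Fire-and-forget: deliver to the inbox and return without a reply."""
    resolve(root, path).inbox.append(message)


def cascade_feedback(agent: Agent, note: str) -> None:
    """Record a note on every ancestor of the agent, nearest first."""
    node = agent.parent
    while node is not None:
        node.feedback.append(f"{agent.path}: {note}")
        node = node.parent
```

Because the path is derived from the tree, there is no separate ID registry to keep in sync.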

JiT testing over test suites

Meta (Feb 2026): LLM generates ephemeral tests per-change. Traditional testing cannot keep pace with agentic velocity.
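
A minimal sketch of the ephemeral-test idea, not Meta's system: read the diff, ask a generator for a test script, run it once in a temporary directory, and discard it. The `generate_tests` callable stands in for the LLM call and is hypothetical, as are the function names. The generated script is assumed to be plain Python that exits non-zero on failure.

```python
import re
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable


def changed_files(diff: str) -> list[str]:
    """Return paths from the '+++ b/<path>' headers of a unified diff."""
    return re.findall(r"^\+\+\+ b/(.+)$", diff, flags=re.MULTILINE)


def run_ephemeral_tests(diff: str, generate_tests: Callable[[str], str]) -> bool:
    """Generate a test script for this diff, run it once, then delete it."""
    if not changed_files(diff):
        return True  # nothing changed, nothing to test
    source = generate_tests(diff)  # stand-in for the LLM call
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "jit_test.py"
        script.write_text(source)
        result = subprocess.run([sys.executable, str(script)], capture_output=True)
    # The temporary directory is gone here; no test suite persists.
    return result.returncode == 0
```

The test script exists only for the duration of one run, which is the property that distinguishes this approach from a maintained suite.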

Sources

Source	URL	Focus
Nate’s Newsletter	natesnewsletter.substack.com	AI practitioner strategy, MCP, workflow optimization
Where’s Your Ed At	wheresyoured.at	AI financial sustainability critique
Anthropic 2026 Agentic Coding Trends	resources.anthropic.com	Harness patterns, industry data
Anthropic Harness Design Blog	anthropic.com/engineering	Three-agent harness architecture
Meta JiTTests	engineering.fb.com	Testing paradigm shift
GitHub Spec Kit	github.com/github/spec-kit	Spec-driven scaffolding
GitHub Copilot SDK	github.com/github/copilot-sdk	Composable agent runtime
AWS Kiro	kiro.dev	Spec-driven agentic IDE
Google ADK	google.github.io/adk-docs	Model-agnostic agent framework
Microsoft Agent Governance Toolkit	github.com/microsoft/agent-governance-toolkit	Cross-vendor agent governance
Google Scion	github.com/GoogleCloudPlatform/scion	Cross-vendor agent orchestration
SurePath AI	surepath.ai	MCP-specific runtime governance
Pinterest MCP (InfoQ)	infoq.com/news/2026/04/pinterest-mcp-ecosystem	Enterprise MCP case study
SDD Framework Map	Medium (30+ frameworks)	Landscape map
NIST AI Agent Standards	csrc.nist.gov	Regulatory direction