Agentic engineering patterns

Living document. Rewritten as the field evolves. Last updated: 2026-04-12.

Technology radar

Adopt — proven, use now

Pattern/Tool | Evidence
MCP as standard protocol | Every major agent supports it. Universal adapter. Codex MCP Apps P1+P2. Pinterest: 66K invocations/month in production. 8,600+ servers. SurePath AI shipping MCP-specific governance.
Spec-driven development workflow | GitHub Spec Kit (84.7K stars), AWS Kiro, 30+ frameworks mapped. Delta Airlines: 1,948% growth in AI tool adoption using specs.
Plan-before-act | All four major CLI agents have it. Table stakes. Differentiation moved to multi-agent orchestration.
Sandbox-first execution | Universal across CLI agents. Gemini adding dynamic expansion (Windows/Linux). Codex adding deny-list mode alongside allow-list.
Git worktree-based parallel agent execution | Cursor (8 parallel), Claude Code (16+), Windsurf (5), Grok Build (8), OMX wrapping Codex. Gemini v0.37.0 preview adds worktree support.

Trial — working in production, still evolving

Pattern/Tool | Evidence
Platform-level agent orchestration | Codex v0.119.0 ships the platform: MCP Apps, WebRTC realtime, 8+ extracted crates, remote exec-server. Gemini has GCP backend. Scion cross-vendor. Codex v0.120.0 adds background agent streaming.
Enterprise deployment as competitive axis | Every agent shipped enterprise features Apr 8-11: Claude Code (Vertex wizard, Perforce, CA trust, team onboarding), Codex (residency, approval workflows), OpenCode (OAuth MCP, fast mode multi-model). Maps to Nate’s “five durable layers” thesis.
Agent governance tooling | Microsoft Governance Toolkit: OWASP MCP Top 10, SOC 2 mapping, tool injection scanning. SurePath AI: MCP-specific runtime policy engine. Pinterest at 66K/month proves the governance need.
Agent portability / BYOK | Copilot CLI: BYOK + Ollama/vLLM + air-gapped. Scion: vendor-agnostic orchestration. Dependabot: multi-vendor agent assignment. The portability sprint.
Composable agent SDKs | Copilot SDK v0.2.1: cross-language commands + UI elicitation (JS/TS, Python, Go, .NET). BYOK, W3C tracing.
GitHub Spec Kit | Open-source spec-driven scaffolding, 84.7K stars, supports 14+ agent platforms.
AWS Kiro | Spec-driven agentic IDE on Bedrock (Claude Sonnet 4.0/3.7). GovCloud available.
Agentic harness engineering | Anthropic: “2026 is the year of harnesses.” Same model scores 17 problems apart in different agents. Claude Code’s 512K+ lines prove it.
Heterogeneous model routing | Frontier for reasoning, mid-tier for standard tasks, small models for high-frequency execution. Gemini adding dynamic routing for 3.1 Pro/Flash Lite.
Hook-based automation | Claude Code’s PreToolUse/PostToolUse/Stop with conditional filtering, defer/resume. Channels and Conway may supplement/replace.
Human-at-checkpoints | Agents build full systems autonomously, pausing only for strategic review. Anthropic’s three-agent harness: planner/generator/evaluator.
Path-based multi-agent addressing | Codex spawn v2 dropped agent IDs for path-based addressing (/root/agent_a). Agent tree IS the address space. Fire-and-forget messaging + feedback cascade.
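
The heterogeneous model routing entry in the Trial table describes a three-tier split. A minimal sketch of that idea follows, assuming a static lookup from task kind to tier. The task kinds, the `ROUTES` table, and the `route` function are hypothetical names for illustration and do not describe any vendor's router.

```python
from enum import Enum


class Tier(Enum):
    FRONTIER = "frontier"   # reasoning-heavy work
    MID = "mid-tier"        # standard tasks
    SMALL = "small"         # high-frequency execution


# Hypothetical task kinds; a real router would classify tasks dynamically.
ROUTES = {
    "architecture_review": Tier.FRONTIER,
    "multi_file_refactor": Tier.FRONTIER,
    "write_function": Tier.MID,
    "fix_lint_error": Tier.MID,
    "classify_log_line": Tier.SMALL,
    "autocomplete": Tier.SMALL,
}


def route(task_kind: str, default: Tier = Tier.MID) -> Tier:
    """Pick a model tier for a task kind, falling back to the mid tier."""
    return ROUTES.get(task_kind, default)


for kind in ("architecture_review", "autocomplete", "unknown_task"):
    print(kind, "->", route(kind).value)
```

Defaulting unknown work to the mid tier is a design choice in this sketch: it avoids paying frontier prices for unclassified tasks while keeping them off the smallest models.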

Assess — investigate, understand implications

Pattern/Tool | Evidence
Frontier models as systemic risk | Anthropic’s Mythos (93.9% SWE-bench, autonomous zero-day discovery) triggered Treasury/Fed emergency meeting with bank CEOs (Apr 8). Model capability now treated as financial-sector systemic risk. New deployment pattern: directed use-case access, not open API. Security hardening becomes regulatory, not optional.
Multi-agent orchestration in the model | Meta Muse Spark: multi-agent orchestration built into the model itself, not the harness. “Contemplating mode” runs a squad of agents in parallel. Agents-in-the-model vs agents-around-the-model.
Open-weight contraction | Meta went proprietary with Muse Spark after Llama 1-4 open-weight. Open-weight now depends on Google (Gemma), Alibaba (Qwen), Zhipu (GLM), and community. Llama’s future unclear.
Self-improving review agents | Cursor Bugbot learns from PR feedback, applies learned rules to future reviews. MCP tools for context. 78% resolution rate. Early signal of agents improving from task-specific data.
Session quality as primary battleground | Gemini: Unified Context Mgmt + Chapters. Claude Code: 30+ session fixes. Zed: thinking display + streaming. Cursor: multi-agent routing. The session, not the response, is the unit of quality.
Agent execution runtimes | Anthropic Managed Agents (April 8-9): YAML definitions, sandboxed execution, persistent sessions, $0.08/session-hour. Beta with Notion, Asana, Rakuten, Sentry. Conway CNW ZIP may be the extension format. Codex: remote exec-server. Gemini: GCP backend. Model providers becoming execution platforms.
Persistent agent platforms | Conway CNW ZIP details via Nate’s analysis: standalone workspace, webhook activation, browser control, proprietary extension format. Channels (shipped). Codex: remote control + WebRTC. Gemini: GCP backend + Interactions API. Google Scion as external orchestration.
MCP Apps ecosystem | Codex MCP Apps P1+P2 (meta to tool call results). 8,600+ servers. Pinterest 66K/month in production. SurePath AI governance. MCP Server Cards (.well-known) proposed.
Agents as supply chain participants | Dependabot-to-agent assignment for security remediation. Copilot Critic agent (uses Claude to review plans). OXC copilot-swe-agent contributing fixes. Agents managing security, not just generating code.
Cross-vendor agent orchestration | Google Scion: open-source, runs Claude+Codex+Gemini in parallel with container isolation. Copilot Studio: multi-model broker (5 models). OMX: community Codex orchestration.
Multi-model broker platforms | Copilot Studio GA with Claude Opus 4.6, Sonnet 4.5, Grok 4.1, GPT-5.3/5.4. Microsoft positioning as model-agnostic orchestrator.
Vendor surface control | Anthropic claiming all interaction surfaces. OpenClaw ban as enforcement. But 3 days of silence since: unclear if strategy is expanding or pausing.
Meta JiT Testing | LLM generates tests per-PR by analyzing diff. No persistent test suite. 70% reduction in human review load.
Agent-to-Agent protocols + payments | A2A v1.0 (April 9): first stable spec. Multi-protocol, enterprise multi-tenancy, 5 production SDKs (Python, JS, Java, Go, .NET). 150+ orgs, 22K+ stars. AP2: 60+ orgs. Visa ICC: neutral payment layer for 4 protocols. McKinsey: $5T agent-driven sales by 2030.
Agent supply chain attacks | OpenClaw ClawHavoc: 824+ malicious skills (growing), 135K exposed instances. CVE-2026-35669 (CVSS 8.8, Apr 10) privilege escalation. First attack targeting agent execution patterns specifically. AMOS macOS stealer via agentic workflows. Claw Code (72K stars): clean-room Claude Code clone from source map leak. Proves agent architectures are now replicable open-source targets.
oh-my-codex (OMX) | 2.8K stars overnight. Community multi-agent orchestration for Codex CLI.
Mamba-Transformer hybrids for agents | Nemotron 3 Nano claims 5x throughput. Linear context scaling. If verified, changes local agent architecture.
KV cache compression for local inference | Google TurboQuant: 6x KV cache compression, zero accuracy loss, no retraining. ICLR 2026. llama.cpp integration exists (turboquant_plus, Metal support). Changes local inference economics: existing GPUs serve 6x more context. Combined with Copilot BYOK, strengthens case for local-first agent architecture.
Devin / Cognition Labs | $10.2B valuation, $150M ARR (with Windsurf). Real capability but unclear ROI.
sauna.ai (Wordware) | Largest YC seed ($30M), Instacart/Runway customers. Nate’s test: scored 1/4 on knowledge-work tasks.
Agent memory systems | Gemini Chapters, project-level memory, Nate’s “Open Brain” PostgreSQL+MCP pattern. Context persistence is the bottleneck.
Background agent swarms | Multiple small agents running continuously on tiny local models.
Two-tier plugin distribution | Codex: curated (vetted, backend-hosted) + community (non-curated). Plugin marketplace economics forming.

Watch — early signal, track for developments

Pattern/Tool | Evidence
Post-ban community migration | ZeroClaw (Rust), NullClaw, local models. OpenClaw community adopting Kimi K2.5. Credits expire April 17 (5 days).
OWASP MCP Top 10 | New compliance standard from MS Governance Toolkit. Maps agentic AI risks to MCP-specific controls. May become de facto standard.
AI workspace consolidation | Sauna, Notion AI, Glean. Crowded, no winner.
NIST AI Agent Identity standards | Comment period closed April 2. IAM frameworks for autonomous agents.
EU AI Act enforcement | August 2, 2026: first major enforcement date. High-risk AI, GPAI, foundation model requirements.
Harness economics | OpenClaw ban proved the arbitrage model is unsustainable. Credits expire April 17 (5 days). Codex pricing restructured: $20/$100/$200 tiers, token-based.
Full autonomous dev without checkpoints | The “Devin promise.” Evidence still mixed.
AI industry financial sustainability | Zitron: “subprime AI crisis” + “AI isn’t too big to fail.”
Agent-native IDEs | Is the IDE the agent, or does the agent use the IDE? Conway suggests the agent becomes the IDE.
Model-routing layers | Automatic model selection per task complexity.
Agent security monitoring | Codenotary AgentMon, Astrix Security, Black Duck Signal, Palo Alto Prisma AIRS 3.0. Security tooling wave forming.
AI agents as contributors | copilot-swe-agent contributed two OXC bug fixes (latest: node_modules config walker skip). Copilot Critic agent uses Claude to review plans. AI agents contributing to and reviewing tooling that other AI agents use.
Agent security vulnerabilities | CVE-2026-35022 (CVSS 9.8, Claude CLI/SDK command injection). Claude Code deny-rules bypass at 50+ subcommands. Security of agent tools becoming a distinct attack surface.

Key risk signal: The subsidy question

The builder community describes a genuine productivity revolution. The financial analysis shows unstable foundations:

Metric | Evidence
Anthropic compute vs revenue | $10B spent on compute, $5B revenue
OpenAI inference burn | $8.67B through Sept 2025 on $4.3B total revenue
Startup unit economics | $3-13 burned per $1 of subscription revenue
Data center gap | ~5GW under construction vs. 12GW+ promised
Harness arbitrage | 5x gap between subscription and API costs, now closed by ban
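
The ratios implied by the table above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the figures quoted in the table. The variable and function names are illustrative, and the data center share is an upper bound because the promised figure is given as "12GW+".

```python
# Figures as quoted in the table above, in billions of USD.
anthropic_compute, anthropic_revenue = 10.0, 5.0
openai_inference, openai_revenue = 8.67, 4.3


def spend_per_revenue_dollar(spend: float, revenue: float) -> float:
    """Dollars spent for every dollar of revenue."""
    return spend / revenue


anthropic_ratio = spend_per_revenue_dollar(anthropic_compute, anthropic_revenue)
openai_ratio = spend_per_revenue_dollar(openai_inference, openai_revenue)

# Data center gap: share of promised capacity that is under construction.
# "~5GW" is approximate and "12GW+" is a lower bound, so this is an upper bound.
under_construction_gw, promised_gw = 5.0, 12.0
build_share = under_construction_gw / promised_gw

print(f"Anthropic: ${anthropic_ratio:.2f} of compute per $1 of revenue")
print(f"OpenAI: ${openai_ratio:.2f} of inference per $1 of revenue")
print(f"Capacity under construction: at most {build_share:.0%} of promised")
```

Both labs come out at roughly two dollars of compute or inference spend per dollar of revenue, which sits just below the $3-13 range quoted for startups.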

The synthesis: The tools and workflows are real and productive. The pricing is subsidized and temporary. Anthropic’s OpenClaw ban is the first direct vendor margin defense. The most defensible investments are in patterns (spec-driven dev, orchestration architecture, MCP) rather than specific vendor subscriptions. The response layer forming this week validates this: vendor lock-in gets routed around within 72 hours.

Sources: Ed Zitron, “The Subprime AI Crisis Is Here” (March 31, 2026) and “AI Isn’t Too Big To Fail” (April 3, 2026)

Dominant patterns in motion

Enterprise deployment becomes regulatory (ESCALATED — April 12)

The competition shifted from agent intelligence to organizational deployability (April 11). Now the Mythos government escalation adds regulatory pressure. Security hardening moves from competitive differentiator to compliance requirement for regulated industries. Treasury/Fed treating model capability as systemic risk means enterprise deployment features become mandatory, not optional. The “five durable layers” framework (trust, context, distribution, taste, liability) explains why: the “trust” layer is now the most critical — regulatory pressure drives it.

The portability sprint (April 8, continuing)

Everyone is decoupling agents from their native clouds. Copilot CLI: BYOK + Ollama + air-gapped. Scion: vendor-agnostic orchestration. Dependabot: multi-vendor agent assignment. Codex: WebRTC transport. The platforms are betting that lock-in loses. The most portable agent wins, not the most powerful.

The platform ships (updated April 11)

Codex v0.119.0 and v0.120.0 ended the alpha marathon. 33 alphas → two stables in 24 hours. The platform is real: MCP Apps, WebRTC realtime v2, 8+ extracted crates, remote exec-server, path-based multi-agent addressing, background agent streaming. Gemini’s GCP backend + Chapters + UCM is the same pattern. The CLI is no longer the product — it’s the thin interface to a platform.

Governance ships at platform speed

Microsoft’s governance toolkit gained OWASP MCP Top 10, SOC 2 mapping, and tool injection scanning in the same 48 hours that Codex shipped data residency and approval workflows. Governing as you ship, not after. The cross-vendor play: own the governance layer, influence every platform that needs compliance.

Spec -> Plan -> Tasks -> Code

The dominant new methodology. Write a specification -> agent decomposes into plan -> breaks into tasks -> generates implementation. Review at spec level, not code level.
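
The pipeline above can be sketched as a small data flow. This is an illustration under stated assumptions: `Spec`, `Task`, `plan_from_spec`, and `tasks_from_plan` are hypothetical names, and the two functions are deterministic stand-ins for what an agent would do. It does not reproduce Spec Kit's or Kiro's actual formats.

```python
from dataclasses import dataclass


@dataclass
class Spec:
    title: str
    requirements: list[str]
    approved: bool = False  # human review happens here, at spec level


@dataclass
class Task:
    step: str
    description: str
    done: bool = False


def plan_from_spec(spec: Spec) -> list[str]:
    """Stand-in for the agent's planning pass: one plan step per requirement."""
    if not spec.approved:
        raise ValueError("spec must be reviewed and approved before planning")
    return [f"Implement: {req}" for req in spec.requirements]


def tasks_from_plan(plan: list[str]) -> list[Task]:
    """Stand-in for task breakdown: each step yields a code task and a test task."""
    tasks = []
    for step in plan:
        tasks.append(Task(step, "write code"))
        tasks.append(Task(step, "write tests"))
    return tasks


spec = Spec("Rate limiter", ["reject over 100 req/min", "return HTTP 429"])
spec.approved = True  # the review gate: nothing downstream runs without it
plan = plan_from_spec(spec)
tasks = tasks_from_plan(plan)
print(f"{len(plan)} plan steps, {len(tasks)} tasks")
```

The only human gate in this sketch is the `approved` flag on the spec, which mirrors the point that review moves up to the specification.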

Parallel agent worktrees

Infrastructure primitive. Every major tool ships this, as do community wrappers. Gemini is adding worktree support in the v0.37.0 preview. Adopted.
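
The underlying primitive is `git worktree add`, which gives each agent its own branch and checkout of the same repository. A minimal sketch follows, assuming one worktree per agent. The `agent/<name>` branch naming, the `../worktrees` directory, and the helper functions are hypothetical choices and do not reflect how any listed tool lays out its worktrees.

```python
import subprocess
from pathlib import Path


def worktree_command(agent: str, root: Path, base: str = "main") -> list[str]:
    """Build the git invocation that gives one agent its own branch and checkout."""
    return ["git", "worktree", "add", "-b", f"agent/{agent}", str(root / agent), base]


def create_worktrees(agents: list[str], root: Path, base: str = "main",
                     dry_run: bool = True) -> list[list[str]]:
    """Return the commands for all agents, executing them unless dry_run is set."""
    commands = [worktree_command(agent, root, base) for agent in agents]
    if not dry_run:
        for cmd in commands:
            # Must be run from inside an existing git repository.
            subprocess.run(cmd, check=True)
    return commands


commands = create_worktrees(["agent_a", "agent_b"], Path("../worktrees"))
for cmd in commands:
    print(" ".join(cmd))
```

The sketch defaults to a dry run so the commands can be inspected before anything touches the repository.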

The harness as key abstraction

The orchestration layer wrapping the LLM is where real engineering investment lies. Three-agent harness (planner/generator/evaluator) turns $9 broken output into $200 polished product. But: who controls the harness? Anthropic says they do. Community disagrees. Codex’s two-tier plugin system (curated plus community) may offer a middle path.
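
The planner/generator/evaluator split can be sketched as a control loop. This is an assumption-laden illustration and not Anthropic's implementation: `run_harness` and the three stub functions are hypothetical, and in a real harness each role would be a model call.

```python
from typing import Callable

Planner = Callable[[str], list[str]]
Generator = Callable[[str, str], str]
Evaluator = Callable[[str, str], tuple[bool, str]]


def run_harness(goal: str, planner: Planner, generator: Generator,
                evaluator: Evaluator, max_attempts: int = 3) -> dict[str, str]:
    """Plan once, then generate each step until the evaluator accepts it."""
    results: dict[str, str] = {}
    for step in planner(goal):
        feedback = ""
        for _ in range(max_attempts):
            output = generator(step, feedback)
            accepted, feedback = evaluator(step, output)
            if accepted:
                results[step] = output
                break
        else:
            raise RuntimeError(f"step not accepted after {max_attempts} attempts: {step}")
    return results


# Deterministic stubs standing in for model calls.
def planner(goal: str) -> list[str]:
    return [f"{goal}: design", f"{goal}: build"]


def generator(step: str, feedback: str) -> str:
    return f"draft of {step}" + (" (revised)" if feedback else "")


def evaluator(step: str, output: str) -> tuple[bool, str]:
    if step.endswith("build") and "revised" not in output:
        return False, "missing error handling"
    return True, ""


results = run_harness("app", planner, generator, evaluator)
for step, output in results.items():
    print(step, "=>", output)
```

Keeping the evaluator separate from the generator is the point of the pattern: rejected output goes back with feedback and is not passed downstream.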

Prompting has fractured

Four distinct skills: Specification Engineering, Intent Framework Building, Evaluation Harness Design, Constraint Architecture. The “35-Minute Wall” is where 2025-era prompting collapses.

Path-based agent addressing (NEW)

Codex dropped agent IDs from spawn v2 in favor of path-based addressing. The agent tree is the address space. Fire-and-forget messaging reduces coupling. /feedback cascade enables hierarchical feedback propagation. This is a clean multi-agent communication model worth watching.
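
A minimal sketch of the idea, assuming an in-memory tree: an agent's address is its path from the root, and sending a message means resolving that path and appending to an inbox without waiting for a reply. The `Agent` class, `resolve`, and `send` are hypothetical and do not reproduce the Codex spawn v2 API. The /feedback cascade is not modelled here.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity comparison; avoids walking the tree on ==
class Agent:
    name: str
    parent: Agent | None = None
    children: dict[str, Agent] = field(default_factory=dict)
    inbox: deque[str] = field(default_factory=deque)

    def spawn(self, name: str) -> Agent:
        """Create a child agent; its address is derived from its position."""
        child = Agent(name, parent=self)
        self.children[name] = child
        return child

    @property
    def path(self) -> str:
        if self.parent is None:
            return f"/{self.name}"
        return f"{self.parent.path}/{self.name}"


def resolve(root: Agent, path: str) -> Agent:
    """Walk the tree from the root, one path segment at a time."""
    parts = [part for part in path.split("/") if part]
    if not parts or parts[0] != root.name:
        raise KeyError(path)
    node = root
    for part in parts[1:]:
        node = node.children[part]
    return node


def send(root: Agent, path: str, message: str) -> None:
    """Fire-and-forget: enqueue the message and return without a reply."""
    resolve(root, path).inbox.append(message)


root = Agent("root")
agent_a = root.spawn("agent_a")
tests = agent_a.spawn("tests")
send(root, "/root/agent_a/tests", "run unit tests")
print(tests.path, list(tests.inbox))
```

Because the address is derived from the tree, there is no separate ID registry to keep in sync when agents are spawned.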

JiT testing over test suites

Meta (Feb 2026): LLM generates ephemeral tests per-change. Traditional testing cannot keep pace with agentic velocity.

Sources

Source | URL | Focus
Nate’s Newsletter | natesnewsletter.substack.com | AI practitioner strategy, MCP, workflow optimization
Where’s Your Ed At | wheresyoured.at | AI financial sustainability critique
Anthropic 2026 Agentic Coding Trends | resources.anthropic.com | Harness patterns, industry data
Anthropic Harness Design Blog | anthropic.com/engineering | Three-agent harness architecture
Meta JiTTests | engineering.fb.com | Testing paradigm shift
GitHub Spec Kit | github.com/github/spec-kit | Spec-driven scaffolding
GitHub Copilot SDK | github.com/github/copilot-sdk | Composable agent runtime
AWS Kiro | kiro.dev | Spec-driven agentic IDE
Google ADK | google.github.io/adk-docs | Model-agnostic agent framework
Microsoft Agent Governance Toolkit | github.com/microsoft/agent-governance-toolkit | Cross-vendor agent governance
Google Scion | github.com/GoogleCloudPlatform/scion | Cross-vendor agent orchestration
SurePath AI | surepath.ai | MCP-specific runtime governance
Pinterest MCP (InfoQ) | infoq.com/news/2026/04/pinterest-mcp-ecosystem | Enterprise MCP case study
SDD Framework Map | Medium (30+ frameworks) | Landscape map
NIST AI Agent Standards | csrc.nist.gov | Regulatory direction
