Designing AI agents to resist prompt injection
read at source ↗ openai.com
Designing AI agents to resist prompt injection
Source: OpenAI Date: 2026-03-11 URL: https://openai.com/index/designing-agents-to-resist-prompt-injection
Summary
OpenAI’s research and guidance post on making AI agents resistant to prompt injection attacks — adversarial instructions embedded in content the agent processes (emails, web pages, documents) that attempt to hijack the agent’s behavior. Covers detection approaches, architectural defenses, and evaluation methodologies for injection resistance.
Implications
Agentic security thread. Prompt injection is the top security concern for deployed AI agents — any agent that reads external content is potentially vulnerable to content that contains adversarial instructions. OpenAI publishing design guidance for injection resistance signals that they’ve done enough internal evaluation to have concrete recommendations. The March 2026 date aligns with agents going into wider production deployment; the guidance is timely for developers building on the Responses API. This is the security equivalent of OWASP for LLM applications.