2024-08-08 · OpenAI

GPT-4o System Card

Source: OpenAI Date: 2024-08-08 URL: https://openai.com/index/gpt-4o-system-card

Summary

Title-only: The formal safety documentation for GPT-4o, OpenAI’s first natively multimodal model integrating text, audio, and vision in a single architecture. The system card covers capability evaluations, safety mitigations, and residual risks across modalities: voice cloning in the audio channel, identification of real people via vision, elicitation of CBRN information through text, and cross-modal jailbreak vectors that did not exist in text-only models. It was published in August 2024, shortly after GPT-4o’s May launch and before the full release of audio mode.

Implications

The multimodal safety complexity thread. GPT-4o’s system card is a landmark safety document because it is among the first to address risks that emerge from combining modalities rather than from any single channel in isolation. A text-only model has text risks; a vision model has image risks; a model that can see, hear, and speak simultaneously opens novel attack surfaces: audio instructions hidden in ambient sound, vision inputs that bypass text-level content filters, and voice outputs that can be personalized and cloned. The safety work required plausibly scales faster than linearly with modality count, because each combination of modalities, not just each modality, has to be evaluated.

The capability-safety documentation race. GPT-4o’s system card is also a marker in OpenAI’s relationship with its own safety governance. Publishing the system card is the accountability mechanism, but the card is necessarily retrospective: it is written after the capability exists and the mitigations are implemented. As capabilities advance more quickly, the gap between “this exists” and “we understand its safety profile” widens. The August 2024 GPT-4o system card is early evidence of that tension: full audio mode was withheld at launch specifically because safety evaluation wasn’t complete.
