2025-08-04 · OpenAI

What we’re optimizing ChatGPT for

Source: OpenAI · Date: 2025-08-04 · URL: https://openai.com/index/optimizing-chatgpt

Summary

OpenAI published a statement on what it is explicitly optimizing ChatGPT for, covering the values, behaviors, and product goals that drive ChatGPT’s development. The post addressed questions about helpfulness, honesty, harm avoidance, and the balance between user preferences and broader societal considerations.

Implications

Product/alignment thread. “What we’re optimizing for” is a product values document responding to accumulated criticism of ChatGPT’s behavior: sycophancy, over-refusals, and inconsistent values across contexts. Publishing explicit optimization targets is both a transparency move and a commitment device: it creates a standard against which future model behavior can be measured. The August 2025 timing follows the sycophancy incident of April 2025 and comes just before GPT-5’s capabilities were rapidly deployed across products. For alignment researchers, the document is interesting for what it reveals about the objective function behind ChatGPT’s RLHF training, namely where product teams say they want the model to be versus where training pressure actually lands. That gap between stated and revealed preferences is itself a research topic.