2025-09-05 · Nate's Newsletter

Make ChatGPT-5 Write Like a Real Human: An Easy Jailbreak Kit for a Stubborn Model

Source: Nate’s Newsletter
Date: 2025-09-05
URL: https://natesnewsletter.substack.com/p/make-chatgpt-5-write-like-a-real

Summary

Nate argues that ChatGPT-5 is optimized for AI-to-AI quality ratings rather than human readability. It defaults to corporate jargon and complexity because AI evaluators rate “sophisticated vocabulary and complex structures” as high-quality. The practical fix he offers is a set of prompt techniques, drawn from patterns across thousands of conversations, that override this trained behavior.

Implications

Vendor positioning thread. If AI-to-AI optimization is pulling outputs away from human readability, that is a structural product-quality problem at OpenAI, not just a prompting quirk. If the research Nate cites holds, ChatGPT-5’s default outputs are calibrated for the wrong audience, and OpenAI may have little incentive to fix this because human users can work around it.

Agent product strategy thread. Agents whose outputs are evaluated by other AI systems (pipelines, multi-agent chains) face the same optimization trap: they will produce “impressive to AI” outputs that fail human review. Designing evaluation criteria explicitly for human readability is a necessary counterweight.
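
One way to picture a human-readability criterion is a cheap, deterministic gate that runs alongside any AI-based scoring. The sketch below is an illustration only, not something from Nate's post: the jargon list, the two metrics, and the thresholds are all hypothetical placeholders that a team would need to tune against real human review.

```python
import re

# Hypothetical jargon list for illustration; not taken from the source article.
JARGON = {"leverage", "synergy", "utilize", "robust", "holistic", "paradigm"}


def readability_report(text: str) -> dict:
    """Return two crude readability signals for a piece of text."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return {"avg_sentence_len": 0.0, "jargon_ratio": 0.0}
    jargon_hits = sum(1 for w in words if w.lower() in JARGON)
    return {
        "avg_sentence_len": len(words) / len(sentences),
        "jargon_ratio": jargon_hits / len(words),
    }


def passes_human_readability(
    text: str,
    max_sentence_len: float = 20.0,   # assumed threshold, tune per audience
    max_jargon_ratio: float = 0.05,   # assumed threshold, tune per audience
) -> bool:
    """Gate an output on the signals above, independent of any AI rater."""
    report = readability_report(text)
    return (
        report["avg_sentence_len"] <= max_sentence_len
        and report["jargon_ratio"] <= max_jargon_ratio
    )


plain = "We fixed the bug. The app loads faster now."
dense = "We leverage robust synergy to utilize a holistic paradigm."
print(passes_human_readability(plain))
print(passes_human_readability(dense))
```

In a pipeline, a check like this would sit next to the AI evaluator's score rather than replace it, so that an output rated highly by another model can still be flagged for human review when it reads as dense or jargon-heavy.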

Watch: Whether OpenAI addresses the AI-to-AI optimization problem in subsequent fine-tuning rounds, or whether prompt engineering workarounds remain the user-side solution indefinitely.
