Gemini CLI v0.41.0-preview.0 — real-time voice mode + Gemma 4 support
Summary
Gemini CLI v0.41.0-preview.0 (April 30) ships the first real-time voice mode in any CLI coding agent, with both cloud and local backends. The release also adds experimental Gemma 4 model support (local models in the agent), new ContextManager and AgentChatHistory wiring, a persistent auto-memory scratchpad for skill extraction, workspace trust enforcement in headless mode, secured .env loading, async boot optimization, output redirection, and manual session UUIDs.
Implications
Voice mode is a paradigm shift for CLI agents: until now, all interaction has been text-in, text-out. A voice backend changes who can use a coding agent and how. The local backend option means voice works offline, which ties into the local-first thesis. Experimental Gemma 4 support signals that Google is embedding its own open-weight models directly into its agent, making this the first CLI agent with built-in local model support rather than reliance on external runtimes like Ollama.
- Agent layer thread: voice adds a new interaction modality, potentially expanding the user base beyond keyboard-native developers
- Context management thread: new ContextManager + AgentChatHistory is the next iteration of Gemini’s already-leading context architecture (UCM + Chapters + ContextCompressionService)
- Local model thread: Gemma 4 in Gemini CLI means Google’s agent runs Google’s model locally, which is vertical integration at the agent level
- Security thread: workspace trust in headless mode addresses the CI/CD agent deployment surface