Start building with Gemini 2.0 Flash and Flash-Lite
Source: DeepMind
Date: 2025-02-25
URL: https://deepmind.google/blog/start-building-with-gemini-20-flash-and-flash-lite/
Summary
Google made Gemini 2.0 Flash and Flash-Lite generally available (GA), with Flash-Lite priced at $0.10 per million input tokens, 33% cheaper than prior Flash pricing. Flash-Lite outperforms 1.5 Flash across reasoning, multimodal, math, and factuality benchmarks. Reported real-world results: Dawn AI cut search times from hours to under a minute and costs by 90%; Daily.co uses Flash-Lite for voicemail detection; Mosaic cut video editing from hours to seconds. The post also emphasizes fast time to first token (TTFT) for voice AI applications.
Implications
$0.10 per million input tokens is the cost floor for programmatic AI in Q1 2025. At this price, high-volume programmatic use cases (document processing, data enrichment, content moderation) become economically viable at a scale that previously required specialized, cheaper models. This is the price point behind Dawn AI's reported 90% cost reduction.
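To make the cost-floor claim concrete, here is a minimal back-of-the-envelope sketch. The $0.10 input price comes from the post; the $0.40 output price is the October 2025 figure from the Watch list below, read here as the per-million output rate. The workload (one million documents, 2,000 input and 200 output tokens each) and the names `Pricing` and `job_cost` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-token prices, expressed in USD per one million tokens."""
    input_per_million: float
    output_per_million: float


def job_cost(pricing: Pricing, docs: int,
             input_tokens_per_doc: int, output_tokens_per_doc: int) -> float:
    """Estimate the total USD cost of a batch job under flat per-token pricing."""
    input_tokens = docs * input_tokens_per_doc
    output_tokens = docs * output_tokens_per_doc
    input_cost = input_tokens * pricing.input_per_million / 1_000_000
    output_cost = output_tokens * pricing.output_per_million / 1_000_000
    return input_cost + output_cost


# Assumed prices: $0.10 input (from the post), $0.40 output (October 2025 note).
flash_lite = Pricing(input_per_million=0.10, output_per_million=0.40)

# Hypothetical workload: 1M documents, 2,000 tokens in and 200 tokens out each.
cost = job_cost(flash_lite, docs=1_000_000,
                input_tokens_per_doc=2_000, output_tokens_per_doc=200)
print(f"Estimated batch cost: ${cost:,.2f}")
```

Under these assumptions the whole batch comes to roughly $280, which is the order of magnitude that makes document processing and data enrichment at this volume plausible. Actual bills depend on real token counts and on any pricing tiers not modeled here.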
The named customer results are more credible than benchmarks. Dawn AI, Daily.co, and Mosaic are real companies reporting production outcomes, not eval numbers on academic benchmarks. “Hours to under a minute” and “90% cost reduction” are the signals enterprise buyers need. Google is learning to lead with customer results, not just model scores.
The TTFT emphasis signals voice AI as the primary Flash-Lite use case. Fast time to first token (TTFT) determines how quickly a conversational voice agent can start responding, whereas batch processing cares only about total completion time. Positioning Flash-Lite as the voice AI model targets the ElevenLabs/OpenAI Realtime API/Azure Speech market, a rapidly growing and sticky enterprise segment.
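As a rough illustration of what TTFT measures, the sketch below times the gap between starting to consume a streamed response and receiving its first chunk, alongside total completion time. It works on any Python iterable of text chunks; `fake_stream` is a hypothetical stand-in for a real streaming response, and its delays are invented for the example rather than measured from any model.

```python
import time
from typing import Iterable, Iterator, Tuple


def measure_ttft(stream: Iterable[str]) -> Tuple[float, float, str]:
    """Return (time_to_first_chunk, total_time, full_text) in seconds."""
    start = time.perf_counter()
    first_at = None
    parts = []
    for chunk in stream:
        if first_at is None:
            # Record the moment the first chunk arrives.
            first_at = time.perf_counter()
        parts.append(chunk)
    end = time.perf_counter()
    if first_at is None:
        raise ValueError("stream produced no chunks")
    return first_at - start, end - start, "".join(parts)


def fake_stream(first_delay: float = 0.05,
                chunk_delay: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model response."""
    time.sleep(first_delay)  # simulated wait before the first token
    yield "Hello"
    for piece in (", ", "world"):
        time.sleep(chunk_delay)  # simulated gap between later chunks
        yield piece


ttft, total, text = measure_ttft(fake_stream())
print(f"TTFT: {ttft * 1000:.0f} ms, total: {total * 1000:.0f} ms, text: {text!r}")
```

For a voice agent, the first number bounds how soon speech synthesis can begin; for a batch job, only the second number matters.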
Watch:
- Flash-Lite pricing trajectory: launched at $0.10 per million input tokens in February 2025; by October 2025 Flash-Lite was $0.10/$0.40 with reasoning included. The price held while capability increased
- Voice AI adoption: which conversational AI platforms default to Flash-Lite for production deployments?
- Whether the 90% cost reduction pattern holds across other customers switching from 1.5 Flash