Gemini 3.1 Flash Live

Gemini 3.1 Flash Live is Google’s streaming speech-to-speech voice model. It replaces the classic STT to LLM to TTS pipeline with a single native speech model. It is marketed as Google’s biggest voice upgrade, with lower latency, noise-robust listening, interruption handling and multimodal vision input.

Key facts

  • Native speech-to-speech: no intermediate text transcription step, so prosody, sarcasm and stress reach the reasoning layer intact
  • Beats Gemini 2.5 Flash by ~19% on multi-step function calling and outperforms competitor models on the Audio Multi-Challenge benchmark
  • Supports multimodal vision: the agent can watch a webcam or screen-share feed and reason about what it sees
  • Over 70 supported languages, enabling real-time translation use cases
  • Free tier available in Google AI Studio with no API key; paid tier removes the ‘training on your data’ clause and raises rate limits
  • Pricing: roughly 14 cents per 10-minute call on the paid tier
  • Current limitation: stops speaking during function calls — cannot narrate over tool execution the way a prompted Vapi agent can
  • Deployment beyond Google AI Studio requires managing persistent websocket connections — less plug-and-play than ElevenLabs or Vapi for web embedding
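
To put the pricing figure above in context, here is a minimal back-of-the-envelope sketch in Python. It assumes the "roughly 14 cents per 10-minute call" figure can be treated as a flat per-minute rate, which is a simplification: actual billing may be metered differently (for example, per token). The function names and the 30-day month are illustrative choices, not part of any Google API.

```python
# Assumption: the ~$0.14 per 10-minute call figure quoted above, applied as a
# flat rate. This is an estimate only, not Google's actual billing formula.
PAID_TIER_USD_PER_10_MIN = 0.14


def estimate_call_cost(minutes: float,
                       usd_per_10_min: float = PAID_TIER_USD_PER_10_MIN) -> float:
    """Approximate cost in USD of a single call lasting `minutes`."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * usd_per_10_min / 10


def estimate_monthly_cost(calls_per_day: int, avg_minutes: float,
                          days: int = 30) -> float:
    """Approximate monthly cost in USD for a steady daily call volume."""
    return calls_per_day * days * estimate_call_cost(avg_minutes)


if __name__ == "__main__":
    # Hypothetical workload: 100 calls per day, 5 minutes each.
    print(f"Per call:  ${estimate_call_cost(5):.2f}")
    print(f"Per month: ${estimate_monthly_cost(100, 5):.2f}")
```

Under these assumptions, 100 five-minute calls per day comes to 15,000 minutes a month, or about $210. Treat this as an order-of-magnitude check and confirm against current pricing before budgeting.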

Source references

  • [src-007] Nate Herk, Voice AI agents cluster (4 videos). Videos referenced: Qt3zMBH-FNg

Robin Cartier perspective

This page is part of Robin Cartier's working AI knowledge graph: a practical research layer for production AI, recommendation systems, experimentation, GEO, and agentic web readiness.

The useful next step is to connect this concept back to applied product leadership and operating models.
