Gemini Robotics-ER
Gemini Robotics-ER is Google DeepMind’s embodied reasoning model family for robotics. Version 1.6 improves spatial reasoning, multi-view understanding, instrument reading, and physical safety reasoning for autonomous robots.
Key facts
- Type: Robotics / embodied reasoning model
- Version covered: Gemini Robotics-ER 1.6 [src-039]
- Publisher: Google DeepMind
- Available through: Gemini API and Google AI Studio [src-039]
- Role in robot stack: High-level reasoning model for visual/spatial understanding, task planning, success detection, and tool calls [src-039]
- Tool-calling surface: Can call Google Search, vision-language-action models, or third-party user-defined functions [src-039] (see the sketch after this list)
- Improvements: Better pointing, counting, success detection, spatial/physical reasoning, multi-view reasoning, instrument reading, and physical safety constraint adherence versus prior baselines [src-039]
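Because the model is served through the Gemini API, user-defined tools can be attached as ordinary function declarations. The following is a minimal sketch using the google-genai Python SDK; the `move_gripper` tool and the model id are illustrative assumptions, not a published integration.

```python
# Sketch: wiring a user-defined tool into a Gemini Robotics-ER call.
# The move_gripper tool and the model id are hypothetical.
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

# Hypothetical third-party function the model may choose to call,
# e.g. to hand a motion request down to a vision-language-action model.
move_gripper = types.FunctionDeclaration(
    name="move_gripper",
    description="Move the robot gripper to a named waypoint.",
    parameters=types.Schema(
        type=types.Type.OBJECT,
        properties={"waypoint": types.Schema(type=types.Type.STRING)},
        required=["waypoint"],
    ),
)

response = client.models.generate_content(
    model="gemini-robotics-er-1.6",  # assumed model id
    contents="Pick up the mug on the left side of the table.",
    config=types.GenerateContentConfig(
        tools=[types.Tool(function_declarations=[move_gripper])],
    ),
)

# If the model decided a tool call is needed, it appears as a function_call part.
for part in response.candidates[0].content.parts:
    if part.function_call:
        print(part.function_call.name, dict(part.function_call.args))
```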
What it does
Gemini Robotics-ER 1.6 bridges high-level language and physical robot action by reasoning over visual scenes, camera views, task state, instruments, and safety constraints. It can use pointing as an intermediate reasoning representation (see the sketch below), combine camera views to decide whether a task is complete, and use agentic vision with code execution to read physical instruments more accurately [src-039].
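As a concrete illustration of pointing as an intermediate representation, the sketch below asks the model for 2D points for named objects in a camera frame. The model id and the exact output schema are assumptions here; earlier Robotics-ER releases document pointing output as JSON with [y, x] coordinates normalized to a 0-1000 range.

```python
# Sketch: requesting 2D object points from a camera frame.
# Assumes the model returns bare JSON in the documented pointing format;
# production code should also handle fenced or malformed output.
import json

from google import genai
from google.genai import types

client = genai.Client()

with open("workspace_camera.jpg", "rb") as f:
    frame = types.Part.from_bytes(data=f.read(), mime_type="image/jpeg")

response = client.models.generate_content(
    model="gemini-robotics-er-1.6",  # assumed model id
    contents=[
        frame,
        'Point to the mug and the screwdriver. Answer as a JSON list of '
        '{"point": [y, x], "label": <name>} with coordinates normalized to 0-1000.',
    ],
)

# e.g. [{"point": [412, 633], "label": "mug"}, ...]
for item in json.loads(response.text):
    y, x = item["point"]
    print(f'{item["label"]}: ({x / 1000:.2f}, {y / 1000:.2f}) in image fractions')
```

Points like these can feed directly into downstream grasp planners or serve as grounding for success detection across multiple camera views.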
Related concepts
- Embodied Reasoning
- Robotic Success Detection
- Robotic Instrument Reading
- Agentic Vision
- Physical Safety Constraints for Robots
- Agentic AI
Source references
- [src-039] Laura Graesser and Peng Xu — “Gemini Robotics-ER 1.6: Powering real-world robotics tasks through enhanced embodied reasoning” (2026-04-14)