Project Genie adds Google Street View integration and goes live for global AI Ultra users

Ground your snow-globe worlds in real-world locations from Google Maps.

xAI has updated its Grok Imagine system to version 1.5, adding an image-to-video model that converts still images into short video clips at up to 720p resolution. The new model accepts text prompts to guide motion and style, and multiple generated clips can be joined into longer sequences.

NVIDIA has released Cosmos 3, an open omnimodal foundation model that combines a vision-language reasoning component with a diffusion-based video generator in a two-tower architecture. The system is designed to support physical AI applications by linking language-grounded reasoning with the generation of plausible world states and robot actions.

NVIDIA used GTC Taipei to unveil several new tools aimed at physical AI applications, including a new world model, a larger autonomous driving model, and an open reference platform for humanoid robots. The announcements signal a continued push to make simulation and synthetic data central to how robots and vehicles are trained. Here is a closer look at what was shown and why it matters.