gen‑ai.news

Amazon, Nvidia, and AMD bet $310 million on AI startup building 3D world models

Odyssey ML has closed a $310 million funding round backed by three of the largest names in compute and cloud infrastructure: Amazon, Nvidia, and AMD. The deal values the startup at $1.45 billion, a figure that reflects growing investor conviction that world models represent a meaningful next step beyond large language models. Other participants include In-Q-Tel (IQT), a venture fund with ties to the CIA, and Jeff Dean, chief scientist at Google. Their involvement lends the round both strategic and technical weight.

World models are AI systems trained to build internal representations of environments. Rather than modeling only language or flat images, they produce structured, spatially coherent depictions of how the physical world looks and behaves. For generative media, this has direct implications: a capable 3D world model could power scene generation, virtual environment construction, and simulation at a level of consistency and controllability that current diffusion-based image and video tools struggle to achieve.

The involvement of hardware makers Nvidia and AMD is notable. Both companies have a direct commercial interest in the workloads that world models would require, since training and running these systems is expected to be significantly more compute-intensive than standard image or video generation. Amazon's participation likely reflects interest in both cloud infrastructure and potential applications in areas like robotics and spatial computing, where AWS and Amazon's broader hardware efforts have been expanding.

Odyssey ML is entering a space that has attracted increasing attention from researchers and investors alike. Meta, Google DeepMind, and several well-funded startups have all been developing world modeling capabilities, particularly in the context of physical AI and autonomous systems. Odyssey ML has not fully detailed publicly what distinguishes its approach, or how its 3D models will be made available, whether as research tools, APIs, or integrated products. Still, the scale of this round suggests its backers see a credible path to building foundational infrastructure for spatially aware generative AI.

