Description

NVIDIA introduces DreamZero, a 14-billion-parameter World Action Model that jointly predicts future video and robot actions from a video diffusion backbone. Unlike Vision-Language-Action models, which fail on physically novel tasks, DreamZero achieves more than a 2x improvement on generalization benchmarks and enables zero-shot transfer to unseen tasks such as untying shoelaces. The results suggest that the path to better robot policies runs through better video generation.