AI that learns from videos could be smarter in real life
Meta and NYU researchers found that teaching AI using video clips (in addition to text and images) makes it much better at understanding how things work in real life.
As online text runs out, video gives AI a richer way to learn about cause and effect, for example working out what to do when told to step out of a shadow.
How video training helps AI
The team trained AI models from scratch on large amounts of uncaptioned video, plus some images and text.
With this approach, the AI learned to predict how scenes change almost entirely from ordinary video footage, with very little specialized data needed.
Interestingly, training on uncaptioned video kept language skills strong, whereas training on captioned images made them worse.
Real-world understanding for AI
This could be a big deal for robots and self-driving cars, which need to understand physics and react in the real world.
The researchers think video-trained AIs will be considerably better at navigating and interacting with their surroundings, which would make them more capable in everyday situations.