Mira Murati introduces interaction models for fluid audio-video conversations
Thinking Machines founder Mira Murati just introduced "interaction models," a new kind of AI that processes what you say or show it (audio and video) and responds almost immediately instead of waiting for awkward turns.
The whole idea is to make talking to AI feel more like chatting with a person, fixing those annoying delays we're used to.
Thinking Machines models respond in under 0.4s
These models use a "full-duplex" setup, breaking conversations into tiny 200-millisecond chunks for super-fast replies.
One model, with a massive 276 billion parameters, handles quick chats while another juggles deeper tasks like reasoning or searching the web at the same time.
Thanks to "encoder-free early fusion," responses come in under 0.4 seconds, faster than Google's Gemini-3.1-flash-live on FD-bench.
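To get a feel for what 200-millisecond chunks mean in practice, here is a minimal Python sketch that slices an audio stream into chunks of that length. It is only an illustration, not Thinking Machines' implementation: the 16 kHz sample rate and the function names `samples_per_chunk` and `split_into_chunks` are assumptions made up for this example.

```python
from typing import Iterator, List, Sequence

CHUNK_MS = 200  # chunk duration mentioned in the article


def samples_per_chunk(sample_rate_hz: int, chunk_ms: int = CHUNK_MS) -> int:
    """Number of audio samples that fit in one chunk at the given sample rate."""
    return sample_rate_hz * chunk_ms // 1000


def split_into_chunks(
    samples: Sequence[int],
    sample_rate_hz: int,
    chunk_ms: int = CHUNK_MS,
) -> Iterator[List[int]]:
    """Yield consecutive chunks of audio; the final chunk may be shorter."""
    size = samples_per_chunk(sample_rate_hz, chunk_ms)
    for start in range(0, len(samples), size):
        yield list(samples[start:start + size])


if __name__ == "__main__":
    rate = 16_000              # assumed sample rate, not stated in the article
    one_second = [0] * rate    # one second of silent placeholder audio
    chunks = list(split_into_chunks(one_second, rate))
    print(len(chunks), len(chunks[0]))  # prints: 5 3200
```

At that chunk size, one second of audio becomes five chunks, so a reply that lands in under 0.4 seconds arrives within roughly two chunks of the speaker's input.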
For now, only research partners get access, but a broader public rollout is expected later this year.
This could be huge for anything needing instant feedback: think health care alerts or real-time customer help.