OpenAI has announced major upgrades to its Realtime API, intended to make it easier for developers and businesses to build more efficient voice agents. The company also launched gpt-realtime, which it describes as its most advanced speech-to-speech model yet. Both moves fit a broader industry trend this year toward AI agents that handle tasks on behalf of users.

Feature expansion
API update includes remote MCP servers and image inputs
The latest update to the Realtime API adds support for remote Model Context Protocol (MCP) servers, image inputs, and phone calling through the Session Initiation Protocol (SIP). These additions are aimed at giving voice agents access to more tools and context so they can assist users better. They also simplify the process of connecting AI models with data sources, for developers and users alike.
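
To make the remote MCP support concrete, here is a minimal sketch of the kind of session configuration event a developer might send to attach a remote MCP server to a voice agent. It only builds and prints the JSON payload and makes no network call. The helper name build_session_update, the example URL, and the field names (type, server_label, server_url, require_approval) are assumptions for illustration; other session fields are omitted, and the current Realtime API reference should be checked before relying on any of them.

```python
import json


def build_session_update(server_label: str, server_url: str) -> dict:
    """Build a session-update event that registers one remote MCP server as a tool.

    Hypothetical helper: the event and field names below are assumptions,
    not a confirmed schema.
    """
    return {
        "type": "session.update",
        "session": {
            "tools": [
                {
                    "type": "mcp",                  # assumed tool type for remote MCP servers
                    "server_label": server_label,   # short name the agent uses for this server
                    "server_url": server_url,       # where the remote MCP server is hosted
                    "require_approval": "always",   # assumed setting: ask before each tool call
                }
            ]
        },
    }


if __name__ == "__main__":
    # Placeholder label and URL, not a real MCP server.
    event = build_session_update("docs", "https://example.com/mcp")
    print(json.dumps(event, indent=2))
```

In this sketch the agent gains tools by pointing the session at a server URL rather than by wiring each integration by hand, which is the simplification the update is described as offering.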

Privacy assurance
MCP open standard ensures user privacy
MCP, an open standard, is a key part of the Realtime API update. It is meant to let these connections be made while keeping user data and privacy at the forefront. The company hopes the expanded capabilities will make AI tools more helpful by giving them more information to work with.