'MAI-Image-2': Microsoft's new AI model generates photorealistic images
What's the story
Microsoft has unveiled its latest text-to-image artificial intelligence (AI) model, MAI-Image-2. The new tool is designed to enhance creative workflows by focusing on photorealism, reliable text rendering, and detailed scene generation. The company says the model produces images with natural lighting, accurate skin tones, and realistic environments while reducing post-production work.
Information
Addressing text generation challenge
One of the key upgrades in MAI-Image-2 is its ability to render text inside images consistently. This allows for more accurate infographics, slides, posters, and diagrams. The feature addresses a long-standing limitation of image generation models, in which text often appears distorted.
Model ranking
MAI-Image-2 performance and ranking
As of March 19, Microsoft's MAI-Image-2 ranks fifth among the 51 models tracked by Arena.ai. Three of the top five spots are held by Google's Gemini models: 3.1 Flash, 3 Pro Image 2K, and 3 Pro Image. OpenAI's GPT-image-1.5 is second in the rankings. Arena.ai is a crowdsourced platform that ranks large language models (LLMs) and other AI models based on user preferences.
Advanced features
Enhancements for creative professionals
MAI-Image-2 also excels at complex and imaginative outputs, enabling cinematic compositions and surreal visuals. The model was developed with input from photographers, designers, and visual storytellers to better meet practical creative needs. It is now available for preview through the MAI Playground, a public testing environment for Microsoft's in-house AI models.
Integration plans
Integration and API access
Microsoft plans to gradually integrate MAI-Image-2 across its ecosystem, including Microsoft Copilot and Bing Image Creator. API access is currently limited to select enterprise customers such as WPP, with broader developer availability expected soon via Microsoft Foundry. The company has also promised further updates as it expands its next-generation AI infrastructure and model capabilities.