gen‑ai.news

Microsoft readies new MAI voice and image models for Build 2026

Microsoft is reportedly preparing to unveil several new models at its Build 2026 developer conference, all grouped under the MAI branding the company has been building out in recent months. The models include MAI-Image-2.5, a generative image model, MAI-Transcribe-1.5 for speech-to-text tasks, and MAI-Voice-2, which is described as supporting multiple languages.

The MAI line represents Microsoft's effort to develop AI models in-house rather than relying entirely on third-party providers, including its close partner OpenAI. Earlier MAI models were introduced quietly through Azure AI Foundry and Microsoft's API offerings, positioning them as practical tools for enterprise developers rather than consumer-facing products.

A multilingual voice model is particularly notable given the competitive landscape in speech synthesis and real-time translation. Accurate, natural-sounding voice generation across languages remains a difficult problem, and enterprise demand for such capabilities in products like Teams, Copilot, and customer service tooling is significant. MAI-Transcribe-1.5 likewise fits into Microsoft's broader push to improve real-time and asynchronous transcription across its productivity suite.

Build 2026 is shaping up to be a dense event for AI announcements, and the MAI model family will likely be positioned as part of Microsoft's Azure AI platform strategy. Developers using Azure AI Foundry would be a primary audience, with the models potentially available through standard API access shortly after the event. Whether the image model competes directly with offerings like DALL-E or takes a different approach, such as focusing on editing or enterprise document workflows, remains to be seen ahead of the official unveiling.

