Alibaba unveils new AI model with multimodal capabilities

Reuters

The model, released as open source, aims to strengthen Alibaba's presence in the generative AI sector.

Alibaba Group has introduced Qwen2.5-Omni-7B, the latest addition to its Qwen AI family, designed to handle multiple input types, including text, images, audio, and video. Unveiled Thursday, the model brings enhanced AI features to a range of devices and strengthens Alibaba's position in the generative AI market.

According to Alibaba, the model can generate real-time responses in both text and audio formats, making it valuable for applications such as assisting visually impaired users with audio descriptions or providing step-by-step cooking guidance based on ingredient analysis.

By expanding beyond text-based AI, the model reflects the rising demand for more versatile AI systems. Alibaba's foundational Qwen models are already widely used by developers and are regarded as among China's leading alternatives to DeepSeek's V3 and R1 models.
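
Because the weights are openly released, developers can experiment with the model directly. The snippet below is a minimal sketch, not taken from Alibaba's materials: it assumes the public Qwen/Qwen2.5-Omni-7B checkpoint on Hugging Face and a transformers release that ships Qwen2.5-Omni support; the class names and the exact behaviour of generate() may differ between versions.

```python
# Minimal sketch (assumptions noted): load the open-weights Qwen2.5-Omni-7B
# checkpoint with Hugging Face transformers and run a text-only prompt.
import torch
from transformers import Qwen2_5OmniForConditionalGeneration, Qwen2_5OmniProcessor

MODEL_ID = "Qwen/Qwen2.5-Omni-7B"  # public checkpoint name; assumed here

model = Qwen2_5OmniForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = Qwen2_5OmniProcessor.from_pretrained(MODEL_ID)

# Text-only prompt for brevity; the same chat template also accepts image,
# audio and video segments in the "content" list.
conversation = [
    {"role": "user",
     "content": [{"type": "text",
                  "text": "Give step-by-step guidance for making an omelette."}]}
]
text = processor.apply_chat_template(conversation, add_generation_prompt=True, tokenize=False)
inputs = processor(text=text, return_tensors="pt").to(model.device)

# Assumption: with speech output enabled (the default in the released
# checkpoints), generate() returns token ids plus a synthesized waveform.
text_ids, audio = model.generate(**inputs, max_new_tokens=256)
print(processor.batch_decode(text_ids, skip_special_tokens=True)[0])
```
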
