Alibaba Launches Multimodal AI Model That Processes Audio and Video on Mobile Devices

Alibaba Unveils Qwen2.5-Omni-7B, a Multimodal AI Model for Mobile Devices

Alibaba introduced Qwen2.5-Omni-7B, a multimodal AI model that processes text, images, audio, and video on phones, tablets, and laptops. Despite having only 7 billion parameters, it outperforms competitors on some benchmarks, supports real-time responses, and is open-source on Hugging Face, GitHub, and ModelScope.

Potential uses include real-time audio descriptions for visually impaired users and step-by-step cooking guidance based on an analysis of the ingredients. The Qwen series has become an important tool for AI developers in China and one of the main alternatives to DeepSeek’s V3 and R1 models.
