AI Lab Tracker
Ming-Omni
model
2025-05-28
Ant Group
Ant Group's native multimodal model, built on the Ling backbone. It handles vision, speech, audio, and music.
Paper (arXiv)
GitHub
HuggingFace
Blog Post
Paper: arXiv:2506.09344
multimodal
audio
open-weight
Related
ming-flash-omni-2
ling