AI Lab Tracker
HaploVL
model
2025-03-18
Tencent
A single-transformer baseline for multimodal understanding that simplifies vision-language model architecture. Published at ICML 2025.
Paper (arXiv)
GitHub
Venue: ICML 2025
Tags: multimodal, vision, research