AI Lab Tracker
ViT-Lens
paper
2023-08-20
Tencent
ViT-Lens works toward omni-modal representations by extending ViT to additional modalities (such as 3D and audio) via lightweight lens modules. Published at CVPR 2024.
Paper v1 (arXiv)
Paper v2 (arXiv)
GitHub
Project Page
Paper
Venue
CVPR 2024
arXiv
HTML
multimodal
vision
research