ViT-Lens | Lab Index

Towards omni-modal representations by extending ViT to additional modalities (3D, audio, etc.) via lightweight lens modules. Published at CVPR 2024.
