Naive Dynamic Resolution for variable-resolution visual processing. Available in 2B, 8B, and 72B variants.

Paper

arXiv: 2409.12191

multimodal, open-weight, vision

Related