Technical paper on a 4B-parameter vision-language model that uses Layout-as-Thought mechanisms.

Model Details

Parameters: 4B
vision, multimodal