dots.ocr
Compact 3B vision-language model that parses document layout, text, tables, and formulas in ~100 languages within a single model. It is a breakout open-source hit for hi lab, with ~9K GitHub stars, 1.3K+ Hugging Face likes, and 275K+ downloads. Built on a 1.7B LLM with a vision encoder, it is MIT-licensed and has been adopted into serving frameworks such as SGLang. Succeeded by dots.mocr in March 2026.
Model Details
Architecture: Dense
Parameters: 3B
License: MIT