MiniCPM-V 2.6 | Lab Index

MiniCPM-V 2.6 is an 8B-parameter multimodal model built on Qwen2-7B. It is reported to surpass GPT-4V on single-image, multi-image, and video understanding tasks. It supports real-time inference on mobile devices, including iPads, uses spatio-temporal compression for video processing, and supports Needle-in-a-Haystack retrieval over long-context multimodal inputs.
