A major upgrade of SenseTime's flagship large-model family, launched at SenseTime Tech Day in Shanghai. Trained on over 10 TB of tokens, including synthetic data, it adopts a Mixture of Experts (MoE) architecture with an effective context window of roughly 200K tokens. SenseTime claims it outperforms GPT-4 Turbo on multiple benchmarks, including MMBench for multimodal evaluation. The launch also introduced the "Cloud-to-Edge" full-stack deployment matrix.

Model Details

Architecture: Mixture of Experts (MoE)
Context window: ~200,000 tokens
Tags: frontier, multimodal, moe
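The Mixture of Experts design mentioned above routes each token through only a few "expert" sub-networks selected by a learned gate, so compute per token stays small even as total parameters grow. The sketch below is a generic toy illustration of top-k MoE routing in NumPy, not SenseTime's implementation; all names (`moe_forward`, `gate_w`, `expert_ws`) and sizes are hypothetical.

```python
import numpy as np

def moe_forward(x, gate_w, expert_ws, top_k=2):
    """Toy top-k Mixture of Experts layer (illustrative only).

    x:         (tokens, d_model) input activations
    gate_w:    (d_model, n_experts) router weights
    expert_ws: list of (d_model, d_model) expert weight matrices
    """
    logits = x @ gate_w                              # router scores, (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]    # top-k expert indices per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        # softmax over the selected experts' scores only
        sel = logits[t, top[t]]
        weights = np.exp(sel - sel.max())
        weights /= weights.sum()
        # weighted sum of the chosen experts' outputs
        for w, e in zip(weights, top[t]):
            out[t] += w * (x[t] @ expert_ws[e])
    return out

rng = np.random.default_rng(0)
d_model, n_experts, tokens = 8, 4, 3
x = rng.normal(size=(tokens, d_model))
gate_w = rng.normal(size=(d_model, n_experts))
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]
y = moe_forward(x, gate_w, experts)
print(y.shape)  # each token still maps to a d_model-sized output
```

Only `top_k` of the `n_experts` matrices are applied per token, which is why MoE models can claim very large parameter counts at modest inference cost.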