A sparse Mixture-of-Experts (MoE) Transformer with a breakthrough 1M-token context window (up to 10M tokens in research). Near-perfect recall (>99.7%) across the full context, with performance exceeding Gemini 1.0 Ultra at Pro-tier efficiency.

The 1M context window was a defining capability leap, enabling processing of entire codebases, books, and hour-long videos in a single prompt. AA Intelligence Index: 16. Proprietary.
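Because the single-prompt, whole-document workflow is the headline capability here, the following is a minimal sketch of sending a book- or codebase-sized file in one request through the google-generativeai Python SDK. The model identifier, file path, and prompt are illustrative assumptions, not values taken from this entry.

```python
# Minimal long-context usage sketch (google-generativeai SDK).
# The model name "gemini-1.5-pro", the file path, and the prompt below
# are illustrative assumptions, not part of the original entry.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

model = genai.GenerativeModel("gemini-1.5-pro")  # assumed model identifier

# Load an entire book- or codebase-sized text dump; with a ~1M-token
# window, the whole document fits in a single prompt.
with open("entire_codebase.txt", encoding="utf-8") as f:
    corpus = f.read()

response = model.generate_content(
    [corpus, "Summarize the main components and how they interact."]
)
print(response.text)
```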

Model Details

Architecture: Sparse Mixture-of-Experts (MoE) Transformer (see the routing sketch below)
Context window: 1,000,000 tokens
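As a reading aid for the Architecture entry, here is a minimal top-k routing sketch of a sparse MoE feed-forward layer in PyTorch. It illustrates the general technique only; the hidden sizes, expert count, and k are arbitrary assumptions and say nothing about this model's actual configuration.

```python
# Minimal sketch of sparse top-k MoE routing (PyTorch). All sizes, the
# number of experts, and k are arbitrary assumptions for illustration;
# they are not the model's real configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, num_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, num_experts)  # gating network
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
             for _ in range(num_experts)]
        )

    def forward(self, x):
        # x: (num_tokens, d_model). Each token is routed to its top-k experts,
        # so only k of num_experts expert FFNs run per token (the "sparse" part).
        logits = self.router(x)                     # (num_tokens, num_experts)
        weights, idx = logits.topk(self.k, dim=-1)  # top-k experts per token
        weights = F.softmax(weights, dim=-1)        # normalize the k gate weights
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e            # tokens whose slot-th pick is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

# Usage: route a batch of 16 token vectors through the layer.
tokens = torch.randn(16, 512)
layer = SparseMoELayer()
print(layer(tokens).shape)  # torch.Size([16, 512])
```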

Paper

arXiv: 2403.05530

Tags: frontier, multimodal, moe
