Open-source multilingual text embedding model family optimized for agentic AI systems. Decoder-only architecture with last-token pooling and L2 normalization. Trained via contrastive learning on large-scale multilingual data with knowledge distillation from larger embedding models. Supports 94 languages with 32K token context.
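The pooling described above can be sketched in a few lines: take the hidden state of each sequence's final non-padded token, then L2-normalize it. This is a minimal numpy illustration of the general technique, not the model's actual API; the shapes and the `last_token_pool` helper name are assumptions for the example.

```python
import numpy as np

def last_token_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Illustrative last-token pooling with L2 normalization.

    hidden_states: (batch, seq_len, dim) decoder outputs
    attention_mask: (batch, seq_len) with 1 for real tokens, 0 for padding
    """
    # index of the last real (non-padded) token in each sequence
    last_idx = attention_mask.sum(axis=1) - 1
    pooled = hidden_states[np.arange(hidden_states.shape[0]), last_idx]
    # L2-normalize so cosine similarity reduces to a dot product
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)

# toy example: batch of 2 sequences, seq_len 4, hidden dim 3
h = np.random.randn(2, 4, 3)
mask = np.array([[1, 1, 1, 0], [1, 1, 1, 1]])
emb = last_token_pool(h, mask)
print(emb.shape)                                        # (2, 3)
print(np.allclose(np.linalg.norm(emb, axis=1), 1.0))    # True
```

Because the output vectors are unit-length, downstream similarity search needs only dot products rather than full cosine computations.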

The 27B variant achieves #1 on MTEB-v2 (74.3), the industry-standard multilingual embedding benchmark. Designed as a foundational layer for memory, ranking, and orchestration in agent-based systems — enabling cross-source retrieval, persistent memory over extended interactions, and dynamic context updates across multi-step tasks. MIT License.
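As a sketch of the retrieval use case above: with L2-normalized embeddings, scoring a query against a store of memories is a single matrix product. The random vectors below are stand-ins for real model outputs, and the 1024 dimension matches the 0.6B variant; nothing here is the model's real interface.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(x: np.ndarray) -> np.ndarray:
    # L2-normalize along the last axis
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# 5 stored "memory" embeddings and one query (1024-dim, as in the 0.6B variant);
# random placeholders standing in for actual embedding-model outputs
memory = normalize(rng.standard_normal((5, 1024)))
query = normalize(rng.standard_normal(1024))

# unit vectors: dot product == cosine similarity, bounded in [-1, 1]
scores = memory @ query
top_k = np.argsort(scores)[::-1][:3]   # indices of the 3 best matches
print(top_k, scores[top_k])
```

For large memory stores the same dot-product scoring is typically delegated to an approximate nearest-neighbor index rather than a dense matrix product.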

Model Details

Architecture: Dense

Variants

Name          Parameters   Notes
Harrier 27B   27B          MTEB-v2: 74.3, 5376-dim embeddings
Harrier 0.6B  0.6B         MTEB-v2: 69.0, 1024-dim embeddings
Harrier 270M  270M         MTEB-v2: 66.5, 640-dim embeddings
Tags: open-weight, embedding, multilingual