# OpenSeek
OpenSeek is an open-source initiative that unites the global community to develop next-generation language models, inspired by DeepSeek. It uses the DeepSeek V3 MoE architecture (64 experts, top-6 routing). OpenSeek-Small v1, the first-stage model, has 1.4B total / 0.4B active parameters and was trained on 720B tokens. The project addresses three challenges: high-quality data acquisition, algorithmic innovation, and distributed training systems. It uses FlagScale for distributed training.
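To illustrate the top-6 routing over 64 experts mentioned above, here is a minimal sketch of MoE token routing in NumPy. The function names, linear experts, and shapes are hypothetical, chosen for illustration only; they are not OpenSeek's actual implementation.

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=6):
    """Route one token to its top-k experts and mix their outputs.

    x:       (d,) token hidden state
    gate_w:  (num_experts, d) router weights
    experts: list of callables, one per expert
    """
    logits = gate_w @ x                        # one router score per expert
    top = np.argsort(logits)[-top_k:]          # indices of the top-k experts
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                       # softmax over selected experts only
    return sum(p * experts[i](x) for p, i in zip(probs, top))

# Toy setup: 64 experts, top-6 routing, as in the DeepSeek V3-style MoE.
rng = np.random.default_rng(0)
d, num_experts = 16, 64
gate_w = rng.normal(size=(num_experts, d))
# Hypothetical experts: plain linear maps stand in for expert FFNs.
expert_ws = [rng.normal(size=(d, d)) for _ in range(num_experts)]
experts = [lambda x, w=w: w @ x for w in expert_ws]

y = moe_forward(rng.normal(size=d), gate_w, experts, top_k=6)
```

Because only 6 of the 64 experts run per token, the active parameter count stays far below the total, which is how a 1.4B-parameter model can use only 0.4B parameters per forward pass.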
## Model Details
### Variants
| Name | Parameters | Training tokens |
|---|---|---|
| OpenSeek-Small-v1-Baseline | 1.4B total / 0.4B active | 100B |
| OpenSeek-Small-v1 | 1.4B total / 0.4B active | 720B |