A 405B-parameter dense Transformer, the largest open-weight model at its release. 128K-token vocabulary, 128K-token context window, trained on 15.6T tokens using 16K H100 GPUs. Smaller 8B and 70B variants were also released.

Llama 3.1 405B was competitive with GPT-4 on many benchmarks, demonstrating that open-weight models had reached frontier quality. AA Intelligence Index: 17. Released under the Llama 3.1 Community License. By the Llama Team.

Model Details

Architecture: Dense
Parameters: 405B
Context window: 128,000 tokens

Variants

Name            Parameters  Notes
Llama 3.1 8B    8B          Smaller variant
Llama 3.1 70B   70B         Smaller variant
Llama 3.1 405B  405B        Flagship; largest open-weight model at its release
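For a rough sense of scale, the dense parameter counts above translate directly into weight-storage requirements. A minimal back-of-envelope sketch (2 bytes per parameter for bf16 is a standard assumption; figures cover weights only, not KV cache or activations):

```python
GIB = 1024**3  # bytes per GiB

def weight_memory_gib(params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight storage in GiB for a dense model
    (bytes_per_param=2 for bf16, 1 for int8)."""
    return params * bytes_per_param / GIB

variants = {"8B": 8e9, "70B": 70e9, "405B": 405e9}
for name, n in variants.items():
    print(f"Llama 3.1 {name}: ~{weight_memory_gib(n):.0f} GiB in bf16")
```

At bf16 the 405B weights alone land in the ~750 GiB range, which is why multi-node serving or aggressive quantization is needed for this variant.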

Paper

arXiv: 2407.21783

Tags: frontier, open-weight
