Gemma 3
First multimodal Gemma release (vision + text). 128K-token context window on the 4B and larger models. Supports 140+ languages. Trained with distillation from larger Gemini models. Gemma-3-4B is competitive with Gemma-2-27B; Gemma-3-27B is comparable to Gemini 1.5 Pro.
Sizes: 270M, 1B, 4B, 12B, 27B. The 27B variant became one of the most popular open models for fine-tuning and deployment. License: Gemma Terms of Use.
Model Details
Architecture: Dense
Parameters: 27B
Context window: 128,000 tokens
Variants
| Name | Parameters | Notes |
|---|---|---|
| Gemma 3 1B | 1B | Text only; 32K context |
| Gemma 3 4B | 4B | Vision + text; 128K context |
| Gemma 3 12B | 12B | Vision + text; 128K context |
| Gemma 3 27B | 27B | Vision + text; 128K context |
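When choosing a variant for deployment, a rough first check is how much memory the weights alone would occupy at a given precision. The sketch below is a back-of-the-envelope estimate, not a figure from the Gemma 3 report: it uses the nominal parameter counts from the table above, and the bytes-per-parameter values and the helper name `weight_memory_gb` are illustrative assumptions. It ignores the KV cache, activations, and runtime overhead, so real requirements will be higher.

```python
# Nominal parameter counts (in billions) from the variants table.
# Assumption: exact counts differ slightly from these rounded labels.
VARIANTS = {
    "Gemma 3 1B": 1.0,
    "Gemma 3 4B": 4.0,
    "Gemma 3 12B": 12.0,
    "Gemma 3 27B": 27.0,
}

# Assumed bytes per parameter for common weight formats (weights only).
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, dtype: str) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * BYTES_PER_PARAM[dtype]
    return total_bytes / 1e9


if __name__ == "__main__":
    # Print one line per variant with an estimate for each format.
    for name, params in VARIANTS.items():
        estimates = ", ".join(
            f"{dtype}: {weight_memory_gb(params, dtype):.1f} GB"
            for dtype in BYTES_PER_PARAM
        )
        print(f"{name}: {estimates}")
```

Under these assumptions the estimate scales linearly with parameter count and precision, so halving the bytes per parameter halves the weight footprint. Long contexts add KV-cache memory on top, which this sketch does not model.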
Paper
arXiv: 2503.19786