o3 | Lab Index

Second-generation reasoning model, advancing o1's test-time compute scaling. 200K token context. Released alongside o4-mini, a smaller reasoning variant. o3-mini launched earlier (January 2025) as a cost-efficient option with selectable reasoning effort (low/medium/high).

o3 achieved 96.7% on AIME 2024 and scored 87.7% on GPQA-Diamond. o3-pro (June 2025) used parallel test-time compute for the highest reasoning accuracy. AA Intelligence Index v4.1: 30 (o3), 19 (o3-mini), 26 (o4-mini), 33 (o3-pro). Proprietary.

Artificial Analysis OpenRouter

Model Details

Parameters (est.) ~ 3.0T

Context window 200,000

AA Intelligence 30

Variants

Name	Parameters	Notes
o3-mini	—	Cost-efficient, January 2025
o3	—	Full model, April 2025
o4-mini	—	Smaller reasoning variant, April 2025
o3-pro	—	Parallel test-time compute, June 2025

frontierreasoning

Model Details

Variants

Related