A series of vision-language models capable of understanding images and complex documents.

Model Details

Variants

Name Parameters Notes
Step-1V
Step-1.5V
Step-2V
multimodal, open-weight

Notes

Step-1V launched in March 2024. Step-1.5V launched at the World Artificial Intelligence Conference (WAIC) in July 2024.