AI Lab Tracker
SEED-Bench
eval
2023-07-30
Tencent
Benchmark for evaluating multimodal large language models (MLLMs) with multiple-choice questions across multiple evaluation dimensions. SEED-Bench-2 expands the benchmark to 24K multiple-choice questions covering 27 evaluation dimensions. Published at CVPR 2024.
Paper v1 (arXiv)
Paper v2 (arXiv)
GitHub
Leaderboard
Dataset
benchmark
multimodal
evaluation