AI Lab Tracker
SEED-Bench
dataset
2023-07-30
Tencent
A benchmark for evaluating multimodal large language models across multiple dimensions. SEED-Bench-2 expanded the benchmark to 24K multiple-choice questions covering 27 evaluation dimensions. Published at CVPR 2024.
Paper v1 (arXiv)
Paper v2 (arXiv)
GitHub
Leaderboard
Dataset
benchmark
multimodal
evaluation