Industry standard for evaluating large models across thousands of dimensions.

Library

GitHub Repository

evaluation, benchmark, framework

Notes

Date approximate.