1 article tagged #benchmarks.
Frameworks and metrics for measuring AI system performance, quality, and safety, from standard benchmarks to domain-specific evaluations.