Run MMLU on any LLM
Choose LLM
API Endpoint
Model Name
Access Token
Use MMLU-Light (faster evaluation)
Run Benchmark
Cancel
Benchmark Results
Public Model Results
Need a better benchmark? Try our Arena.