Run MMLU on any LLM

Choose LLM
Benchmark Results
Public Model Results
