Analysis of Key Metrics from the OpenLLM Leaderboard: AI2, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8k
Benchmarking Open Source Language Models