2024-01-13 16:16:59+01:00
Chat with Open Large Language Models
Two more benchmarks are displayed: MT-Bench and MMLU.
MT-Bench: a set of challenging multi-turn questions. We use GPT-4 to grade the model responses.
MMLU (5-shot): a test that measures a model's multitask accuracy across 57 tasks.
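As a rough illustration of what "multitask accuracy" can mean, the sketch below computes accuracy per task and then averages across tasks. It assumes an unweighted (macro) average, which may not match how the leaderboard actually aggregates MMLU scores. The task names, the records, and the helper names `per_task_accuracy` and `macro_average` are hypothetical.

```python
from collections import defaultdict


def per_task_accuracy(records):
    """Return {task: accuracy} from (task, predicted, gold) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for task, predicted, gold in records:
        total[task] += 1
        correct[task] += int(predicted == gold)
    return {task: correct[task] / total[task] for task in total}


def macro_average(acc_by_task):
    """Unweighted mean of per-task accuracies (an assumed aggregation)."""
    return sum(acc_by_task.values()) / len(acc_by_task)


# Hypothetical toy results: two tasks, two multiple-choice questions each.
records = [
    ("astronomy", "A", "A"),
    ("astronomy", "B", "C"),
    ("virology", "D", "D"),
    ("virology", "A", "A"),
]

acc = per_task_accuracy(records)
overall = macro_average(acc)
print(acc, overall)
```

With a macro average, every task counts equally regardless of how many questions it contains; a question-weighted average would instead pool all questions before dividing.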