Anthropic: Claude 3.5 Sonnet sets new industry benchmarks for graduate-level reasoning (GPQA), undergraduate-level knowledge (MMLU), and coding proficiency (HumanEval).
It shows marked improvement in grasping nuance, humor, and complex instructions, all while writing with a natural tone.
Thu Jun 20 2024 22:03:07 GMT+0800 (China Standard Time)