Model: "claude-4.1-opus"

    xAI Grok 4.1: #1 in Text Arena, #1 in EQ-bench, and better Creative Writing
    GDPVal finding: Claude Opus 4.1 within 95% of AGI (human experts in top 44 white collar jobs)
    OpenAI rolls out GPT-5 and GPT-5 Thinking to >1B users worldwide; -mini and -nano help claim Pareto Frontier
    OpenAI's gpt-oss 20B and 120B, Claude Opus 4.1, DeepMind Genie 3