Topic: "hallucination"

    Gemini 3.1 Pro: 2x 3.0 on ARC-AGI 2
    xAI Grok 4.1: #1 in Text Arena, #1 in EQ-bench, and better Creative Writing
    $100k to predict LMSYS human preferences in a Kaggle contest