Topic: "model-evaluation"

    not much happened today
    not much happened today
    not much happened today
    not much happened today
    Grok 3 & 3-mini now API Available
    lots of little things happened this week
    OpenAI launches Operator, its first Agent
    not much happened today
    not much happened today
    not much happened today
    AIPhone 16: the Visual Intelligence Phone
    Reflection 70B, by Matt from IT Department
    not much happened today
    Problems with MMLU-Pro
    Qdrant's BM42: "Please don't trust us"
    The Last Hurrah of Stable Diffusion?
    Contextual Position Encoding (CoPE)
    Life after DPO (RewardBench)
    Ten Commandments for Deploying Fine-Tuned Models
    Zero to GPT in 1 Year
    Claude 3 is officially America's Next Top Model
    Claude 3 just destroyed GPT 4 (see for yourself)
    12/10/2023: not much happened today
    12/8/2023 - Mamba v Mistral v Hyena
    Is Google's Gemini... legit?