Topic: "scaling"

    OpenAI o3, o4-mini, and Codex CLI
    QwQ-32B claims to match DeepSeek R1-671B
    The Ultra-Scale Playbook: Training LLMs on GPU Clusters
    not much happened today
    Shazeer et al (2024): you are overpaying for inference >13x
    Contextual Position Encoding (CoPE)
    Anthropic's "LLM Genome Project": learning & clamping 34m features on Claude Sonnet
    OpenAI's PR Campaign?
    Llama-3-70b is GPT-4-level Open Model