Topic: "model-distillation"

    The new OpenAI Agents Platform
    not much happened today
    not much happened today
    Bespoke-Stratos + Sky-T1: The Vicuna+Alpaca moment for reasoning
    Project Stargate: $500b datacenter (1.7% of US GDP) and Gemini 2 Flash Thinking 2
    DeepSeek R1: o1-level open weights model and a simple recipe for upgrading 1.5B models to Sonnet/4o level
    Moondream 2025.1.9: Structured Text, Enhanced OCR, Gaze Detection in a 2B Model
    DeepSeek v3: 671B finegrained MoE trained for $5.5m USD of compute on 15T tokens
    s{imple|table|calable} Consistency Models
    OpenAI Realtime API and other Dev Day Goodies
    not much happened today
    Rombach et al: FLUX.1 [pro|dev|schnell], $31m seed for Black Forest Labs
    Skyfall