Topic: "model-architecture"

    not much happened today
    Gemini 2.5 Pro (06-05) launched at AI Engineer World's Fair
    AI Engineer World's Fair Talks Day 1
    Qwen 3: 0.6B to 235B MoE full+base models that beat R1 and o1
    small news items
    not much happened today
    Olympus has dropped (aka, Amazon Nova Micro|Lite|Pro|Premier|Canvas|Reel)
    Tencent's Hunyuan-Large claims to beat DeepSeek-V2 and Llama3-405B with LESS Data
    OpenAI beats Anthropic to releasing Speculative Decoding
    Not much technical happened today
    Pixtral 12B: Mistral beats Llama to Multimodality
    $1150m for SSI, Sakana, You.com + Claude 500m context
    Test-Time Training, MobileLLM, Lilian Weng on Hallucination (Plus: Turbopuffer)
    The Last Hurrah of Stable Diffusion?
    HippoRAG: First, do know(ledge) Graph
    OpenAI's PR Campaign?
    Evals: The Next Generation
    Music's Dall-E moment
    Not much happened today
    Jamba: Mixture of Architectures dethrones Mixtral
    DBRX: Best open model (just not most efficient)
    Grok-1 in Bio
    DeepMind SIMA: one AI, 9 games, 600 tasks, vision+language ONLY
    1/4/2024: Jeff Bezos backs Perplexity's $520m Series B.