Company: "deepseek-ai"

    DeepSeek-R1-0528 - Gemini 2.5 Pro-level model, SOTA Open Weights release
    not much happened today
    QwQ-32B claims to match DeepSeek R1-671B
    lots of small launches
    not much happened today
    OpenAI launches Operator, its first Agent
    Project Stargate: $500b datacenter (1.7% of US GDP) and Gemini 2 Flash Thinking 2
    DeepSeek v3: 671B finegrained MoE trained for $5.5m USD of compute on 15T tokens
    Meta BLT: Tokenizer-free, Byte-level LLM
    ChatGPT Canvas GA
    not much happened today
    not much happened today
    Pixtral 12B: Mistral beats Llama to Multimodality
    Too Cheap To Meter: AI prices cut 50-70% in last 30 days
    Mini, Nemo, Turbo, Lite - Smol models go brrr (GPT4o-mini version)
    Mini, Nemo, Turbo, Lite - Smol models go brrr (GPT4o version)
    Gemma 2 tops /r/LocalLlama vibe check
    DeepSeek-V2 beats Mixtral 8x22B with >160 experts at HALF the cost
    1/11/2024: Mixing Experts vs Merging Models
    1/10/2024: All the best papers for AI Engineers