Company: "deepseek"

    not much happened today
    Mary Meeker is so back: BOND Capital AI Trends report
    not much happened today
    ChatGPT Codex, OpenAI's first cloud SWE agent
    not much happened today
    not much happened today
    Cursor @ $9b, OpenAI Buys Windsurf @ $3b
    not much happened today
    not much happened today
    Qwen 3: 0.6B to 235B MoE full+base models that beat R1 and o1
    not much happened today
    Google's Agent2Agent Protocol (A2A)
    not much happened today
    not much happened today
    not much happened today
    >$41B raised today (OpenAI @ 300b, Cursor @ 9.5b, Etched @ 1.5b)
    not much happened today
    Halfmoon is Reve Image: a new SOTA Image Model from ex-Adobe/Stability trio
    not much happened today
    The new OpenAI Agents Platform
    not much happened today
    DeepSeek's Open Source Stack
    Anthropic's $61.5B Series E
    not much happened today
    The Ultra-Scale Playbook: Training LLMs on GPU Clusters
    not much happened today
    not much happened today
    s1: Simple test-time scaling (and Kyutai Hibiki)
    How To Scale Your Model, by DeepMind
    o3-mini launches, OpenAI on "wrong side of history"
    Mistral Small 3 24B and Tulu 3 405B
    not much happened today
    not much happened today
    DeepSeek #1 on US App Store, Nvidia stock tanks -17%
    TinyZero: Reproduce DeepSeek R1-Zero for $30
    Bespoke-Stratos + Sky-T1: The Vicuna+Alpaca moment for reasoning
    DeepSeek R1: o1-level open weights model and a simple recipe for upgrading 1.5B models to Sonnet/4o level
    not much happened today
    not much happened today
    PRIME: Process Reinforcement through Implicit Rewards
    not much happened to end the year
    not much happened today
    not much happened today
    Qwen with Questions: 32B open weights reasoning model nears o1 in GPQA/AIME/Math500
    LMSys killed Model Versioning (gpt 4o 1120, gemini exp 1121)
    DeepSeek-R1 claims to beat o1-preview AND will be open sourced
    Common Corpus: 2T Open Tokens with Provenance
    DeepSeek Janus and Meta SpiRit-LM: Decoupled Image and Expressive Voice Omnimodality
    o1 destroys Lmsys Arena, Qwen 2.5, Kyutai Moshi release
    DataComp-LM: the best open-data 7B model/benchmark/dataset
    FlashAttention 3, PaliGemma, OpenAI's 5 Levels to Superintelligence
    Mozilla's AI Second Act
    Gemini Nano: 50-90% of Gemini Pro, <100ms inference, on device, in Chrome Canary
    There's Ilya!
    Gemini launches context caching... or does it?
    Snowflake Arctic: Fully Open 10B+128x4B Dense-MoE Hybrid LLM
    OpenAI's Instruction Hierarchy for the LLM OS
    Ring Attention for >1M Context
    Qwen 1.5 Released
    Adept Fuyu-Heavy: Multimodal model for Agents
    12/25/2023: Nous Hermes 2 Yi 34B for Christmas
    12/15/2023: Mixtral-Instruct beats Gemini Pro (and matches GPT3.5)