Topic: "long-context"

    not much happened today
    ChatGPT Agent: new o* model + unified Deep Research browser + Operator computer use + Code Interpreter terminal
    Voxtral - Mistral's SOTA ASR model in 3B (mini) and 24B ("small") sizes beats OpenAI Whisper large-v3
    Kimi K2 - SOTA Open MoE proves that Muon can scale to 15T tokens/1T params
    Grok 4: xAI succeeds in going from 0 to new SOTA LLM in 2 years
    not much happened today
    Zuck goes Superintelligence Founder Mode: $100M bonuses + $100M+ salaries + NFDG Buyout?
    Gemini 2.5 Pro/Flash GA, 2.5 Flash-Lite in Preview
    not much happened today
    not much happened today
    Qwen 3: 0.6B to 235B MoE full+base models that beat R1 and o1
    gpt-image-1 - ChatGPT's imagegen model, confusingly NOT 4o, now available in API
    not much happened today
    GPT 4.1: The New OpenAI Workhorse
    not much happened today
    LLaDA: Large Language Diffusion Models
    Project Stargate: $500b datacenter (1.7% of US GDP) and Gemini 2 Flash Thinking 2
    Titans: Learning to Memorize at Test Time
    ModernBert: small new Retriever/Classifier workhorse, 8k context, 2T tokens
    not much happened today
    Not much (in AI) happened this weekend
    not much happened today
    a calm before the storm
    not much happened today
    Everybody shipped small things this holiday weekend
    not much happened today
    Summer of Code AI: $1.6b raised, 1 usable product
    CogVideoX: Zhipu's Open Source Sora
    not much happened this weekend
    Nvidia Minitron: LLM Pruning and Distillation updated for Llama 3.1
    super quiet day
    Gemini Live
    Llama 3.1: The Synthetic Data Model
    Mini, Nemo, Turbo, Lite - Smol models go brrr (GPT4o-mini version)
    Not much happened today.
    Nemotron-4-340B: NVIDIA's new large open models, built on syndata, great for syndata
    5 small news items
    1 TRILLION token context, real time, on device?
    Skyfall
    Not much happened today
    Evals: The Next Generation
    Mergestral, Meta MTIAv2, Cohere Rerank 3, Google Infini-Attention
    Claude 3 is officially America's Next Top Model
    Claude 3 just destroyed GPT 4 (see for yourself)
    Ring Attention for >1M Context
    Google AI: Win some (Gemma, 1.5 Pro), Lose some (Image gen)
    Sora pushes SOTA
    1/8/2024: The Four Wars of the AI Stack
    12/8/2023 - Mamba v Mistral v Hyena