Topic: "long-context"

    not much happened today
    not much happened today
    Qwen 3: 0.6B to 235B MoE full+base models that beat R1 and o1
    gpt-image-1 - ChatGPT's imagegen model, confusingly NOT 4o, now available in API
    not much happened today
    GPT 4.1: The New OpenAI Workhorse
    not much happened today
    LLaDA: Large Language Diffusion Models
    Project Stargate: $500b datacenter (1.7% of US GDP) and Gemini 2 Flash Thinking 2
    Titans: Learning to Memorize at Test Time
    ModernBert: small new Retriever/Classifier workhorse, 8k context, 2T tokens,
    not much happened today
    Not much (in AI) happened this weekend
    not much happened today
    a calm before the storm
    not much happened today
    Everybody shipped small things this holiday weekend
    not much happened today
    Summer of Code AI: $1.6b raised, 1 usable product
    CogVideoX: Zhipu's Open Source Sora
    not much happened this weekend
    Nvidia Minitron: LLM Pruning and Distillation updated for Llama 3.1
    super quiet day
    Gemini Live
    Llama 3.1: The Synthetic Data Model
    Mini, Nemo, Turbo, Lite - Smol models go brrr (GPT4o-mini version)
    Not much happened today.
    Nemotron-4-340B: NVIDIA's new large open models, built on syndata, great for syndata
    5 small news items
    1 TRILLION token context, real time, on device?
    Skyfall
    Not much happened today
    Evals: The Next Generation
    Mergestral, Meta MTIAv2, Cohere Rerank 3, Google Infini-Attention
    Claude 3 is officially America's Next Top Model
    Claude 3 just destroyed GPT 4 (see for yourself)
    Ring Attention for >1M Context
    Google AI: Win some (Gemma, 1.5 Pro), Lose some (Image gen)
    Sora pushes SOTA
    1/8/2024: The Four Wars of the AI Stack
    12/8/2023 - Mamba v Mistral v Hyena