Topic: "attention-mechanisms"

    not much happened today
    Chinese Models Launch - MiniMax-M1, Hailuo 2 "Kangaroo", Moonshot Kimi-Dev-72B
    Mary Meeker is so back: BOND Capital AI Trends report
    not much happened today
    Llama 4's Controversial Weekend Release
    not much happened today
    not much happened today
    DeepSeek v3: 671B finegrained MoE trained for $5.5m USD of compute on 15T tokens
    not much happened today
    Too Cheap To Meter: AI prices cut 50-70% in last 30 days
    FlashAttention 3, PaliGemma, OpenAI's 5 Levels to Superintelligence
    GraphRAG: The Marriage of Knowledge Graphs and RAG
    Gemma 2: The Open Model for Everyone
    Shazeer et al (2024): you are overpaying for inference >13x
    12/23/2023: NeurIPS Best Papers of 2023
    12/8/2023 - Mamba v Mistral v Hyena