Company: "hugging-face"

    not much happened today
    not much happened today
    Mary Meeker is so back: BOND Capital AI Trends report
    Gemini 2.5 Pro Preview 05-06 (I/O edition) - the SOTA vision+coding model
    LlamaCon: Meta AI gets into the Llama API platform business
    Cognition's DeepWiki, a free encyclopedia of all GitHub repos
    gpt-image-1 - ChatGPT's imagegen model, confusingly NOT 4o, now available in API
    not much happened today
    Google's Agent2Agent Protocol (A2A)
    not much happened today
    Every 7 Months: The Moore's Law for Agent Autonomy
    Cohere's Command A claims #3 open model spot (after DeepSeek and Gemma)
    not much happened today
    not much happened today
    The new OpenAI Agents Platform
    not much happened today
    DeepSeek's Open Source Stack
    not much happened today
    not much happened today
    not much happened today
    s1: Simple test-time scaling (and Kyutai Hibiki)
    Gemini 2.0 Flash GA, with new Flash Lite, 2.0 Pro, and Flash Thinking
    How To Scale Your Model, by DeepMind
    not much happened today
    TinyZero: Reproduce DeepSeek R1-Zero for $30
    not much happened today
    DeepSeek v3: 671B finegrained MoE trained for $5.5m USD of compute on 15T tokens
    not much happened this weekend
    ModernBert: small new Retriever/Classifier workhorse, 8k context, 2T tokens,
    Genesis: Generative Physics Engine for Robotics (o1-mini version)
    Meta Apollo - Video Understanding up to 1 hour, SOTA Open Weights
    OpenAI Sora Turbo and Sora.com
    Meta Llama 3.3: 405B/Nova Pro performance at 70B price
    Qwen with Questions: 32B open weights reasoning model nears o1 in GPQA/AIME/Math500
    Perplexity starts Shopping for you
    BitNet was a lie?
    not much happened this weekend
    DeepSeek Janus and Meta SpiRit-LM: Decoupled Image and Expressive Voice Omnimodality
    Did Nvidia's Nemotron 70B train on test?
    not much happened today
    not much happened today
    not much happened today
    Pixtral 12B: Mistral beats Llama to Multimodality
    not much happened today + AINews Podcast?
    not much happened today
    CogVideoX: Zhipu's Open Source Sora
    Nvidia Minitron: LLM Pruning and Distillation updated for Llama 3.1
    super quiet day
    Ideogram 2 + Berkeley Function Calling Leaderboard V2
    not much happened today
    Gemini Live
    GPT4o August + 100% Structured Outputs for All (GPT4o mini edition)
    not much happened today
    DataComp-LM: the best open-data 7B model/benchmark/dataset
    Mini, Nemo, Turbo, Lite - Smol models go brrr (GPT4o-mini version)
    SciCode: HumanEval gets a STEM PhD upgrade
    Microsoft AgentInstruct + Orca 3
    FlashAttention 3, PaliGemma, OpenAI's 5 Levels to Superintelligence
    Qdrant's BM42: "Please don't trust us"
    GraphRAG: The Marriage of Knowledge Graphs and RAG
    Gemini launches context caching... or does it?
    Nemotron-4-340B: NVIDIA's new large open models, built on syndata, great for syndata
    5 small news items
    Mamba-2: State Space Duality
    Life after DPO (RewardBench)
    ALL of AI Engineering in One Place
    Skyfall
    GPT-4o: the new SOTA-EVERYTHING Frontier model (GPT4T version)
    Perplexity, the newest AI unicorn
    Meta Llama 3 (8B, 70B)
    Mixtral 8x22B Instruct sparks efficiency memes
    Zero to GPT in 1 Year
    Mergestral, Meta MTIAv2, Cohere Rerank 3, Google Infini-Attention
    Gemini Pro and GPT4T Vision go GA on the same day by complete coincidence
    ReALM: Reference Resolution As Language Modeling
    Not much happened today
    AdamW -> AaronD?
    Jamba: Mixture of Architectures dethrones Mixtral
    DBRX: Best open model (just not most efficient)
    World_sim.exe
    MM1: Apple's first Large Multimodal Model
    FSDP+QLoRA: the Answer to 70b-scale AI for desktop class GPUs
    Not much happened today
    The Era of 1-bit LLMs
    Dia de las Secuelas (StarCoder, The Stack, Dune, SemiAnalysis)
    Mistral Large disappoints
    One Year of Latent Space
    Google AI: Win some (Gemma, 1.5 Pro), Lose some (Image gen)
    The Dissection of Smaug (72B)
    Gemini Ultra is out, to mixed reviews
    Qwen 1.5 Released
    Less Lazy AI
    The Core Skills of AI Engineering
    Trust in GPTs at all time low
    Miqu confirmed to be an early Mistral-medium checkpoint
    CodeLlama 70B beats GPT4 on HumanEval
    RWKV "Eagle" v5: Your move, Mamba
    GPT4Turbo A/B Test: gpt-4-0125-preview
    Adept Fuyu-Heavy: Multimodal model for Agents
    RIP Latent Diffusion, Hello Hourglass Diffusion
    Nightshade poisons AI art... kinda?
    Sama says: GPT-5 soon
    1/17/2024: Help crowdsource function calling datasets
    1/16/2024: ArtificialAnalysis - a new model/host benchmark site
    1/16/2024: TIES-Merging
    1/12/2024: Anthropic coins Sleeper Agents
    1/11/2024: Mixing Experts vs Merging Models
    1/8/2024: The Four Wars of the AI Stack
    1/4/2024: Jeff Bezos backs Perplexity's $520m Series B.
    1/3/2024: RIP Coqui
    12/31/2023: Happy New Year
    12/30/2023: Mega List of all LLMs
    12/29/2023: TinyLlama on the way
    12/23/2023: NeurIPS Best Papers of 2023
    12/19/2023: Everybody Loves OpenRouter
    12/10/2023: not much happened today
    12/9/2023: The Mixtral Rush
    12/8/2023: Mamba v Mistral v Hyena