Company: "meta-ai-fair"

    Execuhires Round 2: Scale-Meta, Lamini-AMD, and Instacart-OpenAI
    AI Engineer World's Fair Talks Day 1
    not much happened today
    DeepSeek-R1-0528 - Gemini 2.5 Pro-level model, SOTA Open Weights release
    Mistral's Agents API and the 2025 LLM OS
    not much happened today
    Granola launches team notes, while Notion launches meeting transcription
    not much happened today
    Prime Intellect's INTELLECT-2 and PRIME-RL advance distributed reinforcement learning
    not much happened today
    ChatGPT responds to GlazeGate + LMArena responds to Cohere
    LlamaCon: Meta AI gets into the Llama API platform business
    Cognition's DeepWiki, a free encyclopedia of all GitHub repos
    Google's Agent2Agent Protocol (A2A)
    DeepCoder: A Fully Open-Source 14B Coder at O3-mini Level
    not much happened today
    lots of little things happened this week
    Every 7 Months: The Moore's Law for Agent Autonomy
    not much happened today
    not much happened today
    not much happened today
    TinyZero: Reproduce DeepSeek R1-Zero for $30
    not much happened today
    not much happened today
    Titans: Learning to Memorize at Test Time
    not much happened this weekend
    ModernBert: small new Retriever/Classifier workhorse, 8k context, 2T tokens,
    Genesis: Generative Physics Engine for Robotics (o1-mini version)
    OpenAI Voice Mode Can See Now - After Gemini Does
    Meta Apollo - Video Understanding up to 1 hour, SOTA Open Weights
    Meta BLT: Tokenizer-free, Byte-level LLM
    ChatGPT Canvas GA
    Meta Llama 3.3: 405B/Nova Pro performance at 70B price
    Stripe lets Agents spend money with StripeAgentToolkit
    Gemini (Experimental-1114) retakes #1 LLM rank with 1344 Elo
    not much happened today
    Not much happened today
    Tencent's Hunyuan-Large claims to beat DeepSeek-V2 and Llama3-405B with LESS Data
    OpenAI beats Anthropic to releasing Speculative Decoding
    not much happened today
    not much happened today
    DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing
    DeepSeek Janus and Meta SpiRit-LM: Decoupled Image and Expressive Voice Omnimodality
    not much happened today
    Not much (in AI) happened this weekend
    not much happened today
    State of AI 2024
    not much happened today
    not much happened this weekend
    Contextual Document Embeddings: `cde-small-v1`
    Not much technical happened today
    Liquid Foundation Models: A New Transformers alternative + AINews Pod 2
    not much happened today
    not much happened today
    Llama 3.2: On-device 1B/3B, and Multimodal 11B/90B (with AI2 Molmo kicker)
    not much happened today
    not much happened today
    Pixtral 12B: Mistral beats Llama to Multimodality
    not much happened today
    CogVideoX: Zhipu's Open Source Sora
    not much happened this weekend
    Nvidia Minitron: LLM Pruning and Distillation updated for Llama 3.1
    Ideogram 2 + Berkeley Function Calling Leaderboard V2
    not much happened today
    not much happened today
    not much happened today
    GPT4o August + 100% Structured Outputs for All (GPT4o August edition)
    How Carlini Uses AI
    Execuhires: Tempting The Wrath of Khan
    Gemma 2 2B + Scope + Shield
    not much happened today
    Apple Intelligence Beta + Segment Anything Model 2
    AlphaProof + AlphaGeometry2 reach 1 point short of IMO Gold
    Mistral Large 2 + RIP Mistral 7B, 8x7B, 8x22B
    Llama 3.1: The Synthetic Data Model
    Llama 3.1 Leaks: big bumps to 8B, minor bumps to 70b, and SOTA OSS 405b model
    Mini, Nemo, Turbo, Lite - Smol models go brrr (GPT4o-mini version)
    We Solved Hallucinations
    Nothing much happened today
    Test-Time Training, MobileLLM, Lilian Weng on Hallucination (Plus: Turbopuffer)
    Problems with MMLU-Pro
    That GPT-4o Demo
    Gemini launches context caching... or does it?
    Qwen 2 beats Llama 3 (and we don't know how)
    Contextual Position Encoding (CoPE)
    Somebody give Andrej some H100s already
    Life after DPO (RewardBench)
    Clémentine Fourrier on LLM evals
    Chameleon: Meta's (unreleased) GPT4o-like Omnimodal Model
    Evals: The Next Generation
    Apple's OpenELM beats OLMo with 50% of its dataset, using DeLighT
    Perplexity, the newest AI unicorn
    FineWeb: 15T Tokens, 12 years of CommonCrawl (deduped and filtered, you're welcome)
    Llama-3-70b is GPT-4-level Open Model
    Meta Llama 3 (8B, 70B)
    Mergestral, Meta MTIAv2, Cohere Rerank 3, Google Infini-Attention
    Gemini Pro and GPT4T Vision go GA on the same day by complete coincidence
    Cohere Command R+, Anthropic Claude Tool Use, OpenAI Finetuning
    DeepMind SIMA: one AI, 9 games, 600 tasks, vision+language ONLY
    FSDP+QLoRA: the Answer to 70b-scale AI for desktop class GPUs
    Qwen 1.5 Released
    CodeLLama 70B beats GPT4 on HumanEval
    RIP Latent Diffusion, Hello Hourglass Diffusion
    1/2/2024: Smol tweaks to Smol Talk
    12/26/2023: not much happened today