Model: "llama-3"

    Llama 4's Controversial Weekend Release
    not much happened today
    Mistral Small 3 24B and Tulu 3 405B
    not much happened today
    DeepSeek v3: 671B finegrained MoE trained for $5.5m USD of compute on 15T tokens
    OpenAI Voice Mode Can See Now - After Gemini Does
    Meta BLT: Tokenizer-free, Byte-level LLM
    not much happened today
    BitNet was a lie?
    Not much happened today
    not much happened this weekend
    Contextual Document Embeddings: `cde-small-v1`
    Not much technical happened today
    not much happened today
    not much happened today
    ChatGPT Advanced Voice Mode
    not much happened today
    not much happened today + AINews Podcast?
    Reflection 70B, by Matt from IT Department
    Ideogram 2 + Berkeley Function Calling Leaderboard V2
    not much happened today
    GPT4o August + 100% Structured Outputs for All (GPT4o mini edition)
    GPT4o August + 100% Structured Outputs for All (GPT4o August edition)
    Apple Intelligence Beta + Segment Anything Model 2
    Llama 3.1: The Synthetic Data Model
    DataComp-LM: the best open-data 7B model/benchmark/dataset
    Mini, Nemo, Turbo, Lite - Smol models go brrr (GPT4o version)
    Gemma 2 tops /r/LocalLlama vibe check
    SciCode: HumanEval gets a STEM PhD upgrade
    GraphRAG: The Marriage of Knowledge Graphs and RAG
    Mozilla's AI Second Act
    Nemotron-4-340B: NVIDIA's new large open models, built on syndata, great for syndata
    The Last Hurrah of Stable Diffusion?
    Qwen 2 beats Llama 3 (and we don't know how)
    5 small news items
    Life after DPO (RewardBench)
    GPT-4o: the new SOTA-EVERYTHING Frontier model (GPT4T version)
    Quis promptum ipso promptiet?
    LMSys advances Llama 3 eval analysis
    $100k to predict LMSYS human preferences in a Kaggle contest
    Evals: The Next Generation
    Not much happened today
    A quiet weekend
    Apple's OpenELM beats OLMo with 50% of its dataset, using DeLighT
    Snowflake Arctic: Fully Open 10B+128x4B Dense-MoE Hybrid LLM
    OpenAI's Instruction Hierarchy for the LLM OS
    Perplexity, the newest AI unicorn
    FineWeb: 15T Tokens, 12 years of CommonCrawl (deduped and filtered, you're welcome)
    Llama-3-70b is GPT-4-level Open Model
    Lilian Weng on Video Diffusion
    Mergestral, Meta MTIAv2, Cohere Rerank 3, Google Infini-Attention
    Gemini Pro and GPT4T Vision go GA on the same day by complete coincidence
    Cohere Command R+, Anthropic Claude Tool Use, OpenAI Finetuning
    AdamW -> AaronD?
    DeepMind SIMA: one AI, 9 games, 600 tasks, vision+language ONLY
    Google AI: Win some (Gemma, 1.5 Pro), Lose some (Image gen)
    Trust in GPTs at all time low
    1/6-7/2024: LLaMA Pro - an alternative to PEFT/RAG??
    12/11/2023: Mixtral beats GPT3.5 and Llama2-70B