Company: "hugging-face"

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

xAI raises $20B Series E at ~$230B valuation

Claude Skills grows: Open Standard, Directory, Org Admin

not much happened today

Nano Banana Pro (Gemini Image Pro) solves text-in-images, infographic generation, 2-4k resolution, and Google Search grounding

Terminal-Bench 2.0 and Harbor

not much happened today

not much happened today

Qwen3-Next-80B-A3B-Base: Towards Ultimate Training & Inference Efficiency

not much happened today

not much happened today

Databricks' $100B Series K

Western Open Models get Funding: Cohere $500m @ 6.8B, AI2 gets $152m NSF+NVIDIA grants

not much happened today

not much happened today

Kimi K2 - SOTA Open MoE proves that Muon can scale to 15T tokens/1T params

not much happened today

not much happened today

not much happened today

The Quiet Rise of Claude Code vs Codex

not much happened today

not much happened today

Mary Meeker is so back: BOND Capital AI Trends report

Gemini 2.5 Pro Preview 05-06 (I/O edition) - the SOTA vision+coding model

LlamaCon: Meta AI gets into the Llama API platform business

Cognition's DeepWiki, a free encyclopedia of all GitHub repos

gpt-image-1 - ChatGPT's imagegen model, confusingly NOT 4o, now available in API

not much happened today

Google's Agent2Agent Protocol (A2A)

not much happened today

Every 7 Months: The Moore's Law for Agent Autonomy

Cohere's Command A claims #3 open model spot (after DeepSeek and Gemma)

not much happened today

not much happened today

The new OpenAI Agents Platform

not much happened today

DeepSeek's Open Source Stack

not much happened today

not much happened today

not much happened today

s1: Simple test-time scaling (and Kyutai Hibiki)

Gemini 2.0 Flash GA, with new Flash Lite, 2.0 Pro, and Flash Thinking

How To Scale Your Model, by DeepMind

not much happened today

TinyZero: Reproduce DeepSeek R1-Zero for $30

not much happened today

DeepSeek v3: 671B finegrained MoE trained for $5.5m USD of compute on 15T tokens

not much happened this weekend

ModernBert: small new Retriever/Classifier workhorse, 8k context, 2T tokens,

Genesis: Generative Physics Engine for Robotics (o1-mini version)

Meta Apollo - Video Understanding up to 1 hour, SOTA Open Weights

OpenAI Sora Turbo and Sora.com

Meta Llama 3.3: 405B/Nova Pro performance at 70B price

Qwen with Questions: 32B open weights reasoning model nears o1 in GPQA/AIME/Math500

Perplexity starts Shopping for you

BitNet was a lie?

not much happened this weekend

DeepSeek Janus and Meta SpiRit-LM: Decoupled Image and Expressive Voice Omnimodality

Did Nvidia's Nemotron 70B train on test?

not much happened today

not much happened today

not much happened today

Pixtral 12B: Mistral beats Llama to Multimodality

not much happened today + AINews Podcast?

not much happened today

CogVideoX: Zhipu's Open Source Sora

Nvidia Minitron: LLM Pruning and Distillation updated for Llama 3.1

super quiet day

Ideogram 2 + Berkeley Function Calling Leaderboard V2

not much happened today

GPT4o August + 100% Structured Outputs for All (GPT4o mini edition)

not much happened today

DataComp-LM: the best open-data 7B model/benchmark/dataset

Mini, Nemo, Turbo, Lite - Smol models go brrr (GPT4o-mini version)

SciCode: HumanEval gets a STEM PhD upgrade

Microsoft AgentInstruct + Orca 3

FlashAttention 3, PaliGemma, OpenAI's 5 Levels to Superintelligence

Qdrant's BM42: "Please don't trust us"

GraphRAG: The Marriage of Knowledge Graphs and RAG

Gemini launches context caching... or does it?

Nemotron-4-340B: NVIDIA's new large open models, built on syndata, great for syndata

5 small news items

Mamba-2: State Space Duality

Life after DPO (RewardBench)

ALL of AI Engineering in One Place

GPT-4o: the new SOTA-EVERYTHING Frontier model (GPT4T version)

Perplexity, the newest AI unicorn

Meta Llama 3 (8B, 70B)

Mixtral 8x22B Instruct sparks efficiency memes

Zero to GPT in 1 Year

Mergestral, Meta MTIAv2, Cohere Rerank 3, Google Infini-Attention

Gemini Pro and GPT4T Vision go GA on the same day by complete coincidence

ReALM: Reference Resolution As Language Modeling

Not much happened today

AdamW -> AaronD?

Jamba: Mixture of Architectures dethrones Mixtral

DBRX: Best open model (just not most efficient)

MM1: Apple's first Large Multimodal Model

FSDP+QLoRA: the Answer to 70b-scale AI for desktop class GPUs

Not much happened today

The Era of 1-bit LLMs

Dia de las Secuelas (StarCoder, The Stack, Dune, SemiAnalysis)

Mistral Large disappoints

One Year of Latent Space

Google AI: Win some (Gemma, 1.5 Pro), Lose some (Image gen)

The Dissection of Smaug (72B)

Gemini Ultra is out, to mixed reviews

Qwen 1.5 Released

The Core Skills of AI Engineering

Trust in GPTs at all time low

Miqu confirmed to be an early Mistral-medium checkpoint

CodeLLama 70B beats GPT4 on HumanEval

codellama miqu mistral-medium llama-2-70b aphrodite-engine mixtral flatdolphinmaid noromaid rpcal chatml mistral-7b activation-beacon eagle-7b rwkv-v5 openhermes2.5 nous-hermes-2-mixtral-8x7b-dpo imp-v1-3b bakllava moondream qwen-vl meta-ai-fair ollama nous-research mistral-ai hugging-face ai-ethics alignment gpu-optimization direct-prompt-optimization fine-tuning cuda-programming optimizer-technology quantization multimodality context-length dense-retrieval retrieval-augmented-generation multilinguality model-performance open-source code-generation classification vision

Meta AI surprised the community with the release of CodeLlama, an open-source model now available on platforms like Ollama and MLX for local use. The Miqu model sparked debate over its origins, possibly linked to Mistral Medium or a fine-tuned Llama-2-70b, alongside discussions on AI ethics and alignment risks. The Aphrodite engine showed strong performance on A6000 GPUs with specific configurations. Role-playing AI models such as Mixtral and Flatdolphinmaid faced challenges with repetitiveness, while Noromaid and Rpcal performed better, with ChatML and DPO recommended for improved responses. Learning resources like fast.ai's course were highlighted for ML/DL beginners, and fine-tuning techniques with optimizers like Paged 8bit lion and adafactor were discussed. At Nous Research AI, the Activation Beacon project introduced a method for unlimited context length in LLMs using "global state" tokens, potentially transforming retrieval-augmented models. The Eagle-7B model, based on RWKV-v5, outperformed Mistral in benchmarks with efficiency and multilingual capabilities. OpenHermes2.5 was recommended for consumer hardware due to its quantization methods. Multimodal and domain-specific models like IMP v1-3b, Bakllava, Moondream, and Qwen-vl were explored for classification and vision-language tasks. The community emphasized centralizing AI resources for collaborative research.

RWKV "Eagle" v5: Your move, Mamba

GPT4Turbo A/B Test: gpt-4-0125-preview

Adept Fuyu-Heavy: Multimodal model for Agents

RIP Latent Diffusion, Hello Hourglass Diffusion

Nightshade poisons AI art... kinda?

Sama says: GPT-5 soon

1/17/2024: Help crowdsource function calling datasets

1/16/2024: ArtificialAnalysis - a new model/host benchmark site

1/16/2024: TIES-Merging

1/12/2024: Anthropic coins Sleeper Agents

1/11/2024: Mixing Experts vs Merging Models

1/8/2024: The Four Wars of the AI Stack

1/4/2024: Jeff Bezos backs Perplexity's $520m Series B.

1/3/2024: RIP Coqui

12/31/2023: Happy New Year

12/30/2023: Mega List of all LLMs

12/29/2023: TinyLlama on the way

12/23/2023: NeurIPS Best Papers of 2023

12/19/2023: Everybody Loves OpenRouter

12/10/2023: not much happened today

12/9/2023: The Mixtral Rush

12/8/2023 - Mamba v Mistral v Hyena