All tags
Model: "gpt-4"
not much happened today
gpt-4.5 gpt-4 gpt-4o o1 claude-3.5-sonnet claude-3.7 claude-3-opus deepseek-v3 grok-3 openai anthropic perplexity-ai deepseek scaling01 model-performance humor emotional-intelligence model-comparison pricing context-windows model-size user-experience andrej-karpathy jeremyphoward abacaj stevenheidel yuchenj_uw aravsrinivas dylan522p random_walker
GPT-4.5 sparked mixed reactions on Twitter, with @karpathy noting that poll respondents preferred GPT-4 despite his personal preference for GPT-4.5's creativity and humor. Critics like @abacaj highlighted GPT-4.5's slowness and questioned its practical value and pricing compared to other models. Performance-wise, GPT-4.5 ranks above GPT-4o but below o1 and Claude 3.5 Sonnet; Claude 3.7 outperforms it on many tasks, yet GPT-4.5 is praised for its humor and "vibes." Speculation puts GPT-4.5's size at around 5 trillion parameters. Discussions also touched on pricing disparities, with Perplexity Deep Research at $20/month versus ChatGPT Pro at $200/month. The emotional intelligence and humor of models like Claude 3.7 were also noted.
not much happened today
deepseek-r1 qwen-2.5 qwen-2.5-max deepseek-v3 deepseek-janus-pro gpt-4 nvidia anthropic openai deepseek huawei vercel bespoke-labs model-merging multimodality reinforcement-learning chain-of-thought gpu-optimization compute-infrastructure compression crypto-api image-generation saranormous zizhpan victormustar omarsar0 markchen90 sakanaailabs reach_vb madiator dain_mclau francoisfleuret garygodchaux arankomatsuzaki id_aa_carmack lavanyasant virattt
A diverse AI news roundup highlights Huawei chips alongside NVIDIA's stock rebound, new open music foundation models (a "local Suno"), and competitive models such as Qwen 2.5 Max and DeepSeek V3. The release of DeepSeek Janus Pro, a multimodal LLM with image generation capabilities, and advances in reinforcement learning and chain-of-thought reasoning are noted. Discussions include GPU rebranding with NVIDIA's H6400 GPUs, data center innovations, and enterprise AI applications such as crypto APIs in hedge funds. DeepSeek R1's capabilities and the addition of Qwen 2.5 models to applications are key highlights.
not much happened today
o1-full sora gpt-4.5 gpt-4 claude-3.5-sonnet llama-3-1-nemotron-51b llama-3-1 llama-3 nemotron-51b openai google-deepmind anthropic nvidia huggingface vision model-performance neural-architecture-search model-optimization multimodality model-release model-training reinforcement-learning image-generation lucas-beyer alexander-kolesnikov xiaohua-zhai aidan_mclau giffmana joannejang sama
OpenAI announced their "12 Days of OpenAI" event with daily livestreams and potential releases including the O1 full model, Sora video model, and GPT-4.5. Google DeepMind released the GenCast weather model capable of 15-day forecasts in 8 minutes using TPU chips, and launched Genie 2, a model generating playable 3D worlds from single images. Leading vision researchers Lucas Beyer, Alexander Kolesnikov, and Xiaohua Zhai moved from DeepMind to OpenAI, which is opening a Zürich office. Criticism arose over OpenAI's strategy and model quality compared to Anthropic and Claude 3.5 Sonnet. On Reddit, a modified llama.cpp supports Nvidia's Llama-3_1-Nemotron-51B, matching performance of larger 70B models via NAS optimization.
not much happened to end the week
gemini deepseek-r1 o1 chatgpt gpt-4 claude-3.5-sonnet o1-preview o1-mini gpt4o qwq-32b google-deepmind deeplearningai amazon tesla x-ai alibaba ollama multimodality benchmarking quantization reinforcement-learning ai-safety translation reasoning interpretability model-comparison humor yoshua-bengio kevinweil ylecun
AI News for 11/29/2024-11/30/2024 covers key updates including the Gemini multimodal model advancing in musical structure understanding, a new quantized SWE-Bench for benchmarking at 1.3 bits per task, and the launch of the DeepSeek-R1 model focusing on transparent reasoning as an alternative to o1. The establishment of the 1st International Network of AI Safety Institutes highlights global collaboration on AI safety. Industry updates feature Amazon's Olympus AI model, Tesla's Optimus, and experiments with ChatGPT as a universal translator. Community reflections emphasize the impact of large language models on daily life and medical AI applications. Discussions include scaling sparse autoencoders to GPT-4 and the need for transparency in reasoning LLMs. The report also notes humor around ChatGPT's French nickname.
Gemini (Experimental-1114) retakes #1 LLM rank with 1344 Elo
claude-3-sonnet gpt-4 gemini-1.5 claude-3.5-sonnet anthropic openai langchain meta-ai-fair benchmarking prompt-engineering rag visuotactile-perception ai-governance theoretical-alignment ethical-alignment jailbreak-robustness model-releases alignment richardmcngo andrewyng philschmid
Anthropic released a jailbreak-robustness benchmark for Claude 3.5 Sonnet, emphasizing adaptive defenses. OpenAI enhanced GPT-4 with a new RAG technique for contiguous chunk retrieval. LangChain launched Promptim for prompt optimization. Meta AI introduced NeuralFeels with neural fields for visuotactile perception. Richard Ngo (@RichardMCNgo) resigned from OpenAI, highlighting concerns about AI governance and theoretical alignment. Discussions emphasized the importance of truthful public information and ethical alignment in AI deployment. The latest Gemini update marks a new #1 LLM amid alignment challenges. The AI community continues to focus on benchmarking, prompt engineering, and alignment issues.
s{imple|table|calable} Consistency Models
llama-3-70b llama-3-405b llama-3-1 stable-diffusion-3.5 gpt-4 stability-ai tesla cerebras cohere langchain model-distillation diffusion-models continuous-time-consistency-models image-generation ai-hardware inference-speed multilingual-models yang-song
Model distillation significantly accelerates diffusion models, enabling near real-time image generation with only 1-4 sampling steps, as seen in BlinkShot and Flux Schnell. Research led by Yang Song introduced simplified continuous-time consistency models (sCMs), achieving under 10% FID difference in just 2 steps and scaling up to 1.5B parameters for higher quality. On AI hardware, Tesla is deploying a 50k H100 cluster potentially capable of completing GPT-4 training in under three weeks, while Cerebras Systems set a new inference speed record on Llama 3.1 70B with their wafer-scale AI chips. Stability AI released Stable Diffusion 3.5 and its Turbo variant, and Cohere launched new multilingual models supporting 23 languages with state-of-the-art performance. LangChain also announced ecosystem updates.
a calm before the storm
o1 o1-mini qwen2.5 gpt-4 llama-2-70b llama-7b anthropic openai alibaba microsoft blackrock groq aramco disney eth-zurich pudu-robotics slack long-context kv-cache-quantization diffusion-models reinforcement-learning robotics ai-integration multilinguality model-benchmarking model-performance model-optimization adcock_brett philschmid rohanpaul_ai jvnixon kateclarktweets sama
Anthropic is raising funds at a valuation up to $40 billion ahead of anticipated major releases. OpenAI launched new reasoning models o1 and o1-mini, with increased rate limits and a multilingual MMLU benchmark. Alibaba released the open-source Qwen2.5 model supporting 29+ languages, showing performance competitive with GPT-4 at lower cost. Microsoft and BlackRock plan to invest $30 billion in AI data centers, with Groq partnering with Aramco to build the world's largest AI inference center. Robotics advances include Disney Research and ETH Zurich's diffusion-based motion generation for robots and Pudu Robotics' semi-humanoid robot. Slack and Microsoft introduced AI-powered agents integrated into their platforms. Research highlights include long-context scaling for Llama-2-70B using Dual Chunk Attention and KV cache quantization enabling a 1-million-token context on Llama-7B models.
not much happened today
llama-3 o1 deepseek-2.5 gpt-4 claude-3.5-sonnet 3dtopia-xl cogvideox anthropic meta-ai-fair openai deepseek-ai llamaindex langchainai retrieval-augmented-generation prompt-caching multimodality multi-agent-systems reasoning diffusion-models image-to-video prompting enterprise-ai agentic-ai long-context model-evaluation caching model-cost-efficiency
Anthropic introduced a RAG technique called Contextual Retrieval that reduces retrieval failure rates by 67% using prompt caching. Meta is teasing multimodal Llama 3 ahead of Meta Connect. OpenAI is hiring for a multi-agent research team focusing on improved AI reasoning with their o1 models, which have sparked mixed reactions. DeepSeek 2.5 is noted as a cost-effective alternative to GPT-4 and Claude 3.5 Sonnet. New models like 3DTopia-XL for 3D asset generation and CogVideoX for image-to-video conversion were highlighted. Techniques to boost reasoning by re-reading questions and combining retrieval with prompt caching were shared. Industry insights emphasize the necessity of AI adoption in enterprises and the disruption of traditional ML businesses. Tools like LangChainAI's LangGraph Templates and LlamaIndex's LlamaParse Premium enhance agentic applications and multimodal content extraction. Discussions on LLM evals and caching highlight production challenges and improvements. "Companies not allowing developers to use AI are unlikely to succeed" was a key sentiment.
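For readers unfamiliar with the technique, here is a minimal sketch of the contextual-retrieval idea: each chunk is prefixed with a short LLM-generated blurb situating it in the source document before indexing, and prompt caching keeps the repeated full-document prefix cheap. The `generate_context` callable below is a hypothetical stand-in for the actual LLM call, not Anthropic's implementation.

```python
def contextualize_chunks(document: str, chunks: list[str], generate_context) -> list[str]:
    """Prefix every chunk with a short situating context before embedding/BM25 indexing.

    generate_context(document, chunk) stands in for an LLM call; with prompt
    caching, the full `document` prefix is cached so each per-chunk call only
    pays for the chunk plus the short generated context.
    """
    contextualized = []
    for chunk in chunks:
        context = generate_context(document, chunk)      # e.g. 1-2 situating sentences
        contextualized.append(f"{context}\n\n{chunk}")   # index this string, not the raw chunk
    return contextualized


# toy stand-in so the sketch runs without an API key
doc = "Q2 report for ACME Corp. Revenue grew 3% over Q1..."
chunks = ["Revenue grew 3% over Q1...", "Operating costs fell 1%..."]
fake_llm = lambda document, chunk: "This chunk is from ACME Corp's Q2 report."
print(contextualize_chunks(doc, chunks, fake_llm)[0])
```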
AIPhone 16: the Visual Intelligence Phone
reflection-70b llama-3-70b qwen-2-72b llama-3-1-405b claude gpt-4 gemini apple openai weights-biases vision video-understanding benchmarking planning model-evaluation privacy ai-integration instruction-following yann-lecun
Apple announced the new iPhone 16 lineup featuring Visual Intelligence, a new AI capability integrated with Camera Control, Apple Maps, and Siri, emphasizing privacy and default service use over third-party AI like OpenAI. Apple Photos now includes advanced video understanding with timestamp recognition. Meanwhile, Reflection-70B claims to be a top open-source model but benchmarks show it performs close to Llama 3 70B and slightly worse than Qwen 2 72B. Yann LeCun highlighted ongoing challenges with LLM planning abilities, noting models like Llama-3.1-405b and Claude show some skill, while GPT-4 and Gemini lag behind. Weights & Biases is sponsoring an event to advance LLM evaluation techniques with prizes and API access.
Ideogram 2 + Berkeley Function Calling Leaderboard V2
llama-3-70b gpt-4 phi-3.5 functionary-llama-3-70b llama-3 ideogram midjourney berkeley openai hugging-face microsoft meta-ai-fair baseten kai claude functionary function-calling benchmarking image-generation model-optimization vision multimodality model-performance fine-tuning context-windows cybersecurity code-analysis ai-assisted-development
Ideogram returns with a new image generation model featuring color palette control, a fully controllable API, and an iOS app, reaching a milestone of 1 billion images created. Meanwhile, Midjourney released a Web UI but still lacks an API. In function calling, the Berkeley Function Calling Leaderboard (BFCL) updated to BFCL V2 • Live, adding 2,251 live, user-contributed function definitions and queries to improve evaluation quality. GPT-4 leads the leaderboard, but the open-source Functionary Llama 3-70B finetune from MeetKai surpasses Claude. On AI model releases, Microsoft launched three Phi-3.5 models with impressive reasoning and context window capabilities, while Meta AI FAIR introduced UniBench, a unified benchmark suite covering over 50 vision-language model tasks. Baseten improved Llama 3 inference speed by up to 122% using Medusa. A new cybersecurity benchmark, Cybench, featuring 40 CTF tasks, was released. Additionally, Codegen was introduced as a tool for programmatic codebase analysis and AI-assisted development. "Multiple functions > parallel functions" was highlighted as a key insight in function calling.
not much happened today
grok-2 claude-3.5-sonnet claude-3.5 gpt-4 chatgpt-4o-latest anthropic x-ai google-deepmind openai mistral-ai meta-ai-fair salesforce box prompt-caching model-performance vision fine-tuning multilinguality ai-safety design-automation document-processing ai-agents ai-integration ai-job-market ai-acceleration humor demis-hassabis francois-chollet
Anthropic rolled out prompt caching in its API, reducing input costs by up to 90% and latency by 80%, enabling instant fine-tuning with longer prompts. xAI released Grok-2, a new model competing with frontier models from Google DeepMind, OpenAI, Anthropic, Mistral AI, and Meta AI FAIR, supporting vision and text inputs and integrating external image generation models. Claude 3.5 Sonnet is reported to outperform GPT-4 in coding and reasoning, while ChatGPT-4o-latest shows reasoning improvements. François Chollet proposed a theory defining intelligence as the efficiency of operationalizing past information for future tasks. The Aya project involves 3,000 collaborators building multilingual AI datasets. Demis Hassabis discussed AI hype and safe AI development in a podcast. Tools like Dora AI for Figma and Box's AI API enhance design automation and document processing. Salesforce released DEI, an open framework of AI software-engineering agents with a 55% resolve rate on SWE-Bench Lite. Industry trends highlight rapid AI integration, the importance of networking in the AI job market, and potential OpenAI GPT-4 expansion in response to competitors. Memes include humor about Apple Vision Pro.
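As a point of reference, a minimal sketch of how prompt caching is invoked through the Anthropic Python SDK: a long, rarely-changing prefix (here part of the system prompt) is marked with `cache_control` so later requests reuse it. The model id, document contents, and beta header date are illustrative; at launch the feature sat behind a prompt-caching beta header.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

long_reference_doc = "..."  # large, stable context you want cached across calls

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",              # illustrative model id
    max_tokens=512,
    system=[
        {"type": "text", "text": "You answer questions about the attached report."},
        {
            "type": "text",
            "text": long_reference_doc,
            "cache_control": {"type": "ephemeral"},  # everything up to this block gets cached
        },
    ],
    messages=[{"role": "user", "content": "Summarize the key risks."}],
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},  # beta header used at launch
)
print(response.content[0].text)
```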
a quiet weekend
sam-2 qwen2-math gpt-4 claude-3.5 figure deepmind boston-dynamics alibaba llamaindex robotics object-segmentation real-time-processing disease-prediction speech-recognition cli-tools model-performance adcock_brett rasbt hamel-husain rohanpaul_ai
Figure unveiled Figure 02, claimed as the most advanced humanoid robot, operating autonomously at BMW's Plant Spartanburg. DeepMind developed a table tennis robot achieving 100% wins against beginners and 55% against intermediates. Boston Dynamics showcased the dexterity of its fully-electric Atlas robot performing pushups and burpees. An autonomous dental robot performed the world's first dental procedure on a human, reducing a 2-hour process to 15 minutes using a 3D volumetric scanner. SAM 2 was introduced as an open model for real-time object segmentation without custom adaptation. Alibaba released Qwen2-Math, outperforming GPT-4 and Claude 3.5 in math capabilities. A new Listening-While-Speaking Language Model (LSLM) enables simultaneous listening and speaking in real-time. Researchers developed a disease prediction AI with 95% accuracy for diseases like coronary artery disease, type 2 diabetes, and breast cancer. Tools like LlamaParse CLI and MLX Whisper package enhance PDF parsing and speech recognition, with the latter running 40X faster than realtime on M1 Max. The news highlights significant advancements in robotics, AI models, and practical AI tools.
SciCode: HumanEval gets a STEM PhD upgrade
gpt-4 claude-3.5-sonnet llama-3-7b llama-3 dolphin-2.9.3-yi-1.5-34b-32k-gguf anthropic hugging-face nvidia benchmarks coding model-training gpu-optimization model-performance synthetic-data compiler-optimization zero-shot-learning yi-tay rohanpaul_ai alexalbert__ tri_dao abacaj
PhD-level benchmarks highlight the difficulty of coding scientific problems for LLMs, with GPT-4 and Claude 3.5 Sonnet scoring under 5% on the new SciCode benchmark. Anthropic doubled the max output token limit for Claude 3.5 Sonnet to 8192 tokens. The Q-GaLore method enables training LLaMA-7B on a single 16GB GPU. The Mosaic compiler now generates efficient code for NVIDIA H100 GPUs. The Dolphin 2.9.3-Yi-1.5-34B-32k-GGUF model on Hugging Face has over 111k downloads. Llama 3 shows strong performance, achieving 90% zero-shot accuracy on the MATH dataset. Discussions continue on the limitations and forms of synthetic data for model training.
RouteLLM: RIP Martian? (Plus: AINews Structured Summaries update)
gpt-4 gemma-2-27b gemma-2-9b lmsys openai llm-routing cost-efficiency model-performance model-optimization data-augmentation syntax-based-routing mixture-of-experts inference-throughput software-2.0 computer-vision karpathy bindureddy armand-joulin
LMSys introduces RouteLLM, an open-source router framework trained on preference data from Chatbot Arena, achieving cost reductions over 85% on MT Bench, 45% on MMLU, and 35% on GSM8K while maintaining 95% of GPT-4's performance. This approach surpasses previous task-specific routing by using syntax-based Mixture of Experts (MoE) routing and data augmentation, beating commercial solutions by 40%. The update highlights advances in LLM routing, cost-efficiency, and model performance optimization across multiple models rather than single-model or MoE-level improvements. Additionally, the AI Twitter recap notes the Gemma 2 model family as a top open model, the Block Transformer architecture for improved inference throughput, and a proposal for a fully Software 2.0 computer vision system by karpathy.
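The core routing idea is simple: a learned model estimates how likely the strong model is actually needed for a query, and a cost/quality threshold decides where the query goes. A toy sketch follows, with a keyword heuristic standing in for RouteLLM's preference-trained router; model names and the heuristic are illustrative only.

```python
STRONG, WEAK = "gpt-4", "mixtral-8x7b"

def win_probability(query: str) -> float:
    """Stand-in for a router trained on Chatbot Arena preference data:
    estimates the probability that the strong model's answer would be
    preferred over the weak model's for this query."""
    hard_markers = ("prove", "derive", "refactor", "multi-step", "legal")
    return 0.9 if any(m in query.lower() for m in hard_markers) else 0.2

def route(query: str, threshold: float = 0.5) -> str:
    # lowering the threshold buys quality; raising it buys cost savings
    return STRONG if win_probability(query) >= threshold else WEAK

print(route("What is the capital of France?"))          # -> mixtral-8x7b
print(route("Prove the inequality holds for all n."))   # -> gpt-4
```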
Mozilla's AI Second Act
llama-3 claude-3-opus gemini-1.5 deepseek-coder-v2 gpt-4 mozilla llamaindex anthropic etched-ai sohu deepseek openai vector-search inference-speed hardware-benchmarks context-windows open-source-models coding reasoning model-benchmarking gpu-inference agentic-ai justine-tunney stephen-hood tim-dettmers bindureddy
Mozilla showcased detailed live demos of llamafile and announced sqlite-vec for vector search integration at the AIE World's Fair. LlamaIndex launched llama-agents. Anthropic introduced new UI features and Projects for Claude with a 200K context window. Etched AI revealed Sohu, a specialized transformer inference chip claiming 500k tokens/sec and 15 agent trajectories/sec, though the benchmark claims are questioned. Tim Dettmers shared theoretical GPU inference limits of ~300k tokens/sec for 8xB200 NVLink on 70B Llama. DeepSeek Coder V2 outperforms Gemini and GPT-4 variants in coding and reasoning. The PyTorch documentary launched to little attention.
Hybrid SSM/Transformers > Pure SSMs/Pure Transformers
mamba-2-hybrid gpt-4 qwen-72b table-llava-7b nvidia lamini-ai sakana-ai luma-labs mixture-of-experts benchmarking fine-tuning multimodality text-to-video model-performance memory-optimization preference-optimization video-understanding multimodal-tables bryan-catanzaro bindureddy ylecun ctnzr corbtt realsharonzhou andrew-n-carr karpathy _akhaliq omarsar0
NVIDIA's Bryan Catanzaro highlights a new paper on Mamba models, showing that mixing Mamba and Transformer blocks outperforms either alone, with optimal attention below 20%. Mixture-of-Agents (MoA) architecture improves LLM generation quality, scoring 65.1% on AlpacaEval 2.0 versus GPT-4 Omni's 57.5%. The LiveBench AI benchmark evaluates reasoning, coding, writing, and data analysis. A hybrid Mamba-2-Hybrid model with 7% attention surpasses a Transformer on MMLU accuracy, jumping from 50% to 53.6%. GPT-4 performs better at temperature=1. Qwen 72B leads open-source models on LiveBench AI. LaminiAI Memory Tuning achieves 95% accuracy on a SQL agent task, improving over instruction fine-tuning. Sakana AI Lab uses evolutionary strategies for preference optimization. Luma Labs Dream Machine demonstrates advanced text-to-video generation. The MMWorld benchmark evaluates multimodal video understanding, and Table-LLaVa 7B competes with GPT-4V on multimodal table tasks.
The Last Hurrah of Stable Diffusion?
llama-3-8b llama-3 qwen-2 gpt-4 gpt-4o stability-ai togethercompute model-architecture fine-tuning benchmarks dataset-release model-evaluation reasoning model-training retrieval-augmented-generation multimodality emad-mostaque rohanpaul_ai fchollet mikeknoop micahgoldblum teknium1 rasbt percyliang
Stability AI launched Stable Diffusion 3 Medium with models ranging from 450M to 8B parameters, featuring the MMDiT architecture and T5 text encoder for image text rendering. The community has shown mixed reactions following the departure of key researchers like Emad Mostaque. On AI models, Llama 3 8B Instruct shows strong evaluation correlation with GPT-4, while Qwen 2 Instruct surpasses Llama 3 on MMLU benchmarks. The Mixture of Agents (MoA) framework outperforms GPT-4o on AlpacaEval 2.0. Techniques like Spectrum and QLoRA enable efficient fine-tuning with less VRAM. Research on grokking reveals transformers can transition from memorization to generalization through extended training. Benchmark initiatives include the $1M ARC Prize Challenge for AGI progress and LiveBench, a live LLM benchmark to prevent dataset contamination. The Character Codex Dataset offers open data on over 15,000 characters for RAG and synthetic data. The MLX 0.2 tool enhances LLM experience on Apple Silicon Macs with improved UI and faster retrieval-augmented generation.
Francois Chollet launches $1m ARC Prize
gpt-4 chatgpt openai apple togethercompute benchmarking agi pattern-recognition skill-acquisition privacy on-device-ai mixed-precision-quantization mixture-of-experts multimodality agentic-ai francois-chollet karpathy svpino philschmid clementdelangue sama gdb miramurati kevin-weil sarah-friar
François Chollet critiques current paths to AGI, emphasizing the importance of benchmarks that resist saturation and focus on skill acquisition and open-ended problem solving. The ARC-AGI puzzles exemplify "easy for humans, hard for AI" challenges to measure progress toward AGI. Meanwhile, Apple announces integration of ChatGPT into iOS, iPadOS, and macOS through a partnership with OpenAI, enabling AI-powered features like document summarization and photo analysis with privacy-preserving measures. Discussions highlight Apple's focus on deep AI integration and on-device models optimized with techniques like mixed-precision quantization, though some skepticism remains about their AI capabilities compared to GPT-4. Additionally, Together Compute introduces a Mixture of Agents approach achieving strong performance on AlpacaEval 2.0.
HippoRAG: First, do know(ledge) Graph
qwen-2 gpt-4 hipporag alibaba openai knowledge-graphs personalized-pagerank multi-hop-retrieval chain-of-thought implicit-reasoning sparse-autoencoders model-interpretability model-efficiency model-architecture fine-tuning reinforcement-learning rohanpaul_ai omarsar0 nabla_theta huybery
Alibaba released new open-source Qwen2 models ranging from 0.5B to 72B parameters, achieving SOTA results on benchmarks like MMLU and HumanEval. Researchers introduced Sparse Autoencoders to interpret GPT-4 neural activity, improving feature representation. The HippoRAG paper proposes a hippocampus-inspired retrieval augmentation method using knowledge graphs and Personalized PageRank for efficient multi-hop reasoning. New techniques like Stepwise Internalization enable implicit chain-of-thought reasoning in LLMs, enhancing accuracy and speed. The Buffer of Thoughts (BoT) method improves reasoning efficiency with significant cost reduction. A novel scalable MatMul-free LLM architecture competitive with SOTA Transformers at billion-parameter scale was also presented. "Single-Step, Multi-Hop retrieval" is highlighted as a key advancement in retrieval speed and cost.
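A rough sketch of the Personalized PageRank step at the heart of HippoRAG, using networkx: entities extracted from the query seed the personalization vector, PPR spreads relevance across a graph linking entities to passages, and passages are ranked by their resulting scores. The graph contents and entity-extraction step below are toy stand-ins, not the paper's pipeline.

```python
import networkx as nx

# toy knowledge graph: entity nodes linked to the passages that mention them
G = nx.Graph()
G.add_edges_from([
    ("Stanford", "passage_1"), ("Alzheimer's", "passage_1"),
    ("Stanford", "passage_2"), ("protein_X", "passage_2"),
    ("protein_X", "passage_3"), ("Alzheimer's", "passage_3"),
])

query_entities = ["Stanford", "Alzheimer's"]   # stand-in for LLM entity extraction

# personalization vector: restart mass concentrated on the query entities
personalization = {node: 0.0 for node in G}
for entity in query_entities:
    personalization[entity] = 1.0

scores = nx.pagerank(G, alpha=0.85, personalization=personalization)

passages = sorted(
    (n for n in G if n.startswith("passage_")),
    key=lambda p: scores[p],
    reverse=True,
)
print(passages)  # multi-hop-relevant passages rank highest in a single retrieval step
```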
Qwen 2 beats Llama 3 (and we don't know how)
qwen-2 llama-3 llama-3-70b gpt-4 nllb alibaba groq meta-ai-fair multilinguality benchmarking inference-speed sparse-autoencoders scaling-laws post-training instruction-following rejection-sampling execution-feedback model-release multilingual-models model-training philschmid huybery jonathanross321 awnihannun gdb nabla_theta ylecun
Alibaba released Qwen 2 models under the Apache 2.0 license, claiming to outperform Llama 3 among open models, with multilingual support for 29 languages and strong benchmark scores (MMLU 82.3, HumanEval 86.0). Groq demonstrated ultra-fast inference on Llama-3 70B at 40,792 tokens/s, processing 4 Wikipedia articles in 200 ms. Research on sparse autoencoders (SAEs) for interpreting GPT-4 neural activity showed new training methods, metrics, and scaling laws. Meta AI announced the No Language Left Behind (NLLB) model capable of high-quality translations between 200 languages, including low-resource ones. Qwen's team noted that "Our post-training phase is designed with the principle of scalable training with minimal human annotation," highlighting techniques like rejection sampling for math and execution feedback for coding.
Life after DPO (RewardBench)
gpt-3 gpt-4 gpt-5 gpt-6 llama-3-8b llama-3 claude-3 gemini x-ai openai mistral-ai anthropic cohere meta-ai-fair hugging-face nvidia reinforcement-learning-from-human-feedback direct-preference-optimization reward-models rewardbench language-model-history model-evaluation alignment-research preference-datasets personalization transformer-architecture nathan-lambert chris-manning elon-musk bindureddy rohanpaul_ai nearcyan
xAI raised $6 billion at a $24 billion valuation, positioning it among the most highly valued AI startups, with expectations to fund GPT-5 and GPT-6 class models. The RewardBench tool, developed by Nathan Lambert, evaluates reward models (RMs) for language models, showing Cohere's RMs outperforming open-source alternatives. The discussion highlights the evolution of language models from Claude Shannon's 1948 model to GPT-3 and beyond, emphasizing the role of RLHF (Reinforcement Learning from Human Feedback) and the newer DPO (Direct Preference Optimization) method. Notably, some Llama 3 8B reward model-focused models are currently outperforming GPT-4, Cohere, Gemini, and Claude on the RewardBench leaderboard, raising questions about reward hacking. Future alignment research directions include improving preference datasets, DPO techniques, and personalization in language models. The report also compares xAI's valuation with OpenAI, Mistral AI, and Anthropic, noting speculation about xAI's spending on Nvidia hardware.
Cursor reaches >1000 tok/s finetuning Llama3-70b for fast file editing
gpt-4 gpt-4o gpt-4-turbo gpt-4o-mini llama bloom stable-diffusion cursor openai anthropic google-deepmind huggingface speculative-decoding code-edits multimodality image-generation streaming tool-use fine-tuning benchmarking mmlu model-performance evaluation synthetic-data context-windows sama abacaj imjaredz erhartford alexalbert svpino maximelabonne _philschmid
Cursor, an AI-native IDE, announced a speculative edits algorithm for code editing that surpasses GPT-4 and GPT-4o in accuracy and latency, achieving speeds of over 1000 tokens/s on a 70b model. OpenAI released GPT-4o with multimodal capabilities including audio, vision, and text, noted to be 2x faster and 50% cheaper than GPT-4 turbo, though with mixed coding performance. Anthropic introduced streaming, forced tool use, and vision features for developers. Google DeepMind unveiled Imagen Video and Gemini 1.5 Flash, a small model with a 1M-context window. HuggingFace is distributing $10M in free GPUs for open-source AI models like Llama, BLOOM, and Stable Diffusion. Evaluation insights highlight challenges with LLMs on novel problems and benchmark saturation, with new benchmarks like MMLU-Pro showing significant drops in top model performance.
OpenAI's PR Campaign?
alphafold-3 xlstm gpt-4 openai microsoft google-deepmind memory-management model-spec scaling multimodality performance transformers dynamic-memory model-architecture demis-hassabis sama joanne-jang omarsar0 arankomatsuzaki drjimfan
OpenAI faces user data deletion backlash over its new partnership with StackOverflow amid GDPR complaints and US newspaper lawsuits, while addressing election year concerns with efforts like the Media Manager tool for content opt-in/out by 2025 and source link attribution. Microsoft develops a top-secret airgapped GPT-4 AI service for US intelligence agencies. OpenAI releases the Model Spec outlining responsible AI content generation policies, including NSFW content handling and profanity use, emphasizing clear distinctions between bugs and design decisions. Google DeepMind announces AlphaFold 3, a state-of-the-art model predicting molecular structures with high accuracy, showcasing cross-domain AI techniques. New research on xLSTM proposes scaling LSTMs to billions of parameters, competing with transformers in performance and scaling. Microsoft introduces vAttention, a dynamic memory management method for efficient large language model serving without PagedAttention.
Kolmogorov-Arnold Networks: MLP killers or just spicy MLPs?
gpt-5 gpt-4 dall-e-3 openai microsoft learnable-activations mlp function-approximation interpretability inductive-bias-injection b-splines model-rearrangement parameter-efficiency ai-generated-image-detection metadata-standards large-model-training max-tegmark ziming-liu bindureddy nptacek zacharynado rohanpaul_ai svpino
Ziming Liu, a grad student of Max Tegmark, published a paper on Kolmogorov-Arnold Networks (KANs), claiming they outperform MLPs in interpretability, inductive bias injection, function approximation accuracy, and scaling, despite being 10x slower to train but 100x more parameter efficient. KANs use learnable activation functions modeled by B-splines on edges rather than fixed activations on nodes. However, it was later shown that KANs can be mathematically rearranged back into MLPs with similar parameter counts, sparking debate on their interpretability and novelty. Meanwhile, on AI Twitter, there is speculation about a potential GPT-5 release with mixed impressions, OpenAI's adoption of the C2PA metadata standard for detecting AI-generated images with high accuracy for DALL-E 3, and Microsoft training a large 500B parameter model called MAI-1, potentially previewed at Build conference, signaling increased competition with OpenAI. "OpenAI's safety testing for GPT-4.5 couldn't finish in time for Google I/O launch" was also noted.
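To make the architectural contrast concrete, here is a heavily simplified KAN-style layer in PyTorch: every edge carries its own learnable univariate function, parameterized here as a mixture of Gaussian RBFs rather than the B-splines used in the paper, and outputs are sums of these edge functions with no fixed node activations. This is an illustrative sketch under those simplifications, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SimpleKANLayer(nn.Module):
    """Simplified KAN layer: each edge (input i -> output j) applies a learnable
    univariate function, modeled as a mixture of Gaussian RBFs (a stand-in for
    the paper's B-splines); the outputs are summed over the inputs."""
    def __init__(self, in_dim, out_dim, num_basis=8, grid_range=(-2.0, 2.0)):
        super().__init__()
        self.register_buffer("centers", torch.linspace(*grid_range, num_basis))
        self.width = (grid_range[1] - grid_range[0]) / num_basis
        # one coefficient vector per edge: (out_dim, in_dim, num_basis)
        self.coef = nn.Parameter(torch.randn(out_dim, in_dim, num_basis) * 0.1)

    def forward(self, x):  # x: (batch, in_dim)
        # evaluate every basis function at every input: (batch, in_dim, num_basis)
        basis = torch.exp(-((x.unsqueeze(-1) - self.centers) / self.width) ** 2)
        # phi[b, o] = sum over inputs i and bases k of coef[o, i, k] * basis[b, i, k]
        return torch.einsum("bik,oik->bo", basis, self.coef)

net = nn.Sequential(SimpleKANLayer(2, 5), SimpleKANLayer(5, 1))
print(net(torch.randn(16, 2)).shape)  # -> torch.Size([16, 1])
```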
DeepSeek-V2 beats Mixtral 8x22B with >160 experts at HALF the cost
deepseek-v2 llama-3-120b llama-3-400b gpt-4 mistral phi claude gemini mai-1 med-gemini deepseek-ai mistral-ai microsoft openai scale-ai tesla nvidia google-deepmind mixture-of-experts multi-head-attention model-inference benchmarking overfitting robotics teleoperation open-source multimodality hallucination-detection fine-tuning medical-ai model-training erhartford maximelabonne bindureddy adcock_brett drjimfan clementdelangue omarsar0 rohanpaul_ai
DeepSeek V2 introduces a new state-of-the-art MoE model with 236B parameters and a novel Multi-Head Latent Attention mechanism, achieving faster inference and surpassing GPT-4 on AlignBench. Llama 3 120B shows strong creative writing skills, while Microsoft is reportedly developing a 500B parameter LLM called MAI-1. Research from Scale AI highlights overfitting issues in models like Mistral and Phi, whereas GPT-4, Claude, Gemini, and Llama maintain benchmark robustness. In robotics, Tesla Optimus advances with superior data collection and teleoperation, LeRobot marks a move toward open-source robotics AI, and Nvidia's DrEureka automates robot skill training. Multimodal LLM hallucinations are surveyed with new mitigation strategies, and Google's Med-Gemini achieves SOTA on medical benchmarks with fine-tuned multimodal models.
$100k to predict LMSYS human preferences in a Kaggle contest
llama-3-70b llama-3 gpt-4 claude-3-opus prometheus-2 groq openai lmsys scale-ai ai2 nvidia benchmarking datasets fine-tuning reinforcement-learning model-alignment hallucination parameter-efficient-fine-tuning scalable-training factuality chatbot-performance bindureddy drjimfan percyliang seungonekim mobicham clefourrier
Llama 3 models are making breakthroughs with Groq's 70B model achieving record low costs per million tokens. A new Kaggle competition offers a $100,000 prize to develop models predicting human preferences from a dataset of over 55,000 user-LLM conversations. Open source evaluator LLMs like Prometheus 2 outperform proprietary models such as GPT-4 and Claude 3 Opus in judgment tasks. New datasets like WildChat1M provide over 1 million ChatGPT interaction logs with diverse and toxic examples. Techniques like LoRA fine-tuning show significant performance gains, and NVIDIA's NeMo-Aligner toolkit enables scalable LLM alignment across hundreds of GPUs. Factuality-aware alignment methods are proposed to reduce hallucinations in LLM outputs.
Evals: The Next Generation
gpt-4 gpt-5 gpt-3.5 phi-3 mistral-7b llama-3 scale-ai mistral-ai reka-ai openai moderna sanctuary-ai microsoft mit meta-ai-fair benchmarking data-contamination multimodality fine-tuning ai-regulation ai-safety ai-weapons neural-networks model-architecture model-training model-performance robotics activation-functions long-context sam-altman jim-fan
Scale AI highlighted issues with data contamination in benchmarks like MMLU and GSM8K, proposing a new benchmark where Mistral overfits and Phi-3 performs well. Reka released the VibeEval benchmark for multimodal models addressing multiple choice benchmark limitations. Sam Altman of OpenAI discussed GPT-4 as "dumb" and hinted at GPT-5 with AI agents as a major breakthrough. Researchers jailbroke GPT-3.5 via fine-tuning. Global calls emerged to ban AI-powered weapons, with US officials urging human control over nuclear arms. Ukraine launched an AI consular avatar, while Moderna partnered with OpenAI for medical AI advancements. Sanctuary AI and Microsoft collaborate on AI for general-purpose robots. MIT introduced Kolmogorov-Arnold networks with improved neural network efficiency. Meta AI is training Llama 3 models with over 400 billion parameters, featuring multimodality and longer context.
LLMs-as-Juries
gpt-4 gpt-3.5 sdxl ponyxl openai cohere financial-times memory training-data model-usage-limits data-cleansing ai-voice-assistants interface-agents image-generation model-extensions multi-agent-systems
OpenAI has rolled out the memory feature to all ChatGPT Plus users and partnered with the Financial Times to license content for AI training. Discussions on OpenAI's profitability arise due to paid training data licensing and potential GPT-4 usage limit reductions. Users report issues with ChatGPT's data cleansing after the memory update. Tutorials and projects include building AI voice assistants and interface agents powered by LLMs. In Stable Diffusion, users seek realistic SDXL models comparable to PonyXL, and new extensions like Hi-diffusion and Virtuoso Nodes v1.1 enhance ComfyUI with advanced image generation and Photoshop-like features. Cohere finds that multiple agents outperform single agents in LLM judging tasks, highlighting advances in multi-agent systems.
FineWeb: 15T Tokens, 12 years of CommonCrawl (deduped and filtered, you're welcome)
llama-3-70b llama-3 wizardlm-2-8x22b claude-opus mistral-8x7b gpt-4 huggingface meta-ai-fair dbrx reka-ai mistral-ai lmsys openai datasets benchmarking quantization zero-shot-learning reasoning code-error-detection token-generation security
2024 has seen a significant increase in dataset sizes for training large language models, with RedPajama 2 offering up to 30T tokens, DBRX at 12T tokens, Reka Core/Flash/Edge with 5T tokens, and Llama 3 trained on 15T tokens. Hugging Face released an open dataset containing 15T tokens from 12 years of filtered CommonCrawl data, enabling training of models like Llama 3 if compute resources are available. On Reddit, WizardLM-2-8x22b outperformed other open LLMs including Llama-3-70b-instruct in reasoning and math benchmarks. Claude Opus demonstrated strong zero-shot code error spotting, surpassing Llama 3. Benchmarks revealed limitations in the LMSYS chatbot leaderboard due to instruction-tuned models gaming the system, and a new RAG benchmark showed Llama 3 70B underperforming compared to GPT-4, while Mistral 8x7B remained strong. Efficient quantized versions of Llama 3 models are available on Hugging Face, with users reporting token generation limits around 9600 tokens on a 3090 GPU. Safety concerns include a UK sex offender being banned from using AI tools and GPT-4 demonstrating an 87% success rate at exploiting real-world vulnerabilities.
Multi-modal, Multi-Aspect, Multi-Form-Factor AI
gpt-4 idefics-2-8b mistral-instruct apple-mlx gpt-5 reka-ai cohere google rewind apple mistral-ai microsoft paypal multimodality foundation-models embedding-models gpu-performance model-comparison enterprise-data open-source performance-optimization job-impact agi-criticism technical-report arthur-mensch dan-schulman chris-bishop
Between April 12 and 15, Reka launched Reka Core, a new GPT-4-class multimodal foundation model, with a detailed technical report described as "full Shazeer." Cohere Compass introduced a foundation embedding model for indexing and searching multi-aspect enterprise data like emails and invoices. The open-source IDEFICS 2-8B model continues the reproduction of Google's Flamingo multimodal model. Rewind pivoted to a multi-platform app called Limitless, moving away from spyware. Reddit discussions highlighted Apple MLX outperforming Ollama and Mistral Instruct on M2 Ultra GPUs, GPU choices for LLMs and Stable Diffusion, and AI-human comparisons by Microsoft Research's Chris Bishop. Former PayPal CEO Dan Schulman predicted GPT-5 will drastically reduce job scopes by 80%. Mistral CEO Arthur Mensch criticized the obsession with AGI as "creating God."
ReALM: Reference Resolution As Language Modeling
flan-t5 gpt-4 apple openai hugging-face stability-ai reference-resolution finetuning quantization retrieval-augmented-generation open-source coding-agents podcast-generation image-generation ai-industry-trends takuto-takizawa
Apple is advancing in AI with a new approach called ReALM: Reference Resolution As Language Modeling, which improves understanding of ambiguous references using three contexts and finetunes a smaller FLAN-T5 model that outperforms GPT-4 on this task. In Reddit AI news, an open-source coding agent SWE-agent achieves 12.29% on the SWE-bench benchmark, and RAGFlow introduces a customizable retrieval-augmented generation engine. A new quantization method, QuaRot, enables efficient 4-bit inference. AI applications include a t-shirt design generator, podgenai for GPT-4 based podcast generation, and an open-source model from HuggingFace that runs without a GPU. Industry discussions focus on the impact of large language models on the AI field and efforts to decentralize AI development. Takuto Takizawa joins Stability AI Japan as Head of Sales & Partnerships.
DBRX: Best open model (just not most efficient)
dbrx grok mixtral llama-2 mpt-7b gpt-4 databricks hugging-face mistral-ai mosaicml openai mixture-of-experts model-efficiency tokenization model-training code-generation model-architecture open-source-models benchmarking fine-tuning
Databricks Mosaic has released a new open-source model called DBRX that outperforms Grok, Mixtral, and Llama 2 on evaluations while being about 2x more efficient than Llama 2 and Grok. The model was trained on 12 trillion tokens using 3,000 H100 GPUs over 2 months, with an estimated compute cost of $10 million. It uses OpenAI's 100k-vocabulary tiktoken tokenizer and shows strong zero-shot code generation performance, even beating GPT-4 on the HumanEval benchmark. DBRX also upstreamed work to the MegaBlocks open-source project. Despite its scale and efficiency, DBRX's performance on MMLU is only slightly better than Mixtral's, raising questions about its scaling efficiency. The focus of DBRX is on enabling users to train models efficiently, with MoE training being about 2x more FLOP-efficient than dense models, achieving similar quality with nearly 4x less compute than previous MPT models. This release is part of the ongoing competition for open-source AI leadership, including models like Dolly, MPT, and Mistral. "If it activates 36B params, the model's perf should be equivalent to a 72B dense model or even 80B," says Qwen's tech lead.
Andrew likes Agents
gpt-3.5 gpt-4 cyberrealistic_v40 platypus-xl sdxl-lightning openai stability-ai agents human-eval-benchmark fine-tuning local-llm-deployment inference-speed image-generation lora upscaling workflow-optimization andrew-ng lilian-weng emad
Andrew Ng's The Batch writeup on Agents highlighted the significant improvement in coding benchmark performance when using an iterative agent workflow, with GPT-3.5 wrapped in an agent loop achieving up to 95.1% correctness on HumanEval, surpassing GPT-4 zero-shot at 67.0%. The report also covers new developments in Stable Diffusion models like Cyberrealistic_v40, Platypus XL, and SDXL Lightning for Naruto-style image generation, alongside innovations in LoRA and upscaling techniques. Discussions on local LLM deployment and optimization focus on hardware setups and finetuning strategies for efficient inference and multi-user serving. Emad's departure from Stability AI and new Sora videos from OpenAI were also noted.
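The agentic pattern behind those HumanEval numbers is essentially a generate-test-reflect loop. A minimal sketch follows, with `generate` standing in for any chat-model call (GPT-3.5 in the writeup); the prompt wording and toy model are illustrative only.

```python
def iterative_code_agent(task: str, tests: list[tuple], generate, max_iters: int = 5) -> str:
    """generate(prompt) -> Python source defining solve(); stands in for an LLM call."""
    feedback, code = "", ""
    for _ in range(max_iters):
        prompt = f"Task: {task}\nDefine solve(x).\nFeedback from last attempt: {feedback or 'none'}"
        code = generate(prompt)
        try:
            namespace: dict = {}
            exec(code, namespace)                       # run the candidate solution
            for x, expected in tests:
                assert namespace["solve"](x) == expected, f"solve({x!r}) != {expected!r}"
            return code                                 # all tests pass: done
        except Exception as exc:                        # reflect: feed the error back in
            feedback = f"{type(exc).__name__}: {exc}"
    return code

# toy stand-in model that "fixes" its answer once it sees an error message
def toy_model(prompt: str) -> str:
    if "none" in prompt:                                # first attempt: buggy
        return "def solve(x):\n    return x + 1"
    return "def solve(x):\n    return x * 2"            # corrected after seeing the assertion error

print(iterative_code_agent("double a number", [(3, 6)], toy_model))
```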
World_sim.exe
gpt-4 gpt-4o grok-1 llama-cpp claude-3-opus claude-3 gpt-5 nvidia nous-research stability-ai hugging-face langchain anthropic openai multimodality foundation-models hardware-optimization model-quantization float4 float6 retrieval-augmented-generation text-to-video prompt-engineering long-form-rag gpu-optimization philosophy-of-ai agi-predictions jensen-huang yann-lecun sam-altman
NVIDIA announced Project GR00T, a foundation model for humanoid robot learning using multimodal instructions, built on their tech stack including Isaac Lab, OSMO, and Jetson Thor. They revealed the DGX Grace-Blackwell GB200 with over 1 exaflop of compute, capable of training a GPT-4-scale 1.8T-parameter model in 90 days on 2,000 Blackwell GPUs. Jensen Huang confirmed GPT-4 has 1.8 trillion parameters. The new GB200 GPU supports float4/6 precision at ~3 bits per parameter and achieves 40,000 TFLOPs on fp4 with 2x sparsity.
Open source highlights include the release of Grok-1, a 314B-parameter model, and Stability AI's SV3D, an open model generating 3D orbital videos from single images. Nous Research collaborated on implementing steering vectors in llama.cpp.
In Retrieval Augmented Generation (RAG), a new 5.5-hour tutorial builds a pipeline using open-source HF models, and LangChain released a video on query routing and announced integration with NVIDIA NIM for GPU-optimized LLM inference.
Prominent opinions include Yann LeCun distinguishing language from other cognitive abilities, Sam Altman predicting AGI arrival in 6 years with a leap from GPT-4 to GPT-5 comparable to GPT-3 to GPT-4, and discussions on the philosophical status of LLMs like Claude. There is also advice against training models from scratch for most companies.
The world's first fully autonomous AI Engineer
gpt-4 devin cognition-labs openai reinforcement-learning fine-tuning long-term-reasoning planning ai-agents software-engineering model-integration asynchronous-chat ide agentic-ai patrick-collison fred-ehrsam tim-dettmers
Cognition Labs's Devin is highlighted as a potentially groundbreaking AI software engineer agent capable of learning unfamiliar technologies, addressing bugs, deploying frontend apps, and fine-tuning its own AI models. It integrates OpenAI's GPT-4 with reinforcement learning and features tools like asynchronous chat, browser, shell access, and an IDE. The system claims advanced long-term reasoning and planning abilities, attracting praise from investors like Patrick Collison and Fred Ehrsam. The technology is noted for its potential as one of the most advanced AI agents, sparking excitement about agents and AGI.
Fixing Gemma
gemma claude-3-opus claude-3 mistral-large gpt-4 google unsloth anthropic mistral-ai finetuning numerical-precision benchmarking structured-data-extraction adaptive-equalizer information-theory hallucination-detection model-stability daniel-han yann-lecun francois-chollet arav-srinivas _aidan_clark_
Google's Gemma model was found unstable for finetuning until Daniel Han from Unsloth AI fixed 8 bugs, improving its implementation. Yann LeCun explained technical details of a pseudo-random bit sequence for adaptive equalizers, while François Chollet discussed the low information bandwidth of the human visual system. Aravind Srinivas reported that Claude 3 Opus showed no hallucinations in extensive testing, outperforming GPT-4 and Mistral-Large in benchmarks. Reflections from Yann LeCun highlight ongoing AI progress toward human-level intelligence. The community is shifting pipelines to work better with Claude models, and emotional experiences in ML development were shared by Aidan Clark.
FSDP+QLoRA: the Answer to 70b-scale AI for desktop class GPUs
qlora fsdp inflection-2.5 gpt-4 answer.ai hugging-face meta-ai-fair nvidia inflectionai model-training quantization memory-optimization gradient-checkpointing cpu-offloading fine-tuning model-sharding reinforcement-learning chain-of-thought benchmarking jeremy_howard tim_dettmers yann_lecun
Jeremy Howard and collaborators released a new tool combining FSDP, QLoRA, and HQQ to enable training 70B-parameter models on affordable consumer GPUs like RTX 4090s with only 24GB of VRAM, overcoming traditional memory constraints that required expensive data center GPUs costing over $150k. The approach shards quantized models across multiple GPUs and uses techniques like gradient checkpointing and CPU offloading to achieve efficient training on desktop-class hardware. The blogpost details challenges and solutions in integrating these methods, highlighting a significant cost reduction from $150k to under $2.5k for training large language models. Additionally, Twitter recaps mention Inflection AI's Inflection-2.5 model rivaling GPT-4 in benchmarks with less compute, and Grok improving speed by 3x. Yann LeCun discusses multi-step reasoning training for LLMs.
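A hedged sketch of the QLoRA side of that recipe using Hugging Face transformers/peft/bitsandbytes: the base weights load in 4-bit NF4, gradient checkpointing is enabled, and small LoRA adapters are the only trainable parameters. The `bnb_4bit_quant_storage` setting is the detail that lets FSDP shard the quantized weights; the actual Answer.AI release adds its own FSDP wrapping and CPU-offload logic on top, which this sketch omits. Model id and LoRA hyperparameters are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_storage=torch.bfloat16,   # store quantized blocks in a shardable dtype for FSDP
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",             # illustrative; any causal LM works
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
)
model.gradient_checkpointing_enable()        # trade recompute for activation memory

lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)          # only the LoRA adapters require gradients
model.print_trainable_parameters()
```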
Inflection-2.5 at 94% of GPT4, and Pi at 6m MAU
inflection-2.5 claude-3-sonnet claude-3-opus gpt-4 yi-9b mistral inflection anthropic perplexity-ai llamaindex mistral-ai langchain retrieval-augmented-generation benchmarking ocr structured-output video-retrieval knowledge-augmentation planning tool-use evaluation code-benchmarks math-benchmarks mustafa-suleyman amanda-askell jeremyphoward abacaj omarsar0
Mustafa Suleyman announced Inflection 2.5, which achieves more than 94% the average performance of GPT-4 despite using only 40% the training FLOPs. Pi's user base is growing about 10% weekly, with new features like realtime web search. The community noted similarities between Inflection 2.5 and Claude 3 Sonnet. Claude 3 Opus outperformed GPT-4 in a 1.5:1 vote and is now the default for Perplexity Pro users. Anthropic added experimental tool calling support for Claude 3 via LangChain. LlamaIndex released LlamaParse JSON Mode for structured PDF parsing and added video retrieval via VideoDB, enabling retrieval-augmented generation (RAG) pipelines. A paper proposed knowledge-augmented planning for LLM agents. New benchmarks like TinyBenchmarks and the Yi-9B model release show strong code and math performance, surpassing Mistral.
Not much happened today
claude-3 claude-3-opus claude-3-sonnet gpt-4 gemma-2b anthropic perplexity langchain llamaindex cohere accenture mistral-ai snowflake together-ai hugging-face european-space-agency google gpt4all multimodality instruction-following out-of-distribution-reasoning robustness enterprise-ai cloud-infrastructure open-datasets model-deployment model-discoverability generative-ai image-generation
Anthropic released Claude 3, replacing Claude 2.1 as the default on Perplexity AI, with Claude 3 Opus surpassing GPT-4 in capability. Debate continues on whether Claude 3's performance stems from emergent properties or pattern matching. LangChain and LlamaIndex added support for Claude 3 enabling multimodal and tool-augmented applications. Despite progress, current models still face challenges in out-of-distribution reasoning and robustness. Cohere partnered with Accenture for enterprise AI search, while Mistral AI and Snowflake collaborate to provide LLMs on Snowflake's platform. Together AI Research integrates Deepspeed innovations to accelerate generative AI infrastructure. Hugging Face and the European Space Agency released a large earth observation dataset, and Google open sourced Gemma 2B, optimized for smartphones via the MLC-LLM project. GPT4All improved model discoverability for open models. The AI community balances excitement over new models with concerns about limitations and robustness, alongside growing enterprise adoption and open-source contributions. Memes and humor continue to provide social commentary.
Claude 3 just destroyed GPT 4 (see for yourself)
claude-3 claude-3-opus claude-3-sonnet claude-3-haiku gpt-4 anthropic amazon google claude-ai multimodality vision long-context model-alignment model-evaluation synthetic-data structured-output instruction-following model-speed cost-efficiency benchmarking safety mmitchell connor-leahy
Claude 3 from Anthropic launches in three sizes: Haiku (small, unreleased), Sonnet (medium, default on claude.ai, AWS, and GCP), and Opus (large, on Claude Pro). Opus outperforms GPT-4 on key benchmarks like GPQA, impressing benchmark authors. All models support multimodality with advanced vision capabilities, including converting a 2-hour video into a blog post. Claude 3 offers improved alignment, fewer refusals, and extended context length up to 1 million tokens with near-perfect recall. Haiku is noted for speed and cost-efficiency, processing dense research papers in under three seconds. The models excel at following complex instructions and producing structured outputs like JSON. Safety improvements reduce refusal rates, though some criticism remains from experts. Claude 3 is trained on synthetic data and shows strong domain-specific evaluation results in finance, medicine, and philosophy.
Welcome Interconnects and OpenRouter
mistral-large miqu mixtral gpt-4 mistral-7b mistral-ai openai perplexity-ai llamaindex qwen langchain model-comparison model-optimization quantization role-playing story-writing code-clarity ai-assisted-decompilation asynchronous-processing quantum-computing encoder-based-diffusion open-source hardware-experimentation rag-systems nathan-lambert alex-atallah
Analysis of 22 Discord guilds, 349 channels, and 12,885 messages revealed active discussions on model comparisons and optimizations involving Mistral AI, Miqu, and GGUF-quantized models. Highlights include comparing Mistral Large with GPT-4, focusing on cost-effectiveness and performance, and exploring quantization techniques like GPTQ and QLoRA to reduce VRAM usage. Advanced applications such as role-playing, story-writing, code clarity, and AI-assisted decompilation were emphasized, alongside development of tools like an asynchronous summarization script for Mistral 7B. The intersection of quantum computing and AI was discussed, including DARPA-funded projects and encoder-based diffusion techniques for image processing. Community efforts featured new Spanish LLM announcements, hardware experimentation, and open-source initiatives, with platforms like Perplexity AI and LlamaIndex noted for innovation and integration. Speculation about Mistral AI's open-source commitment and tools like R2R for rapid RAG deployment highlighted the collaborative spirit.
Karpathy emerges from stealth?
mistral-7b mixtral-8x7b zephyr-7b gpt-4 llama-2 intel mistral-ai audiogen thebloke tokenization quantization model-optimization fine-tuning model-merging computational-efficiency memory-optimization retrieval-augmented-generation multi-model-learning meta-reasoning dataset-sharing open-source ethical-ai community-collaboration andrej-karpathy
Andrej Karpathy released a comprehensive 2-hour tutorial on tokenization, detailing techniques up to GPT-4's tokenizer and noting the complexity of Llama 2 tokenization with SentencePiece. Discussions in AI Discord communities covered model optimization and efficiency, focusing on quantization of models like Mistral 7B and Zephyr-7B to reduce memory usage for consumer GPUs, including Intel's new weight-only quantization algorithm. Efforts to improve computational efficiency included selective augmentation reducing costs by 57.76% and memory token usage versus kNN for Transformers. Challenges in hardware compatibility and software issues were shared, alongside fine-tuning techniques such as LoRA and model merging. Innovative applications of LLMs in retrieval-augmented generation (RAG), multi-model learning, and meta-reasoning were explored. The community emphasized dataset sharing, open-source releases like SDXL VAE encoded datasets and Audiogen AI codecs, and ethical AI use with censorship and guardrails. Collaboration and resource sharing remain strong in these AI communities.
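The heart of that tutorial is byte-pair encoding: start from raw bytes and repeatedly merge the most frequent adjacent pair into a new token id. A minimal training loop in the spirit of the lecture is sketched below (this is an illustrative reimplementation, not Karpathy's code).

```python
from collections import Counter

def train_bpe(text: str, num_merges: int) -> dict:
    ids = list(text.encode("utf-8"))           # start from raw bytes, as GPT-style tokenizers do
    merges, next_id = {}, 256
    for _ in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))      # count adjacent pairs
        if not pairs:
            break
        pair = pairs.most_common(1)[0][0]       # most frequent pair becomes a new token
        merges[pair] = next_id
        new_ids, i = [], 0
        while i < len(ids):                     # replace every occurrence of the pair
            if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
                new_ids.append(next_id)
                i += 2
            else:
                new_ids.append(ids[i])
                i += 1
        ids = new_ids
        next_id += 1
    return merges

print(train_bpe("aaabdaaabac" * 10, num_merges=5))
```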
The Dissection of Smaug (72B)
smaug-72b qwen-1.0 qwen-1.5 gpt-4 mistral-7b miqumaid wizardlm_evol_instruct_v2_196k openhermes-2.5 abacus-ai hugging-face nous-research laion thebloke lm-studio intel nvidia elevenlabs fine-tuning model-merging quantization web-ui model-conversion hardware-setup privacy image-generation optical-character-recognition prompt-engineering bindureddy
Abacus AI launched Smaug 72B, a large finetune of Qwen 1.0, which remains unchallenged on the Hugging Face Open LLM Leaderboard despite skepticism from Nous Research. LAION introduced a local voice assistant model named Bud-E with a notable demo. The TheBloke Discord community discussed model performance trade-offs between large models like GPT-4 and smaller quantized models, fine-tuning techniques using datasets like WizardLM_evol_instruct_V2_196k and OpenHermes-2.5, and challenges in web UI development and model merging involving Mistral-7b and MiquMaid. The LM Studio Discord highlighted issues with model conversion from PyTorch to gguf, hardware setups involving Intel Xeon CPUs and Nvidia P40 GPUs, privacy concerns, and limitations in image generation and web UI availability.
MetaVoice & RIP Bard
mixtral nous-mixtral-dpo miqu-70b gpt-4 llama-2-70b-instruct llama-2 llama-2-70b llama-2-70b-instruct coqui metavoice google openai thebloke text-to-speech voice-cloning longform-synthesis prompt-engineering direct-preference-optimization lora-fine-tuning transformers gpu-acceleration apple-silicon content-authenticity metadata ai-censorship open-source-ai model-comparison usability model-limitations
Coqui, a TTS startup that recently shut down, has inspired a new TTS model from a small startup called MetaVoice, supporting voice cloning and longform synthesis. Google discontinued the Bard brand in favor of Gemini. On TheBloke Discord, discussions focused on AI training with models like Mixtral, Nous Mixtral DPO, and Miqu 70B, comparing them to OpenAI's GPT models, and debated prompt engineering, lorebooks, and removing safety features via LoRA fine-tuning on models such as Llama 2 70B Instruct. Technical topics included transformer layer offloading limitations and adapting LLaMA 2 for Apple Silicon. On OpenAI Discord, DALL-E images now include C2PA metadata for content authenticity, sparking debates on AI censorship, metadata manipulation, and open-source AI models versus commercial giants like GPT-4. Users discussed GPT-4 usability, limitations, and practical applications.
RWKV "Eagle" v5: Your move, Mamba
rwkv-v5 mistral-7b miqu-1-70b mistral-medium llama-2 mistral-instruct-v0.2 mistral-tuna llama-2-13b kunoichi-dpo-v2-7b gpt-4 eleutherai mistral-ai hugging-face llamaindex nous-research rwkv lmsys fine-tuning multilinguality rotary-position-embedding model-optimization model-performance quantization speed-optimization prompt-engineering model-benchmarking reinforcement-learning andrej-karpathy
RWKV v5 Eagle was released with better-than-Mistral-7B evaluation results, trading some English performance for multilingual capabilities. The mysterious miqu-1-70b model sparked debate about its origins, possibly a leak or distillation of Mistral Medium or a fine-tuned Llama 2. Discussions highlighted fine-tuning techniques, including the effectiveness of 1,000 high-quality prompts over larger mixed-quality datasets, and tools like DeepSpeed, Axolotl, and QLoRA. The Nous Research AI community emphasized the impact of Rotary Position Embedding (RoPE) theta settings on LLM extrapolation, improving models like Mistral Instruct v0.2. Speed improvements in Mistral Tuna kernels reduced token processing costs, enhancing efficiency. The launch of Eagle 7B with 7.52B parameters showcased strong multilingual performance, surpassing other 7B-class models.
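The RoPE theta discussion comes down to one formula: rotation frequencies scale as theta^(-2i/d), so a larger base theta slows the rotations and keeps positions distinguishable over longer contexts. A small sketch of computing those angles (standard RoPE math, not any particular model's code):

```python
import torch

def rope_angles(head_dim: int, max_pos: int, theta: float = 10_000.0):
    # inverse frequencies: theta^(-2i/d) for i = 0 .. d/2 - 1
    inv_freq = 1.0 / (theta ** (torch.arange(0, head_dim, 2).float() / head_dim))
    positions = torch.arange(max_pos).float()
    angles = torch.outer(positions, inv_freq)        # (max_pos, head_dim/2)
    return torch.cos(angles), torch.sin(angles)      # applied as pairwise rotations to q/k

# raising theta (e.g. 10k -> 1M) slows the rotation, which is why higher
# RoPE theta settings help models extrapolate to longer contexts
cos_base, _ = rope_angles(128, 32_768, theta=10_000.0)
cos_long, _ = rope_angles(128, 32_768, theta=1_000_000.0)
print(cos_base.shape, cos_long.shape)
```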
GPT4Turbo A/B Test: gpt-4-1106-preview
gpt-4-turbo gpt-4 gpt-3.5 openhermes-2.5-mistral-7b-4.0bpw exllamav2 llama-2-7b-chat mistral-instruct-v0.2 mistrallite llama2 openai huggingface thebloke nous-research mistral-ai langchain microsoft azure model-loading rhel dataset-generation llm-on-consoles fine-tuning speed-optimization api-performance prompt-engineering token-limits memory-constraints text-generation nlp-tools context-window-extension sliding-windows rope-theta non-finetuning-context-extension societal-impact
OpenAI released a new GPT-4 Turbo version, prompting a natural experiment in summarization comparing the November 2023 and January 2024 versions. The TheBloke Discord discussed troubleshooting model-loading errors with OpenHermes-2.5-Mistral-7B-4.0bpw and exllamav2, debates on RHEL in ML, dataset generation for understanding GPT flaws, and running LLMs like Llama and Mistral on consoles. LangChain fine-tuning challenges for Llama 2 were also noted. The OpenAI Discord highlighted GPT-4 speed inconsistencies, API vs. web performance, prompt engineering with GPT-3.5 and GPT-4 Turbo, and DALL-E typo issues in image text. Discussions included NLP tools like semantic-text-splitter and collaboration concerns with GPT-4 Vision on Azure. The Nous Research AI Discord focused on extending context windows with Mistral Instruct v0.2, MistralLite, and LLaMA-2-7B-Chat reaching a 16,384-token context, plus alternatives like SelfExtend for context extension without fine-tuning. The societal impact of AI technology was also considered.
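As a rough illustration of the SelfExtend idea mentioned above, the toy function below remaps relative positions so that nearby keys keep their exact distances while distant keys are bucketed into coarse groups. The neighbor window and group size are illustrative values, not the paper's defaults; this is a sketch of the mechanism, not a drop-in implementation.

```python
def self_extend_rel_pos(q_pos: int, k_pos: int, neighbor: int = 512, group: int = 4) -> int:
    """Toy relative-position remapping in the spirit of SelfExtend.

    Keys within `neighbor` tokens of the query keep their exact relative
    distance (normal attention); keys farther away are merged into coarse
    groups of size `group` (grouped attention), so the largest relative
    position stays near the pretrained window and no fine-tuning is needed.
    """
    rel = q_pos - k_pos
    if rel <= neighbor:
        return rel
    return neighbor + (rel - neighbor) // group

# A 16k-token lookback is squeezed into a far smaller effective distance:
print(self_extend_rel_pos(q_pos=16384, k_pos=0))   # 512 + (16384 - 512) // 4 = 4480
```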
RIP Latent Diffusion, Hello Hourglass Diffusion
gpt-4 latent-diffusion stable-diffusion meta-ai-fair openai hugging-face diffusion-models transformers image-generation model-efficiency fine-tuning quantization prompt-engineering roleplay training-optimization katherine-crowson lucidrains
Katherine Crowson, known for her work on Stable Diffusion, introduces the Hourglass Diffusion architecture: a hierarchical pure-transformer backbone for diffusion-based image generation that scales efficiently to megapixel resolutions with under 600 million parameters, improving on the original ~900M-parameter model. The architecture processes local and global image structure separately, improving efficiency and resolution without a latent stage. Additionally, Meta's Self-Rewarding LM paper has inspired lucidrains to begin an implementation. Discord summaries highlight GPT-4's robustness against quantization tricks, discussions of open-source GPT-0 alternatives, challenges with DPO training on limited VRAM with suggestions like QLoRA and rmsprop, and efforts to improve roleplay model consistency through fine-tuning and merging. Philosophical debates on AI sentience and GPT-4 customization for markdown and translation tasks were also noted.
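To make the hourglass idea concrete, here is a heavily simplified PyTorch sketch: tokens are processed at full resolution, merged down so that global attention becomes affordable, then upsampled with a skip connection. The real architecture uses neighborhood attention at the fine levels; full attention is used at both levels here purely to keep the toy short, and all module sizes are made up for illustration.

```python
import torch
import torch.nn as nn

class GlobalBlock(nn.Module):
    """Full self-attention over all tokens (cheap only at low resolution)."""
    def __init__(self, dim, heads=8):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                              # x: (B, N, C)
        h = self.norm(x)
        return x + self.attn(h, h, h, need_weights=False)[0]

class ToyHourglass(nn.Module):
    """Fine tokens -> downsample -> global attention -> upsample + skip."""
    def __init__(self, dim=256):
        super().__init__()
        self.fine = GlobalBlock(dim)                   # stand-in for local/window attention
        self.down = nn.Conv2d(dim, dim, kernel_size=2, stride=2)        # 2x token merge
        self.mid = GlobalBlock(dim)
        self.up = nn.ConvTranspose2d(dim, dim, kernel_size=2, stride=2)

    def forward(self, x):                              # x: (B, C, H, W) feature map
        B, C, H, W = x.shape
        t = self.fine(x.flatten(2).transpose(1, 2))    # attention at full resolution
        skip = t.transpose(1, 2).reshape(B, C, H, W)
        x = self.down(skip)                            # half resolution, 4x fewer tokens
        t = self.mid(x.flatten(2).transpose(1, 2))     # global attention is affordable here
        x = t.transpose(1, 2).reshape(B, C, H // 2, W // 2)
        return self.up(x) + skip                       # hourglass skip connection

print(ToyHourglass()(torch.randn(1, 256, 16, 16)).shape)  # torch.Size([1, 256, 16, 16])
```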
Sama says: GPT-5 soon
gpt-5 mixtral-7b gpt-3.5 gemini-pro gpt-4 llama-cpp openai codium thebloke amd hugging-face mixture-of-experts fine-tuning model-merging 8-bit-optimization gpu-acceleration performance-comparison command-line-ai vector-stores embeddings coding-capabilities sam-altman ilya-sutskever itamar andrej-karpathy
Sam Altman at Davos said his top priority is launching the new model, likely called GPT-5, while expressing uncertainty about Ilya Sutskever's employment status. Itamar from Codium introduced the concept of Flow Engineering with AlphaCodium, gaining attention from Andrej Karpathy. On the TheBloke Discord, engineers discussed a multi-specialty mixture-of-experts (MoE) model combining seven distinct 7-billion-parameter models specialized in areas such as law, finance, and medicine. Debates on 8-bit fine-tuning and using bitsandbytes with GPU support were prominent. Discussions also covered model merging using tools like Mergekit and compatibility with the Alpaca format. Interest in optimizing AI models on AMD hardware using the AOCL BLAS and LAPACK libraries with llama.cpp was noted. Users experimented with AI for command-line tasks, and the Mixtral MoE model was refined to surpass larger models in coding ability. Comparisons among LLMs such as GPT-3.5, Mixtral, Gemini Pro, and GPT-4 focused on knowledge depth, problem-solving, and speed, especially for coding tasks.
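For context on the 8-bit fine-tuning debate, here is a minimal sketch of loading a model in 8-bit via bitsandbytes through the transformers API; the checkpoint name is just an example, and in practice a LoRA/PEFT adapter would then be trained on top of the quantized weights.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Load a 7B model with bitsandbytes 8-bit quantization so it fits on a single
# consumer GPU; only small adapter weights (e.g. LoRA) are trained afterwards.
model_id = "mistralai/Mistral-7B-v0.1"   # example checkpoint
quant_cfg = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_cfg,
    device_map="auto",            # let accelerate place layers on available GPUs
    torch_dtype=torch.float16,
)
print(model.get_memory_footprint() / 1e9, "GB")
```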
1/10/2024: All the best papers for AI Engineers
chatgpt gpt-4 dall-e-3 stable-diffusion deepseek-moe openai deepseek-ai prompt-engineering model-release rate-limiting ethics image-generation moe collaborative-workspaces data-privacy abdubs darthgustav
OpenAI launched the GPT Store featuring over 3 million custom versions of ChatGPT accessible to Plus, Team, and Enterprise users, with weekly highlights of impactful GPTs like AllTrails. The new ChatGPT Team plan offers advanced models including GPT-4 and DALL·E 3, alongside collaborative tools and enhanced data privacy. Discussions around AI-generated imagery favored DALL·E and Stable Diffusion, while users faced rate-limit challenges and debated the GPT Store's SEO and categorization. Ethical considerations in prompt engineering were raised via "The Sieve," a three-layer ethical framework. Additionally, DeepSeek-MoE was noted for its range of Mixture of Experts (MoE) model sizes.
1/6-7/2024: LLaMA Pro - an alternative to PEFT/RAG??
llama-3 llama-3-1-1b llama-3-8-3b gpt-4 gpt-3.5 dall-e openai mistral-ai llamaindex langchain fine-tuning model-expansion token-limits privacy multilinguality image-generation security custom-models model-training yannic-kilcher
New research papers introduce promising Llama Extensions including TinyLlama, a compact 1.1B parameter model pretrained on about 1 trillion tokens for 3 epochs, and LLaMA Pro, an 8.3B parameter model expanding LLaMA2-7B with additional training on 80 billion tokens of code and math data. LLaMA Pro adds layers to avoid catastrophic forgetting and balances language and code tasks but faces scrutiny for not using newer models like Mistral or Qwen. Meanwhile, OpenAI Discord discussions reveal insights on GPT-4 token limits, privacy reassurances, fine-tuning for GPT-3.5, challenges with multi-language image recognition, custom GPT creation requiring ChatGPT Plus, and security concerns in GPT deployment. Users also share tips on dynamic image generation with DALL-E and logo creation.
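A minimal sketch of the block-expansion idea behind LLaMA Pro follows, assuming the Hugging Face Llama module layout (model.model.layers, self_attn.o_proj, mlp.down_proj) and an example checkpoint: copied decoder blocks are zero-initialized on their output projections so they start as identity mappings, the original blocks are frozen, and only the inserted blocks are trained on the new code/math data.

```python
import copy
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # example base
layers = model.model.layers                   # ModuleList of 32 decoder blocks
expand_every = 4                              # insert one new block per 4 original ones

new_layers = torch.nn.ModuleList()
for i, layer in enumerate(layers):
    layer.requires_grad_(False)               # freeze the pretrained blocks
    new_layers.append(layer)
    if (i + 1) % expand_every == 0:
        block = copy.deepcopy(layer)
        # Zero the output projections so the new block is an identity map at
        # init (its residual branch contributes nothing), avoiding any
        # immediate degradation of the pretrained model.
        torch.nn.init.zeros_(block.self_attn.o_proj.weight)
        torch.nn.init.zeros_(block.mlp.down_proj.weight)
        block.requires_grad_(True)            # only expanded blocks are trained
        new_layers.append(block)

model.model.layers = new_layers
model.config.num_hidden_layers = len(new_layers)
# (A real implementation would also fix per-layer indices for KV caching.)
print(f"expanded from {len(layers)} to {len(new_layers)} layers")
```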
12/23/2023: NeurIPS Best Papers of 2023
gpt-4 palm2 hermes-2.5 mistral-7b nous-research hugging-face apple context-length malware-security video-content music-content linear-layers api-access large-language-models embedding vector-databases model-merging model-interpretability striped-hyena-architecture quantization rmsnorm attention-mechanisms
The Latent Space Pod released a 3-hour recap of the best NeurIPS 2023 papers. The Nous Research AI Discord community discussed optimizing AI performance with shorter context lengths, malware security concerns linked to Hugging Face, and shared insights on video and music content. Technical discussions included the DYAD research paper proposing a faster alternative to linear layers, Apple's ML Ferret machine learning tool, and accessing PaLM 2 via API. The community also explored large language models with a focus on specialized models, data scaling, embedding/vector databases, model merging, and interpretability, with mentions of Hermes 2.5, GPT-4, and Mistral. Additionally, there were conversations on the Striped Hyena architecture, quantization challenges, and fixes related to RMSNorm and the "Attention is All You Need" paper.
12/22/2023: Anyscale's Benchmark Criticisms
gpt-4 gpt-3.5 bard anyscale openai microsoft benchmarking performance api prompt-engineering bug-tracking model-comparison productivity programming-languages storytelling
Anyscale launched their LLMPerf leaderboard to benchmark large language model inference performance, but it faced criticism for lacking detailed metrics like cost per token and throughput, and for comparing public LLM endpoints without accounting for batching and load. In OpenAI Discord discussions, users reported issues with Bard and preferred Microsoft Copilot for storytelling, noting fewer hallucinations. There was debate on the value of upgrading from GPT-3.5 to GPT-4, with many finding paid AI models worthwhile for coding productivity. Bugs and performance issues with OpenAI APIs were also highlighted, including slow responses and message limits. Future AI developments like GPT-6 and concerns about OpenAI's transparency and profitability were discussed. Prompt engineering for image generation was another active topic, emphasizing clear positive prompts and the desire for negative prompts.
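The metrics critics wanted are straightforward to compute per request; the sketch below (with made-up numbers) derives time to first token, inter-token latency, throughput, and cost per token from a single request trace.

```python
from dataclasses import dataclass

@dataclass
class RequestTrace:
    request_start: float        # seconds (monotonic clock)
    first_token_time: float
    end_time: float
    output_tokens: int
    price_per_1k_tokens: float  # endpoint's advertised output price, USD

def summarize(t: RequestTrace) -> dict:
    """Per-request metrics of the kind the LLMPerf critics asked for."""
    ttft = t.first_token_time - t.request_start
    gen_time = t.end_time - t.first_token_time
    return {
        "time_to_first_token_s": ttft,
        "inter_token_latency_ms": 1000 * gen_time / max(t.output_tokens - 1, 1),
        "throughput_tok_per_s": t.output_tokens / (t.end_time - t.request_start),
        "cost_usd": t.output_tokens / 1000 * t.price_per_1k_tokens,
    }

print(summarize(RequestTrace(0.0, 0.35, 4.1, 256, 0.002)))
```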
12/21/2023: The State of AI (according to LangChain)
mixtral gpt-4 chatgpt bard dall-e langchain openai perplexity-ai microsoft poe model-consistency model-behavior response-quality chatgpt-usage-limitations error-handling user-experience model-comparison hallucination-detection prompt-engineering creative-ai
LangChain published their first State of AI report, based on LangSmith usage stats, charting which models and tools hold the most mindshare. On OpenAI's Discord, users raised issues about the Mixtral model, noting inconsistencies and comparing it to Poe's Mixtral. There were reports of declining output quality and unpredictable behavior in GPT-4 and ChatGPT, with discussions on differences between Playground GPT-4 and ChatGPT GPT-4. Users also reported anomalous behavior in the Bing and Bard AI models, including hallucinations and strange assertions. Other user concerns included message limits on GPT-4, response-completion errors, chat lags, inaccessible voice settings, password-reset failures, 2FA issues, and subscription restrictions. Techniques for guiding GPT-4 outputs and creative uses of DALL-E were also discussed, along with financial constraints affecting subscriptions and questions about earning with ChatGPT and token costs.
12/20/2023: Project Obsidian - Multimodal Mistral 7B from Nous
gpt-4 gpt-3.5 dall-e-3 nous-research teknium openai multimodality image-detection security-api bias facial-recognition healthcare-ai gpu-optimization prompt-engineering vision
Project Obsidian is a multimodal model being trained publicly, tracked by Teknium on the Nous Discord. Discussions include 4M: Massively Multimodal Masked Modeling and Reason.dev, a TypeScript framework for LLM applications. The OpenAI Discord community discussed hardware specs for running TensorFlow JS for image detection, security API ideas for filtering inappropriate images, and concerns about racial and cultural bias in AI, especially in facial recognition and healthcare. Challenges with GPT-3.5 and GPT-4 in word puzzle games were noted, along with GPU recommendations prioritizing VRAM for AI inference. Users also debated GPT-4's vision capabilities, limitations of DALL·E 3, platform access issues, and prompting strategies for better outputs.
12/19/2023: Everybody Loves OpenRouter
gpt-4 gpt-3.5 mixtral-8x7b-instruct dolphin-2.0-mistral-7b gemini openai mistral-ai google hugging-face performance memory-management api prompt-engineering local-language-models translation censorship video-generation
OpenRouter offers an easy OpenAI-compatible proxy for Mixtral-8x7b-instruct. Discord discussions highlight GPT-4 performance and usability issues compared to GPT-3.5, including memory management and accessibility problems. Users debate local language models versus OpenAI API usage, with mentions of Dolphin 2.0 Mistral 7B and Google's video generation project. Prompt engineering and custom instructions for GPT models are also key topics. Concerns about censorship on models like Gemini and translation tool preferences such as DeepL were discussed.
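Concretely, "OpenAI-compatible" means the standard OpenAI SDK works once the base URL and key point at OpenRouter; the model slug and environment variable below are illustrative.

```python
import os
from openai import OpenAI

# The regular OpenAI client, pointed at OpenRouter's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

resp = client.chat.completions.create(
    model="mistralai/mixtral-8x7b-instruct",   # example OpenRouter model slug
    messages=[{"role": "user", "content": "Summarize RLHF in two sentences."}],
)
print(resp.choices[0].message.content)
```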
12/16/2023: ByteDance suspended by OpenAI
claude-2.1 gpt-4-turbo gemini-1.5-pro gpt-5 gpt-4.5 gpt-4 openai google-deepmind anthropic hardware gpu api-costs coding model-comparison subscription-issues payment-processing feature-confidentiality ai-art-generation organizational-productivity model-speculation
The OpenAI Discord community discussed hardware options like Mac racks and the A6000 GPU, highlighting their value for AI workloads. They compared Claude 2.1 and GPT-4 Turbo on coding tasks, with GPT-4 Turbo outperforming Claude 2.1. The benefits of the Bard API for Gemini Pro were noted, including a free quota of 60 queries per minute. Users shared experiences with ChatGPT Plus membership issues and payment problems, and speculated about the upcoming GPT-5 and the rumored GPT-4.5. Discussions also covered the confidentiality of the Alpha feature, AI art generation policies, and improvements in organizational work features. The community expressed mixed feelings about GPT-4's performance and awaited future model updates.
12/15/2023: Mixtral-Instruct beats Gemini Pro (and matches GPT3.5)
mixtral gemini-pro gpt-3.5 gpt-4.5 gpt-4 chatgpt lmsys openai deepseek cloudflare huggingface performance context-window prompt-engineering privacy local-gpu cloud-gpu code-generation model-comparison model-usage api-errors karpathy
Thanks to a Karpathy shoutout, LMSYS now has enough data to rank Mixtral and Gemini Pro. The discussion highlights the impressive performance of state-of-the-art open-source models like Mixtral that can run on laptops. In the OpenAI Discord, users compared AI tools like Perplexity and ChatGPT's browsing tool, favoring Perplexity for its superior data gathering, pricing, and usage limits. Interest was shown in AI's ability to convert large code files, with DeepSeek Coder recommended. Debates on the privacy implications of AI advancement and the challenges of running LLMs on local and cloud GPUs were prominent. Users reported issues with ChatGPT including performance problems, loss of access to custom GPTs, and unauthorized access. Discussions also covered prompt engineering for large context windows and speculation about future GPT-4.5 and GPT-4 developments.
12/14/2023: $1e7 for Superalignment
gemini bard gpt-4 gpt-4.5 llama-2 openai llamaindex perplexity-ai prompt-engineering api custom-gpt json bug-fixes chatbots performance tts code-generation image-recognition jan-leike patrick-collison
Jan Leike is launching a new $10M fast-grants initiative for superalignment research, inspired by Patrick Collison's Fast Grants. OpenAI introduced a new developers Twitter handle @OpenAIDevs for community updates. Discussions of Google's Gemini and Bard chatbots highlight their ability to read each other's instructions and offer unique coding solutions. Users reported various issues with GPT-4, including performance problems, customization difficulties, and a resolved bug in image recognition. There are ongoing conversations about prompt-engineering challenges and new JSON mode support in Convo-lang for API use. Concerns about misuse of chatbots for illegal activities and alternatives like Llama 2 models and the Perplexity chatbot were also discussed.
12/13/2023 SOLAR10.7B upstages Mistral7B?
solar-10.7b llama-2 mistral-7b phi-2 gpt-4 gemini upstage nous-research openai mistral-ai microsoft depth-up-scaling pretraining synthetic-data gpu-training api-usage model-integration agi asi chat-models vision model-performance fine-tuning
Upstage released the SOLAR-10.7B model, which uses a novel Depth Up-Scaling technique built on the Llama 2 architecture and initialized with Mistral 7B weights, followed by continued pretraining. The Nous community finds it promising but not exceptional. Additionally, weights for the Phi-2 base model were released, trained on 1.4 trillion tokens including synthetic texts created by GPT-3 and filtered by GPT-4, using 96 A100 GPUs over 14 days. On OpenAI's Discord, users discussed challenges with various GPT models, including incoherent outputs, API usage limitations, and issues with the GPT-4 Vision API. Conversations also covered understanding AGI and ASI, concerns about OpenAI's partnership with Axel Springer, and pricing changes for GPT Plus. Discussions included the Gemini chat model integrated into Bard and comparisons with GPT-4 performance.
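A rough sketch of the Depth Up-Scaling recipe follows, assuming the Hugging Face Llama/Mistral module layout and the paper's numbers (32-layer base, 8 layers dropped from each copy, 48 layers total); the checkpoint name is an example and the continued pretraining that heals the seam is omitted.

```python
import copy
import torch.nn as nn
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")  # example 32-layer base
layers = base.model.layers
n, m = len(layers), 8                       # 32 layers, drop 8 from each copy

# Depth Up-Scaling: [first n-m layers] + [last n-m layers] -> 2*(n-m) = 48 layers.
front = [copy.deepcopy(l) for l in layers[: n - m]]
back = [copy.deepcopy(l) for l in layers[m:]]

base.model.layers = nn.ModuleList(front + back)
base.config.num_hidden_layers = len(base.model.layers)
# (A real implementation would also fix per-layer cache indices before training.)
print(base.config.num_hidden_layers)        # 48 -- continued pretraining then recovers quality
```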
12/12/2023: Towards LangChain 0.1
mixtral-8x7b phi-2 gpt-3 chatgpt gpt-4 langchain mistral-ai anthropic openai microsoft mixture-of-experts information-leakage prompt-engineering oauth2 logo-generation education-ai gaming-ai api-access model-maintainability scalability
The Langchain rearchitecture has been completed, splitting the repo for better maintainability and scalability, while remaining backwards compatible. Mistral launched a new Discord community, and Anthropic is rumored to be raising another $3 billion. On the OpenAI Discord, discussions covered information leakage in AI training, mixture of experts (MoE) models like mixtral 8x7b, advanced prompt engineering techniques, and issues with ChatGPT performance and API access. Users also explored AI applications in logo generation, education, and gaming, and shared solutions for Oauth2 authentication problems. A new small language model named Phi-2 was mentioned from Microsoft.
12/11/2023: Mixtral beats GPT3.5 and Llama2-70B
mixtral-8x7b gpt-4 gpt-3.5-turbo llama-3 openhermes-2.5 llava-v1.5-13b-gptq mistral-ai openai huggingface sparse-mixture-of-experts fine-tuning quantization gpu-hardware transformers model-deployment open-source coding-datasets
Mistral AI announced the Mixtral 8x7B model featuring a Sparse Mixture of Experts (SMoE) architecture, sparking discussions on its potential to rival GPT-4. The community debated GPU hardware options for training and fine-tuning transformer models, including RTX 4070s, the A4500, RTX 3090s with NVLink, and A100 GPUs. Interest was expressed in fine-tuning Mixtral and producing quantized versions, alongside curating high-quality coding datasets. Resources shared include a YouTube video on open-source model deployment, an arXiv paper, GitHub repositories, and a blog post on Mixture-of-Experts. Discussions also touched on potential open-source releases of GPT-3.5 Turbo and Llama 3, and running OpenHermes 2.5 on a Mac M3 Pro with VRAM considerations.
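As a refresher on what a Sparse Mixture of Experts layer does, here is a toy top-2-of-8 router in PyTorch, in the spirit of Mixtral's FFN blocks but with made-up dimensions; per token, only the two selected experts run, which is why active compute per token is much smaller than the total parameter count.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToySparseMoE(nn.Module):
    """Minimal top-2-of-8 mixture-of-experts FFN sketch."""
    def __init__(self, dim=512, hidden=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        )

    def forward(self, x):                        # x: (tokens, dim)
        logits = self.router(x)                  # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # renormalize over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):              # only top_k experts run per token
            for e in range(len(self.experts)):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k, None] * self.experts[e](x[mask])
        return out

print(ToySparseMoE()(torch.randn(4, 512)).shape)  # torch.Size([4, 512])
```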
12/10/2023: not much happened today
mixtral-8x7b-32kseqlen mistral-7b stablelm-zephyr-3b openhermes-2.5-neural-chat-v3-3-slerp gpt-3.5 gpt-4 nous-research openai mistral-ai hugging-face ollama lm-studio fine-tuning mixture-of-experts model-benchmarking inference-optimization model-evaluation open-source decentralized-ai gpu-optimization community-engagement andrej-karpathy yann-lecun richard-blythman gabriel-syme pradeep1148 cyborg_1552
The Nous Research AI Discord community discussed attending NeurIPS and organizing future AI events in Australia. Highlights include interest in open-source and decentralized AI projects, with Richard Blythman seeking co-founders. Users shared projects like Photo GPT AI and introduced StableLM Zephyr 3B. The Mixtral model, based on Mistral, sparked debate on performance and GPU requirements, with comparisons to GPT-3.5 and potential competitiveness with GPT-4 after fine-tuning. Tools like TensorBoard, Wandb, and LlamaHub were noted for fine-tuning and evaluation. Discussions covered Mixture of Experts (MoE) architectures, fine-tuning with limited data, and inference-optimization strategies for ChatGPT. Memes and community interactions referenced AI figures like Andrej Karpathy and Yann LeCun. The community also shared resources such as GitHub links and YouTube videos related to these models and tools.
12/8/2023 - Mamba v Mistral v Hyena
mistral-8x7b-moe mamba-3b stripedhyena-7b claude-2.1 gemini gpt-4 dialogrpt-human-vs-machine cybertron-7b-v2-gguf falcon-180b mistral-ai togethercompute stanford anthropic google hugging-face mixture-of-experts attention-mechanisms prompt-engineering alignment image-training model-deployment gpu-requirements cpu-performance model-inference long-context model-evaluation open-source chatbots andrej-karpathy tri-dao maxwellandrews raddka
Three new AI models are highlighted: Mistral's 8x7B MoE model (Mixtral), Mamba models up to 3B by Together, and StripedHyena 7B, a competitive subquadratic attention model from Stanford's Hazy Research. Discussions on Anthropic's Claude 2.1 focus on its prompting technique and alignment challenges. The Gemini AI from Google is noted as potentially superior to GPT-4. The community also explores Dreambooth for image training and shares resources like the DialogRPT-human-vs-machine model on Hugging Face. Deployment challenges for large language models, including CPU performance and GPU requirements, are discussed with references to Falcon 180B and transformer batching techniques. User engagement includes meme sharing and humor.
12/7/2023: Anthropic says "skill issue"
claude-2.1 gpt-4 gpt-3.5 gemini-pro gemini-ultra gpt-4.5 chatgpt bingchat dall-e gpt-5 anthropic openai google prompt-engineering model-performance regulation language-model-performance image-generation audio-processing midi-sequence-analysis subscription-issues network-errors
Anthropic fixed a glitch in their Claude 2.1 model's needle-in-a-haystack test by adding a line to the prompt. Discussions on OpenAI's Discord compared Google's Gemini Pro and Gemini Ultra models with OpenAI's GPT-4 and GPT-3.5, with some users finding GPT-4 superior in benchmarks. Rumors about a GPT-4.5 release circulated without official confirmation. Concerns were raised about "selective censorship" affecting language model performance. The EU's potential regulation of AI, including ChatGPT, was highlighted. Users reported issues with ChatGPT Plus message limits and subscription upgrades, and shared experiences with BingChat and DALL-E. The community discussed prompt engineering techniques and future applications like image generation and MIDI sequence analysis, expressing hopes for GPT-5.
Is Google's Gemini... legit?
gemini gemini-pro gemini-ultra gpt-4 gpt-3.5 claude-2.1 palm2 google openai chain-of-thought context-windows prompt-engineering model-evaluation multimodality speech-processing chatbot-errors subscription-management swyx
Google's Gemini AI model is generating significant discussion and skepticism, especially regarding its 32-shot chain of thought MMLU claim and 32k context window. The community is comparing Gemini's performance and capabilities with OpenAI's GPT-4 and GPT-3.5, highlighting the upcoming Gemini Pro and Gemini Ultra models on the Bard platform. Users report various OpenAI service issues including chatbot errors and subscription problems. Discussions also cover prompt engineering techniques, AI model evaluation comparing GPT-4, Claude 2.1, and PaLM2, and improvements in speech and multimodal capabilities. The bot now supports reading and summarizing links from platforms like arXiv, Twitter, and YouTube, enhancing user interaction.