Topic: "model-benchmarking"

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

Meta Superintelligence Labs acquires Manus AI for over $2B, at $100M ARR, 9 months after launch

not much happened today

not much happened today

DeepSeek V3.2 & 3.2-Speciale: GPT5-High Open Weights, Context Management, Plans for Compute Scaling

MiniMax M2 230BA10B — 8% of Claude Sonnet's price, ~2x faster, new SOTA open model

not much happened today

not much happened today

OpenAI rolls out GPT-5 and GPT-5 Thinking to >1B users worldwide; -mini and -nano help claim Pareto Frontier

not much happened today

not much happened today

not much happened today

not much happened today

ChatGPT responds to GlazeGate + LMArena responds to Cohere

Grok 3 & 3-mini now API Available

OpenAI o3, o4-mini, and Codex CLI

QwQ-32B claims to match DeepSeek R1-671B

Google's Agent2Agent Protocol (A2A)

OpenAI adopts MCP

Gemma 3 beats DeepSeek V3 in Elo, 2.0 Flash beats GPT4o with Native Image Gen

not much happened today

Vision Everywhere: Apple AIMv2 and Jina CLIP v2

DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing

a calm before the storm

o1 destroys Lmsys Arena, Qwen 2.5, Kyutai Moshi release

a quiet weekend

Everybody shipped small things this holiday weekend

Execuhires: Tempting The Wrath of Khan

Rombach et al: FLUX.1 [pro|dev|schnell], $31m seed for Black Forest Labs

DataComp-LM: the best open-data 7B model/benchmark/dataset

Mozilla's AI Second Act

Talaria: Apple's new MLOps Superweapon

Not much happened today

Contextual Position Encoding (CoPE)

Not much happened today

Mergestral, Meta MTIAv2, Cohere Rerank 3, Google Infini-Attention

RWKV "Eagle" v5: Your move, Mamba

12/25/2023: Nous Hermes 2 Yi 34B for Christmas

12/10/2023: not much happened today