Topic: "model-architecture"

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

The Claude Code Source Leak

not much happened today

not much happened today

Claude Code Anniversary + Launches from: Qwen 3.5, Cursor Demos, Cognition Devin 2.2, Inception Mercury 2

Qwen3.5-397B-A17B: the smallest Open-Opus class, very efficient model

not much happened today

Apple picks Google's Gemini to power Siri's next generation

not much happened today

OpenAI GPT Image-1.5 claims to beat Nano Banana Pro, #1 across all Arenas, but completely fails Vibe Checks

OpenRouter's State of AI - An Empirical 100 Trillion Token Study

OpenAI fires back: GPT-5.1-Codex-Max (API) and GPT 5.1 Pro (ChatGPT)

not much happened today

MiniMax M2 230BA10B — 8% of Claude Sonnet's price, ~2x faster, new SOTA open model

DeepSeek-OCR finds vision models can decode 10x more efficiently with ~97% accuracy of text-only, 33/200k pages/day/A100

not much happened today

Grok 4 Fast: xAI's distilled, 40% more token efficient, 2M context, 344 tok/s frontier model

SoftBank, NVIDIA and US Govt take 2%, 5% and 10% of Intel, will develop Intel x86 RTX SoCs for consumer & datacenters

not much happened today

not much happened today

Databricks' $100B Series K

OpenAI rolls out GPT-5 and GPT-5 Thinking to >1B users worldwide; -mini and -nano help claim Pareto Frontier

OpenAI's gpt-oss 20B and 120B, Claude Opus 4.1, DeepMind Genie 3

not much happened today

not much happened today

SmolLM3: the SOTA 3B reasoning open source LLM

OpenAI releases Deep Research API (o3/o4-mini)

Zuck goes Superintelligence Founder Mode: $100M bonuses + $100M+ salaries + NFDG Buyout?

not much happened today

Gemini 2.5 Pro (06-05) launched at AI Engineer World's Fair

AI Engineer World's Fair Talks Day 1

Qwen 3: 0.6B to 235B MoE full+base models that beat R1 and o1

small news items

not much happened today

Olympus has dropped (aka, Amazon Nova Micro|Lite|Pro|Premier|Canvas|Reel)

Tencent's Hunyuan-Large claims to beat DeepSeek-V2 and Llama3-405B with LESS Data

OpenAI beats Anthropic to releasing Speculative Decoding

Not much technical happened today

Pixtral 12B: Mistral beats Llama to Multimodality

$1150m for SSI, Sakana, You.com + Claude 500K context

Test-Time Training, MobileLLM, Lilian Weng on Hallucination (Plus: Turbopuffer)

The Last Hurrah of Stable Diffusion?

HippoRAG: First, do know(ledge) Graph

OpenAI's PR Campaign?

Evals: The Next Generation

Music's Dall-E moment

Not much happened today

Jamba: Mixture of Architectures dethrones Mixtral

DBRX: Best open model (just not most efficient)

DeepMind SIMA: one AI, 9 games, 600 tasks, vision+language ONLY

1/4/2024: Jeff Bezos backs Perplexity's $520m Series B.