Topic: "context-windows"

Anthropic Claude Fable 5

Microsoft Build: MAI-Thinking-1 and MAI Family models, Surface RTX Spark Dev Box, and OpenClaw in Windows

not much happened today

Google I/O 2026: Gemini 3.5 Flash, Omni, and Google’s Agent Stack

GPT-Realtime-2, -Translate, and -Whisper: new SOTA realtime voice APIs

not much happened today

not much happened today

not much happened today

Claude Code Anniversary + Launches from: Qwen 3.5, Cursor Demos, Cognition Devin 2.2, Inception Mercury 2

not much happened today

Terminal-Bench 2.0 and Harbor

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

Kimi K2‑0905 and Qwen3‑Max preview: two 1T open weights models launched

nano-banana is Gemini‑2.5‑Flash‑Image, beating Flux Kontext by 170 Elo with SOTA Consistency, Editing, and Multi-Image Fusion

not much happened today

OpenAI rolls out GPT-5 and GPT-5 Thinking to >1B users worldwide; -mini and -nano help claim Pareto Frontier

Gemini 2.5 Deep Think finally ships

Figma's $50+b IPO

not much happened today

SmolLM3: the SOTA 3B reasoning open source LLM

not much happened today

not much happened today

Reasoning Price War 2: Mistral Magistral + o3's 80% price cut + o3-pro

not much happened today

Google's Agent2Agent Protocol (A2A)

DeepCoder: A Fully Open-Source 14B Coder at O3-mini Level

OpenAI adopts MCP

not much happened today

Cohere's Command A claims #3 open model spot (after DeepSeek and Gemma)

not much happened today

DeepSeek's Open Source Stack

not much happened today

GPT 4.5 — Chonky Orion ships!

lots of small launches

not much happened today

Gemini 2.0 Flash GA, with new Flash Lite, 2.0 Pro, and Flash Thinking

Bespoke-Stratos + Sky-T1: The Vicuna+Alpaca moment for reasoning

not much happened today

Perplexity starts Shopping for you

Did Nvidia's Nemotron 70B train on test?

Not much technical happened today

not much happened today

Llama 3.2: On-device 1B/3B, and Multimodal 11B/90B (with AI2 Molmo kicker)

Replit Agent - How did everybody beat Devin to market?

Ideogram 2 + Berkeley Function Calling Leaderboard V2

not much happened today

GPT4o August + 100% Structured Outputs for All (GPT4o August edition)

AlphaProof + AlphaGeometry2 reach 1 point short of IMO Gold

Mistral Large 2 + RIP Mistral 7B, 8x7B, 8x22B

Llama 3.1 Leaks: big bumps to 8B, minor bumps to 70b, and SOTA OSS 405b model

DataComp-LM: the best open-data 7B model/benchmark/dataset

Mini, Nemo, Turbo, Lite - Smol models go brrr (GPT4o-mini version)

Mini, Nemo, Turbo, Lite - Smol models go brrr (GPT4o version)

Qdrant's BM42: "Please don't trust us"

GraphRAG: The Marriage of Knowledge Graphs and RAG

Mozilla's AI Second Act

Nemotron-4-340B: NVIDIA's new large open models, built on syndata, great for syndata

Cursor reaches >1000 tok/s finetuning Llama3-70b for fast file editing

Not much happened today

OpenAI's Instruction Hierarchy for the LLM OS

Llama-3-70b is GPT-4-level Open Model

Cohere Command R+, Anthropic Claude Tool Use, OpenAI Finetuning

Evals-based AI Engineering

Jamba: Mixture of Architectures dethrones Mixtral

GPT4Turbo A/B Test: gpt-4-0125-preview

Is Google's Gemini... legit?