Company: "openai"

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

OpenAI launches GPT 5.6 Sol/Terra/Luna

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

GPT-Realtime-2, -Translate, and -Whisper: new SOTA realtime voice APIs

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

Anthropic's Claude Opus 4.7

not much happened today

not much happened today

not much happened today

not much happened today

Anthropic @ $30B ARR, Project GlassWing and Claude Mythos Preview — first model too dangerous to release since GPT-2

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

Autoresearch: Sparks of Recursive Self Improvement

not much happened today

GPT 5.4: SOTA Knowledge Work -and- Coding -and- CUA Model, OpenAI is so very back

not much happened today

not much happened today

not much happened today

OpenAI closes $110B raise from Amazon, NVIDIA, SoftBank in largest startup fundraise in history @ $840B post-money

Agentic Engineering: WTF Happened in December 2025?

Anthropic accuses DeepSeek, Moonshot, and MiniMax of "industrial-scale distillation attacks".

Claude Code Anniversary + Launches from: Qwen 3.5, Cursor Demos, Cognition Devin 2.2, Inception Mercury 2

Qwen3.5-397B-A17B: the smallest Open-Opus class, very efficient model

Qwen-Image 2.0 and Seedance 2.0

not much happened today

not much happened today

OpenAI and Anthropic go to war: Claude Opus 4.6 vs GPT 5.3 Codex

ElevenLabs $500m Series D at $11B, Cerebras $1B Series H at $23B, Vibe Coding -> Agentic Engineering

OpenAI Codex App: death of the VSCode fork, multitasking worktrees, Skills Automations

not much happened today

Anthropic launches the MCP Apps open spec, in Claude.ai

not much happened today

OpenEvidence, the ‘ChatGPT for doctors,’ raises $250m at $12B valuation, 12x from $1b last Feb

ChatGPT starts testing ads on free tier + new $8/mo Go plan in the US

Open Responses: explicit spec for OpenAI's Responses API supported by OpenRouter, Ollama, Huggingface, vLLM, et al

not much happened today

Apple picks Google's Gemini to power Siri's next generation

not much happened today

not much happened today

not much happened today

Nvidia buys (most of) Groq for $20B cash; largest execuhire ever

not much happened today

Claude Skills grows: Open Standard, Directory, Org Admin

OpenAI GPT Image-1.5 claims to beat Nano Banana Pro, #1 across all Arenas, but completely fails Vibe Checks

not much happened today

GPT-5.2 (Instant/Thinking/Pro): 74% on GDPVal, 1.4x cost of GPT 5.1, on 10 Year OpenAI Anniversary

not much happened today

MCP -> Agentic AI Foundation, Mistral Devstral 2

not much happened today

Nano Banana Pro (Gemini Image Pro) solves text-in-images, infographic generation, 2-4k resolution, and Google Search grounding

OpenAI fires back: GPT-5.1-Codex-Max (API) and GPT 5.1 Pro (ChatGPT)

xAI Grok 4.1: #1 in Text Arena, #1 in EQ-bench, and better Creative Writing

not much happened today

minor updates to GPT 5.1 and SIMA 2

GPT 5.1 in ChatGPT: No evals, but adaptive thinking and instruction following

not much happened today

not much happened today

not much happened today

not much happened today

Cursor 2.0 & Composer-1: Fast Models and New Agents UI

OpenAI completes Microsoft + For-profit restructuring + announces 2028 AI Researcher timeline + Platform / AI cloud product direction + next $1T of compute

not much happened today

not much happened today

ChatGPT Atlas: OpenAI's AI Browser

The Karpathy-Dwarkesh Interview delays AGI timelines

Claude Agent Skills - glorified AGENTS.md? or MCP killer?

OpenAI Titan XPU: 10GW of self-designed chips with Broadcom

not much happened today

Gemini 2.5 Computer Use preview beats Sonnet 4.5 and OAI CUA

OpenAI Dev Day: Apps SDK, AgentKit, Codex GA, GPT‑5 Pro and Sora 2 APIs

not much happened today

not much happened today

Thinking Machines' Tinker: LoRA based LLM fine-tuning API

Sora 2: new video+audio model and OpenAI's first Social Network

Anthropic Claude Sonnet 4.5, Claude Code 2.0, new VS Code Extensions

GDPVal finding: Claude Opus 4.1 within 95% of AGI (human experts in top 44 white collar jobs)

not much happened today

NVIDIA to invest $100B in OpenAI for 10GW of Vera Rubin rollout

not much happened today

not much happened today

GPT-5 Codex launch and OpenAI's quiet rise in Agentic Coding

not much happened today

Oracle jumps +36% in a day after winning $300B OpenAI contract

not much happened today

not much happened today

not much happened today

not much happened today

OpenAI Realtime API GA and new `gpt-realtime` model, 20% cheaper than 4o

OpenAI updates Codex, VSCode Extension that can sync tasks with Codex Cloud

nano-banana is Gemini‑2.5‑Flash‑Image, beating Flux Kontext by 170 Elo with SOTA Consistency, Editing, and Multi-Image Fusion

Databricks' $100B Series K

not much happened today

Western Open Models get Funding: Cohere $500m @ 6.8B, AI2 gets $152m NSF+NVIDIA grants

not much happened today

not much happened today

OpenAI's IMO Gold model also wins IOI Gold

not much happened today

OpenAI rolls out GPT-5 and GPT-5 Thinking to >1B users worldwide; -mini and -nano help claim Pareto Frontier

not much happened today

OpenAI's gpt-oss 20B and 120B, Claude Opus 4.1, DeepMind Genie 3

Qwen-Image: SOTA text rendering + 4o-imagegen-level Editing Open Weights MMDiT

Gemini 2.5 Deep Think finally ships

Figma's $50+b IPO

not much happened today

not much happened today

GLM-4.5: Deeper, Headier, & better than Kimi/Qwen/DeepSeek (SOTA China LLM?)

not much happened today

3x in 3 months: Cursor @ $28b, Cognition + Windsurf @ $10b

not much happened today

OAI and GDM announce IMO Gold-level results with natural language reasoning, no specialized training or tools, under human time limits

ChatGPT Agent: new o* model + unified Deep Research browser + Operator computer use + Code Interpreter terminal

not much happened today

not much happened today

not much happened today

SmolLM3: the SOTA 3B reasoning open source LLM

not much happened today

not much happened today

not much happened today

not much happened today

OpenAI releases Deep Research API (o3/o4-mini)

Context Engineering: Much More than Prompts

Not much happened today

minor ai followups: MultiAgents, Meta-SSI-Scale, Karpathy, AI Engineer

Zuck goes Superintelligence Founder Mode: $100M bonuses + $100M+ salaries + NFDG Buyout?

Chinese Models Launch - MiniMax-M1, Hailuo 2 "Kangaroo", Moonshot Kimi-Dev-72B

Execuhires Round 2: Scale-Meta, Lamini-AMD, and Instacart-OpenAI

Reasoning Price War 2: Mistral Magistral + o3's 80% price cut + o3-pro

Apple exposes Foundation Models API and... no new Siri

Gemini 2.5 Pro (06-05) launched at AI Engineer World's Fair

AI Engineer World's Fair Talks Day 1

not much happened today

not much happened today

Mistral's Agents API and the 2025 LLM OS

not much happened today

not much happened today

OpenAI buys Jony Ive's io for $6.5b, LMArena lands $100m seed from a16z

ChatGPT Codex, OpenAI's first cloud SWE agent

codex-1 openai-o3 codex-mini gemma-3 blip3-o qwen-2.5 marigold-iid deepseek-v3 lightlab gemini-2.0 lumina-next openai runway salesforce qwen deepseek google google-deepmind j1 software-engineering parallel-processing multimodality diffusion-models depth-estimation scaling-laws reinforcement-learning fine-tuning model-performance multi-turn-conversation reasoning audio-processing sama kevinweil omarsar0 iscienceluvr akhaliq osanseviero c_valenzuelab mervenoyann arankomatsuzaki jasonwei demishassabis philschmid swyx teortaxestex jaseweston

OpenAI launched Codex, a cloud-based software engineering agent powered by codex-1 (an optimized version of OpenAI o3). It is available in research preview for Pro, Enterprise, and Team ChatGPT users and can run tasks such as refactoring and bug fixing in parallel. The Codex CLI was enhanced with quick sign-in and a new low-latency model, codex-mini. Gemma 3 is highlighted as the best open model runnable on a single GPU. Runway released the Gen-4 References API for style transfer in generation. Salesforce introduced BLIP3-o, a unified multimodal model family using diffusion transformers for CLIP image features. The Qwen 2.5 models (1.5B and 3B versions) were integrated into the PocketPal app with various chat templates. Marigold IID, a new state-of-the-art open-source depth estimation model, was released. In research, DeepSeek shared insights on scaling and hardware for DeepSeek-V3. Google unveiled LightLab, a diffusion-based method for controlling light sources in images. Google DeepMind's AlphaEvolve uses Gemini 2.0 to discover new math and reduce costs without reinforcement learning. Omni-R1 studied audio's role in fine-tuning audio LLMs. Qwen proposed a parallel scaling law inspired by classifier-free guidance. Salesforce released Lumina-Next on the Qwen base, outperforming Janus-Pro. A study found that LLM performance degrades in multi-turn conversations due to unreliability. J1 incentivizes thinking in LLM-as-a-Judge models via reinforcement learning. A new Qwen study correlates question and strategy similarity to predict reasoning strategies.

Gemini's AlphaEvolve agent uses Gemini 2.0 to find new Math and cuts Gemini cost 1% — without RL

Granola launches team notes, while Notion launches meeting transcription

not much happened today

not much happened today

not much happened today

Cursor @ $9b, OpenAI Buys Windsurf @ $3b

not much happened today

ChatGPT responds to GlazeGate + LMArena responds to Cohere

Cognition's DeepWiki, a free encyclopedia of all GitHub repos

not much happened today

gpt-image-1 - ChatGPT's imagegen model, confusingly NOT 4o, now available in API

not much happened today; New email provider for AINews

Grok 3 & 3-mini now API Available

Gemini 2.5 Flash completes the total domination of the Pareto Frontier

OpenAI o3, o4-mini, and Codex CLI

QwQ-32B claims to match DeepSeek R1-671B

SOTA Video Gen: Veo 2 and Kling 2 are GA for developers

GPT 4.1: The New OpenAI Workhorse

not much happened today

not much happened today

Google's Agent2Agent Protocol (A2A)

not much happened today

not much happened today

not much happened today

>$41B raised today (OpenAI @ 300b, Cursor @ 9.5b, Etched @ 1.5b)

not much happened today

not much happened today

OpenAI adopts MCP

Gemini 2.5 Pro + 4o Native Image Gen

Promptable Prosody, SOTA ASR, and Semantic VAD: OpenAI revamps Voice AI

not much happened today

Gemma 3 beats DeepSeek V3 in Elo, 2.0 Flash beats GPT4o with Native Image Gen

The new OpenAI Agents Platform

not much happened today

DeepSeek's Open Source Stack

not much happened today

Anthropic's $61.5B Series E

not much happened today

GPT 4.5 — Chonky Orion ships!

lots of small launches

AI Engineer Summit Day 1

not much happened today

X.ai Grok 3 and Mira Murati's Thinking Machines

not much happened today

Reasoning Models are Near-Superhuman Coders (OpenAI IOI, Nvidia Kernels)

small news items

not much happened today

not much happened today

OpenAI takes on Gemini's Deep Research

o3-mini launches, OpenAI on "wrong side of history"

not much happened today

not much happened today

DeepSeek #1 on US App Store, Nvidia stock tanks -17%

TinyZero: Reproduce DeepSeek R1-Zero for $30

OpenAI launches Operator, its first Agent

Project Stargate: $500b datacenter (1.7% of US GDP) and Gemini 2 Flash Thinking 2

not much happened today

Titans: Learning to Memorize at Test Time

small little news items

Moondream 2025.1.9: Structured Text, Enhanced OCR, Gaze Detection in a 2B Model

not much happened today

not much happened today

PRIME: Process Reinforcement through Implicit Rewards

not much happened today

not much happened today

not much happened today

DeepSeek v3: 671B finegrained MoE trained for $5.5m USD of compute on 15T tokens

not much happened today

not much happened this weekend

o3 solves AIME, GPQA, Codeforces, makes 11 years of progress in ARC-AGI and 25% in FrontierMath

ModernBERT: small new Retriever/Classifier workhorse, 8k context, 2T tokens

Genesis: Generative Physics Engine for Robotics (o1-mini version)

Genesis: Generative Physics Engine for Robotics (o1-2024-12-17)

OpenAI Voice Mode Can See Now - After Gemini Does

o1 API, 4o/4o-mini in Realtime API + WebRTC, DPO Finetuning

Meta Apollo - Video Understanding up to 1 hour, SOTA Open Weights

Meta BLT: Tokenizer-free, Byte-level LLM

Google wakes up: Gemini 2.0 et al

ChatGPT Canvas GA

OpenAI Sora Turbo and Sora.com

Meta Llama 3.3: 405B/Nova Pro performance at 70B price

$200 ChatGPT Pro and o1-full/pro, with vision, without API, and mixed reviews

not much happened today

LMSys killed Model Versioning (gpt 4o 1120, gemini exp 1121)

Stripe lets Agents spend money with StripeAgentToolkit

Gemini (Experimental-1114) retakes #1 LLM rank with 1344 Elo

FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI

not much happened today

OpenAI beats Anthropic to releasing Speculative Decoding

not much happened today

The AI Search Wars Have Begun — SearchGPT, Gemini Grounding, and more

Creating a LLM-as-a-Judge

GitHub Copilot Strikes Back

not much happened this weekend

not much happened today

not much happened today

DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing

not much happened today

not much happened today

Not much (in AI) happened this weekend

not much happened today

The AI Nobel Prize

not much happened this weekend

Contextual Document Embeddings: `cde-small-v1`

Canvas: OpenAI's answer to Claude Artifacts

Not much technical happened today

OpenAI Realtime API and other Dev Day Goodies

Liquid Foundation Models: A New Transformers alternative + AINews Pod 2

not much happened today

ChatGPT Advanced Voice Mode

a calm before the storm

not much happened today

not much happened today

o1 destroys Lmsys Arena, Qwen 2.5, Kyutai Moshi release

nothing much happened today

a quiet weekend

Learnings from o1 AMA

o1: OpenAI's new general reasoning models

Pixtral 12B: Mistral beats Llama to Multimodality

AIPhone 16: the Visual Intelligence Phone

Everybody shipped small things this holiday weekend

not much happened today

Ideogram 2 + Berkeley Function Calling Leaderboard V2

not much happened today

The DSPy Roadmap

not much happened today

Grok 2! and ChatGPT-4o-latest confuses everybody

not much happened today

GPT4o August + 100% Structured Outputs for All (GPT4o August edition)

How Carlini Uses AI

Execuhires: Tempting The Wrath of Khan

Gemma 2 2B + Scope + Shield

Llama 3.1 Leaks: big bumps to 8B, minor bumps to 70b, and SOTA OSS 405b model

DataComp-LM: the best open-data 7B model/benchmark/dataset

Mini, Nemo, Turbo, Lite - Smol models go brrr (GPT4o-mini version)

Mini, Nemo, Turbo, Lite - Smol models go brrr (GPT4o version)

We Solved Hallucinations

FlashAttention 3, PaliGemma, OpenAI's 5 Levels to Superintelligence

Nothing much happened today

RouteLLM: RIP Martian? (Plus: AINews Structured Summaries update)

That GPT-4o Demo

Mozilla's AI Second Act

Claude Crushes Code - 92% HumanEval and Claude.ai Artifacts

Is this... OpenQ*?

Francois Chollet launches $1m ARC Prize

HippoRAG: First, do know(ledge) Graph

5 small news items

Not much happened today

Contextual Position Encoding (CoPE)

Somebody give Andrej some H100s already

Life after DPO (RewardBench)

Ten Commandments for Deploying Fine-Tuned Models

ALL of AI Engineering in One Place

Chameleon: Meta's (unreleased) GPT4o-like Omnimodal Model

Cursor reaches >1000 tok/s finetuning Llama3-70b for fast file editing

Not much happened today

GPT-4o: the new SOTA-EVERYTHING Frontier model (GPT4T version)

GPT-4o: the new SOTA-EVERYTHING Frontier model (GPT4O version)

Quis promptum ipso promptiet?

LMSys advances Llama 3 eval analysis

OpenAI's PR Campaign?

Kolmogorov-Arnold Networks: MLP killers or just spicy MLPs?

DeepSeek-V2 beats Mixtral 8x22B with >160 experts at HALF the cost

$100k to predict LMSYS human preferences in a Kaggle contest

Evals: The Next Generation

Not much happened today

Snowflake Arctic: Fully Open 10B+128x4B Dense-MoE Hybrid LLM

OpenAI's Instruction Hierarchy for the LLM OS

FineWeb: 15T Tokens, 12 years of CommonCrawl (deduped and filtered, you're welcome)

Lilian Weng on Video Diffusion

Zero to GPT in 1 Year

Gemini Pro and GPT4T Vision go GA on the same day by complete coincidence

Anime pfp anon eclipses $10k A::B prompting challenge

Cohere Command R+, Anthropic Claude Tool Use, OpenAI Finetuning

ReALM: Reference Resolution As Language Modeling

Not much happened today

AdamW -> AaronD?

Evals-based AI Engineering

DBRX: Best open model (just not most efficient)

Andrew likes Agents

The world's first fully autonomous AI Engineer

... and welcome AI Twitter!

Welcome Interconnects and OpenRouter

Mistral Large disappoints

Sora pushes SOTA

Gemini Ultra is out, to mixed reviews

MetaVoice & RIP Bard

Trust in GPTs at all time low

GPT4Turbo A/B Test: gpt-4-0125-preview

GPT4Turbo A/B Test: gpt-4-1106-preview

RIP Latent Diffusion, Hello Hourglass Diffusion

Sama says: GPT-5 soon

1/13-14/2024: Don't sleep on #prompt-engineering

1/12/2024: Anthropic coins Sleeper Agents

1/10/2024: All the best papers for AI Engineers

1/9/2024: Nous Research lands $5m for Open Source AI

1/8/2024: The Four Wars of the AI Stack

1/6-7/2024: LLaMA Pro - an alternative to PEFT/RAG??

1/2/2024: Smol tweaks to Smol Talk

1/1/2024: How to start with Open Source AI

12/29/2023: TinyLlama on the way

12/24/2023: Dolphin Mixtral 8x7b is wild

12/22/2023: Anyscale's Benchmark Criticisms

12/21/2023: The State of AI (according to LangChain)

12/20/2023: Project Obsidian - Multimodal Mistral 7B from Nous

12/19/2023: Everybody Loves OpenRouter

12/18/2023: Gaslighting Mistral for fun and profit

12/16/2023: ByteDance suspended by OpenAI

12/15/2023: Mixtral-Instruct beats Gemini Pro (and matches GPT3.5)

12/14/2023: $1e7 for Superalignment

12/13/2023: SOLAR10.7B upstages Mistral7B?

12/12/2023: Towards LangChain 0.1

12/11/2023: Mixtral beats GPT3.5 and Llama2-70B

12/10/2023: not much happened today

12/7/2023: Anthropic says "skill issue"

Is Google's Gemini... legit?