Company: "deepseek"

not much happened today

not much happened today

GLM 5.2: the top Frontend Coding model in the world, IndexShare reduces costs

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

Anthropic accuses DeepSeek, Moonshot, and MiniMax of "industrial-scale distillation attacks".

Qwen3.5-397B-A17B: the smallest Open-Opus class, very efficient model

not much happened today

Apple picks Google's Gemini to power Siri's next generation

not much happened today

OpenAI GPT Image-1.5 claims to beat Nano Banana Pro, #1 across all Arenas, but completely fails Vibe Checks

not much happened today

MCP -> Agentic AI Foundation, Mistral Devstral 2

OpenRouter's State of AI - An Empirical 100 Trillion Token Study

not much happened today

not much happened today

Anthropic Claude Sonnet 4.5, Claude Code 2.0, new VS Code Extensions

GDPVal finding: Claude Opus 4.1 within 95% of AGI (human experts in top 44 white collar jobs)

Qwen3-Next-80B-A3B-Base: Towards Ultimate Training & Inference Efficiency

not much happened today

Cohere Command A Reasoning beats GPT-OSS-120B and DeepSeek R1 0528

DeepSeek V3.1: 840B token continued pretrain, beating Claude 4 Sonnet at 11% of its cost

Databricks' $100B Series K

Kimi K2 - SOTA Open MoE proves that Muon can scale to 15T tokens/1T params

not much happened today

not much happened today

Gemini 2.5 Pro/Flash GA, 2.5 Flash-Lite in Preview

Chinese Models Launch - MiniMax-M1, Hailuo 2 "Kangaroo", Moonshot Kimi-Dev-72B

not much happened today

Mary Meeker is so back: BOND Capital AI Trends report

not much happened today

ChatGPT Codex, OpenAI's first cloud SWE agent

codex-1 openai-o3 codex-mini gemma-3 blip3-o qwen-2.5 marigold-iid deepseek-v3 lightlab gemini-2.0 lumina-next openai runway salesforce qwen deepseek google google-deepmind j1 software-engineering parallel-processing multimodality diffusion-models depth-estimation scaling-laws reinforcement-learning fine-tuning model-performance multi-turn-conversation reasoning audio-processing sama kevinweil omarsar0 iscienceluvr akhaliq osanseviero c_valenzuelab mervenoyann arankomatsuzaki jasonwei demishassabis philschmid swyx teortaxestex jaseweston

OpenAI launched Codex, a cloud-based software engineering agent powered by codex-1 (an optimized version of OpenAI o3) available in research preview for Pro, Enterprise, and Team ChatGPT users, featuring parallel task execution like refactoring and bug fixing. The Codex CLI was enhanced with quick sign-in and a new low-latency model, codex-mini. Gemma 3 is highlighted as the best open model runnable on a single GPU. Runway released the Gen-4 References API for style transfer in generation. Salesforce introduced BLIP3-o, a unified multimodal model family using diffusion transformers for CLIP image features. The Qwen 2.5 models (1.5B and 3B versions) were integrated into the PocketPal app with various chat templates. Marigold IID, a new state-of-the-art open-source depth estimation model, was released. In research, DeepSeek shared insights on scaling and hardware for DeepSeek-V3. Google unveiled LightLab, a diffusion-based light source control in images. Google DeepMind's AlphaEvolve uses Gemini 2.0 to discover new math and reduce costs without reinforcement learning. Omni-R1 studied audio's role in fine-tuning audio LLMs. Qwen proposed a parallel scaling law inspired by classifier-free guidance. Salesforce released Lumina-Next on the Qwen base, outperforming Janus-Pro. A study found LLM performance degrades in multi-turn conversations due to unreliability. J1 is incentivizing LLM-as-a-Judge thinking via reinforcement learning. A new Qwen study correlates question and strategy similarity to predict reasoning strategies.

not much happened today

not much happened today

Cursor @ $9b, OpenAI Buys Windsurf @ $3b

not much happened today

not much happened today

Qwen 3: 0.6B to 235B MoE full+base models that beat R1 and o1

not much happened today

Google's Agent2Agent Protocol (A2A)

not much happened today

not much happened today

not much happened today

>$41B raised today (OpenAI @ 300b, Cursor @ 9.5b, Etched @ 1.5b)

not much happened today

Halfmoon is Reve Image: a new SOTA Image Model from ex-Adobe/Stability trio

not much happened today

The new OpenAI Agents Platform

not much happened today

DeepSeek's Open Source Stack

Anthropic's $61.5B Series E

not much happened today

The Ultra-Scale Playbook: Training LLMs on GPU Clusters

not much happened today

not much happened today

s1: Simple test-time scaling (and Kyutai Hibiki)

How To Scale Your Model, by DeepMind

o3-mini launches, OpenAI on "wrong side of history"

Mistral Small 3 24B and Tulu 3 405B

not much happened today

not much happened today

DeepSeek #1 on US App Store, Nvidia stock tanks -17%

TinyZero: Reproduce DeepSeek R1-Zero for $30

Bespoke-Stratos + Sky-T1: The Vicuna+Alpaca moment for reasoning

DeepSeek R1: o1-level open weights model and a simple recipe for upgrading 1.5B models to Sonnet/4o level

not much happened today

not much happened today

PRIME: Process Reinforcement through Implicit Rewards

not much happened to end the year

not much happened today

not much happened today

Qwen with Questions: 32B open weights reasoning model nears o1 in GPQA/AIME/Math500

LMSys killed Model Versioning (gpt 4o 1120, gemini exp 1121)

DeepSeek-R1 claims to beat o1-preview AND will be open sourced

Common Corpus: 2T Open Tokens with Provenance

DeepSeek Janus and Meta SpiRit-LM: Decoupled Image and Expressive Voice Omnimodality

o1 destroys Lmsys Arena, Qwen 2.5, Kyutai Moshi release

DataComp-LM: the best open-data 7B model/benchmark/dataset

FlashAttention 3, PaliGemma, OpenAI's 5 Levels to Superintelligence

Mozilla's AI Second Act

Gemini Nano: 50-90% of Gemini Pro, <100ms inference, on device, in Chrome Canary

Gemini launches context caching... or does it?

Snowflake Arctic: Fully Open 10B+128x4B Dense-MoE Hybrid LLM

OpenAI's Instruction Hierarchy for the LLM OS

Ring Attention for >1M Context

Qwen 1.5 Released

Adept Fuyu-Heavy: Multimodal model for Agents

12/25/2023: Nous Hermes 2 Yi 34B for Christmas

12/15/2023: Mixtral-Instruct beats Gemini Pro (and matches GPT3.5)