Frozen AI News archive

OpenRouter's State of AI - An Empirical 100 Trillion Token Study

**OpenRouter** released its first survey of usage trends, drawn from the roughly 7 trillion tokens it proxies weekly, and noted that 52% of that usage is roleplay. **DeepSeek**'s share of open-model usage has declined sharply as coding usage rose. Reasoning models' share of token usage rose from near 0% to over 50%. **Grok Code Fast** shows high usage, while **Anthropic** leads in tool calling and coding requests with around 60% share. Input tokens quadrupled and output tokens tripled this year, driven mainly by programming use cases, which dominate spending and volume. Google launched **Gemini 3 Deep Think**, which features parallel thinking and scores 45.1% on the ARC-AGI-2 benchmark, and previewed **Titans**, a long-context neural memory architecture that scales beyond 2 million tokens. These advances were shared by **Google DeepMind** and **Google AI** on Twitter.

Data is all you need.

AI News for 12/3/2025-12/4/2025. We checked 12 subreddits, 544 Twitters and 24 Discords (205 channels, and 7543 messages) for you. Estimated reading time saved (at 200wpm): 563 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

OpenRouter's first survey is out, in web and pdf forms, and it is delightfully well done. Obviously OpenRouter has a bias (52% usage is for ahem "roleplay"), and there are other token consumers with higher volume, but OR is the ~only player that has this level of open data proxying 7T tokens per week.

Some picks:

DeepSeek's 50% open-model market share has plummeted

A stacked area chart showing the decline of DeepSeek's market share in open-source AI models over time, with increasing fragmentation an

mostly because coding rose and nobody uses DeepSeek for coding:

A stacked bar chart showing DeepSeek's most popular AI model usage categories over several weeks in 2025, with roleplay and casual

Reasoning models went from 0 to >50% usage

A line graph showing the increasing proportion of reasoning tokens used by AI models over time, rising from near 0% to over 50% by November

Grok Code Fast is weirdly high usage even excluding free promo:

A bar chart showing the top used AI models by token volume, with Grok Code Fast 1 leading, followed by Google's Gem

Anthropic dominates tool calling and coding:

A stacked bar chart showing the share of programming requests by different AI model providers over several weeks, with Anthropic dominating around 60%

A stacked bar chart showing the top 10 most used AI models with 'Tool-Call' finish reason across different months in 2

Input tokens 4xed, output tokens 3xed this year...

A graph showing the growth of prompt and completion tokens over time, illustrating the increasing complexity and length of AI model interactions.

... only because of programming use cases

Line graph showing average number of tokens per request across different domains, with programming (orange line) having the highest and most variable token count over time.

A stacked area chart showing the changing proportions of different AI model usage categories over time, with programming increasing from 11% to 50%

... which are at a sweet spot of spend and volume

A scatter plot showing log cost versus log usage for different AI workload categories like programming, technology, science, and translation, highlighting variations across mass-


AI Twitter Recap

Reasoning and Model Architecture: Gemini 3 Deep Think and Google’s “Titans”

Coding Models and Agent Harnesses

Video, Vision, and Generative Media

Agents, Scaffolds, and Reliability (what’s working in prod)

Evaluation, Measurement, and Trust

Org Moves and Ecosystem

Top tweets (by engagement)


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. Microsoft VibeVoice-Realtime Model Launch

2. Humorous Quant Legend Comparison

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Gemini 3 Deep Think Release and Benchmarks

2. Z-Image Prompting and Styles

3. AI's Impact on Tech Jobs and Society


AI Discord Recap

A summary of Summaries of Summaries by gpt-5.1

1. Frontier Coding Models, OpenRouter Trends, and IDE Integrations

2. Security, Jailbreaking, and Agent Execution Safety

3. GPU Systems, Quantization, and Kernel Competitions

4. New Optimization, Evaluation, and Research Directions

5. On‑Device, Small Models, and Agent/Tool Ecosystems


Discord: High level Discord summaries

LMArena Discord


BASI Jailbreaking Discord


Unsloth AI (Daniel Han) Discord


Cursor Community Discord


LM Studio Discord


Perplexity AI Discord


OpenAI Discord


OpenRouter Discord


Nous Research AI Discord


Moonshot AI (Kimi K-2) Discord


GPU MODE Discord


Latent Space Discord


Eleuther Discord


Yannick Kilcher Discord


HuggingFace Discord


Modular (Mojo 🔥) Discord


aider (Paul Gauthier) Discord


DSPy Discord


Manus.im Discord


tinygrad (George Hotz) Discord


MCP Contributors (Official) Discord


Windsurf Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.



Discord: Detailed by-Channel summaries and links

LMArena ▷ #general (1282 messages🔥🔥🔥):

Profitable AI Companies, LM Arena Prompt Limits, Gemini 3 Deepmind, Frame-Flow Model, OpenAI Models - Robin-High


LMArena ▷ #announcements (3 messages):

Search Arena Leaderboard, New Model in Text Arena, Text-to-Image Arena, Image Edit Arena, Seedream-4.5


BASI Jailbreaking ▷ #general (1240 messages🔥🔥🔥):

Grok Real?, Gemini vs Claude for malware, NSFW Gemini Web Jailbreak, GPT5.1 Jailbreak, AI Agent Attack Scenarios


BASI Jailbreaking ▷ #jailbreaking (69 messages🔥🔥):

Gemini Jailbreak, ENI JB on Gemini 2.5, WormGPT Origins, GPT5.1 Jailbreaking, Grok 4.1 Instructions


BASI Jailbreaking ▷ #redteaming (6 messages):

Jailbreaking zapgpt2, Finding SMTP servers


Unsloth AI (Daniel Han) ▷ #general (281 messages🔥🔥):

llama.cpp vulkan on Asahi Linux, Running LLMs on Phones via llama.cpp, Unsloth community, New Ministral 3 Reinforcement Learning notebook for Sudoku, Gemini 3 Pro vs Claude


Unsloth AI (Daniel Han) ▷ #off-topic (555 messages🔥🔥🔥):

Nvidia VRAM, Trainable SFX models, Levenshtein names, whisper language detector, Crete vacation


Unsloth AI (Daniel Han) ▷ #help (30 messages🔥):

Ollama crashes with Unsloth dynamic quant for Qwen3-VL, AttributeError in trainer.train() with custom classification task, Unsloth installation overwrites torch with CPU version, WSL2 setup for Windows 11


Unsloth AI (Daniel Han) ▷ #research (49 messages🔥):

ToS Violations and Model Training, Distillation impact, Model extractions, Model characteristics


Cursor Community ▷ #general (449 messages🔥🔥🔥):

Grok Code, Professional UIs, Cursor Agent, Platform UI Changes, Open Empathic


LM Studio ▷ #general (355 messages🔥🔥):

CachyOS desktop environment choices, Dual GPU issues with CachyOS, CachyOS telemetry, Qwen in LM Studio, GPT OSS


LM Studio ▷ #hardware-discussion (75 messages🔥🔥):

DDR4 Speed Viability, eBay Mac Studio Scams, Multi-PSU GPU Wiring, Intercepting LLM Requests Server, Triple GPU Bugginess


Perplexity AI ▷ #general (399 messages🔥🔥):

Comet Browser: Spyware Accusations, Perplexity Minecraft Server, Opus 4.5 availability and limits, Image generation limits, Perplexity and Prompt Engineering


Perplexity AI ▷ #sharing (1 message):

nike0656: https://www.perplexity.ai/search/5f87b568-aa15-4dd6-801a-786a6bedd45b


OpenAI ▷ #ai-discussions (264 messages🔥🔥):

Sora AI Availability in Europe, AI-generated Text Detection, OpenAI Discord Channel Guidelines, Model Preferences, Gemini 3 Pro vs. GPT-5.1 Thinking


OpenAI ▷ #gpt-4-discussions (4 messages):

Branches Command Suspected Glitch, Indifference Acknowledged


OpenAI ▷ #prompt-engineering (62 messages🔥🔥):

Repeatability in Prompt Engineering, Interaction-Level Stability, Agent-Style Prompts vs. Conversational Prompts, Vendor Substrate vs. User-Facing Side, Persistence of Induced Behaviors


OpenAI ▷ #api-discussions (62 messages🔥🔥):

Repeatability in Prompt Engineering, Interaction-Level Stability, Topological Prompt Templates, Vendor System Prompts, Functional Resets vs. Apparent Resets


OpenRouter ▷ #announcements (2 messages):

State of AI report, LLM usage analysis, Roleplay and creative interaction in AI, Coding as a Killer App for Paid Models, Rise of AI Agents


OpenRouter ▷ #app-showcase (2 messages):

Deep Chat, OpenRouter AI models


OpenRouter ▷ #general (274 messages🔥🔥):

Grok 4.1 fast, Cloudflare downtime, DeepSeek V3.2, LiteRouter OR wrapper for RP, OpenAI new model next week?


OpenRouter ▷ #discussion (11 messages🔥):

Anthropic acquires Bun, Claude code generation, Future acquisitions by Cursor, OAI vs Google


Nous Research AI ▷ #announcements (2 messages):

Hermes 4.3, ByteDance Seed 36B, Psyche network, Solana, Office Hours


Nous Research AI ▷ #general (202 messages🔥🔥):

Nous Hermes 4.3, Mistral-3 Hermes Fine-Tunes, Mixture of Experts (MoE) Support, Ollama GUI for Ubuntu, Roguelike AI


Nous Research AI ▷ #ask-about-llms (41 messages🔥):

3D Simulation Space in Godot, NLP economic simulation research, Langchain framework, AI tools, Bytedance Hermes model


Nous Research AI ▷ #research-papers (1 message):

teknium: https://x.com/rosinality/status/1996432241908752462?s=46


Moonshot AI (Kimi K-2) ▷ #general-chat (225 messages🔥🔥):

Deepseek V3.2, Kimi vs Deepseek, Kimi for Coding, Gemini vs Deepseek, Fun with LMs


GPU MODE ▷ #general (2 messages):

Nemotron Speed, Async RL MLsys papers


GPU MODE ▷ #cuda (1 message):

CUDA kernel optimization, Nsight Compute warnings, LDGSTS.E.BYPASS.LTC128B.128 instruction, cp.async instruction, Register usage and occupancy


GPU MODE ▷ #cool-links (2 messages):

Sparse Attention Mechanisms, Verified Sparse Attention, Programming Languages + Verification and ML


GPU MODE ▷ #jobs (1 message):

Workflow Automation, RAG Pipelines, AI Content Detection, Image AI, Voice AI


GPU MODE ▷ #pmpp-book (1 message):

vim410: i can help you get the book signed by WM 😄


GPU MODE ▷ #torchao (10 messages🔥):

Compilation Time, MoE Layer, filter_fn, MoEQuantConfig, FqnToConfig


GPU MODE ▷ #off-topic (4 messages):

MLSys Mentorship, ML4H Programs


GPU MODE ▷ #irl-meetup (1 message):

rbyrots: anyone in Austin TX? few events I'll be going to


GPU MODE ▷ #submissions (18 messages🔥):

nvfp4_gemm, NVIDIA performance, leaderboard submissions


GPU MODE ▷ #hardware (2 messages):

H100, SFCompute, Prime Intellect, B200 Pricing


GPU MODE ▷ #teenygrad (2 messages):

OpCode Refactoring, IR Implementation Progress, OpNode Improvements


GPU MODE ▷ #nvidia-competition (63 messages🔥🔥):

cuBLAS Version, FP4 Range and Inf Issues, LLM Cheating, NanoTrace and Triton Kernels, Submitting to nvfp4_gemm


GPU MODE ▷ #robotics-vla (2 messages):

Perturbation Experiment, VLA


Latent Space ▷ #ai-general-chat (66 messages🔥🔥):

Antithesis Series A led by Jane Street, Anthropic's Revenue, Tinyboxes and Tinygrad, Claude's coding performance issues, Harvey Bags $160M Series F


Latent Space ▷ #genmedia-creative-ai (12 messages🔥):

Kling Video 2.6, AI-image showdown, Microsoft VibeVoice


Eleuther ▷ #general (32 messages🔥):

SLMs for agents, Emergent Misalignment, Cloud et al subliminal learning paper, Local DeepSeek server errors, Training pipelines for small LMs on 16GB VRAM


Eleuther ▷ #research (33 messages🔥):

Shampoo, Adam Random Rotations, CFG Memorization, Attention Sinks


Eleuther ▷ #interpretability-general (9 messages🔥):

SHD CCP 01Constant Universal Communication Protocol, Interoperability via data patterns and recognition, LLMs for making visuals, General AI systems vs Language Models


Yannick Kilcher ▷ #general (27 messages🔥):

Brian Douglas's video on control theory, PID implementation for control theory projects, Airplane flap PID project, Application of control theory in research, DeepSeek article on control theory


Yannick Kilcher ▷ #ml-news (16 messages🔥):

AWS re:Invent 2025, AWS Agentic AI, Amazon Bedrock Nova Models, Nova Forge customization


HuggingFace ▷ #general (24 messages🔥):

Multi-GPU Setup, Mistral 3 Image Capabilities, AI-Generated Content on Social Media, DeepSeek v3.2 Model Implementation, Preventing Overthinking in LLMs


HuggingFace ▷ #today-im-learning (2 messages):

MOE architecture, Course Recommendations


HuggingFace ▷ #cool-finds (3 messages):

Stochastic Parrot, ODE Solver, Diffusion Model, Claude Reward Hacking, Context Collapse


HuggingFace ▷ #i-made-this (4 messages):

French Book Public Domain dataset, smallevals, STRAW (sample-tuned rank-augmented weights)


HuggingFace ▷ #reading-group (2 messages):

Perturbation-based attribution experiments, Deep vision models, Data Chunking Quality


HuggingFace ▷ #agents-course (2 messages):

Course Completion Guidance, Colab Notebook Issues


Modular (Mojo 🔥) ▷ #general (5 messages):

Community Meeting, YouTube Release Delay, Level Advancements


Modular (Mojo 🔥) ▷ #mojo (16 messages🔥):

codepoint_slices error handling, String handling differences, GPU constant memory usage, Gemini 3 Mojo understanding, Mojo stdlib proposal


aider (Paul Gauthier) ▷ #general (6 messages):

aider with distributed inference, Ollama timeout errors, llama.cpp API server


aider (Paul Gauthier) ▷ #questions-and-tips (2 messages):

aider --auto-test and --yes-always flags, Local LLMs on Mac and Aider on Fold 6


DSPy ▷ #show-and-tell (1 message):

MCP Apps SDK, Embeddable ChatGPT Apps


DSPy ▷ #papers (1 message):

Multi-Agent Systems, Latent Collaboration


DSPy ▷ #general (5 messages):

Student Models Subforum, Claude Code LM for DSPy, AI Engineer Introductions, Full Stack engineer


Manus.im Discord ▷ #general (6 messages):

Workflow Automation & LLM Integration, RAG Pipelines, AI Content Detection, Image AI, Voice AI


tinygrad (George Hotz) ▷ #general (3 messages):

train_step function, obs indexing performance, Variable vmin, RMSNorm parameter


tinygrad (George Hotz) ▷ #learn-tinygrad (2 messages):

tinygrad's master branch refactoring, axis_colors dict


MCP Contributors (Official) ▷ #general-wg (4 messages):

Tool Design Best Practices, UUIDs as Input, LLM creating UUIDs, list_items Tool, describe_item Tool


Windsurf ▷ #announcements (1 message):

GPT-5.1-Codex Max, Windsurf Update