Frozen AI News archive

OpenAI Titan XPU: 10GW of self-designed chips with Broadcom

**OpenAI** is finalizing a custom ASIC design with **Broadcom** to deploy **10GW** of inference compute, complementing existing deals with **NVIDIA** (10GW) and **AMD** (6GW). This marks a significant scale-up from OpenAI's current **2GW** of compute, on a roadmap toward **250GW** total, roughly half the average electricity consumption of the US. Greg Brockman of OpenAI highlights the shift of **ChatGPT** from interactive use to always-on ambient agents requiring massive compute, emphasizing the challenge of building chips for billions of users. The in-house ASIC effort was driven by the need for tailored designs after limited success influencing external chip startups; Broadcom's stock surged 10% on the news. Additionally, **InferenceMAX** reports improved ROCm stability and nuanced performance comparisons between AMD MI300X and NVIDIA H100/H200 on **llama-3-70b** FP8 workloads, alongside RL training infrastructure updates.

ASICs are all you need.

AI News for 10/10/2025-10/13/2025. We checked 12 subreddits, 544 Twitters and 23 Discords (197 channels, and 15120 messages) for you. Estimated reading time saved (at 200wpm): 1127 minutes. Our new website is now up with full metadata search and beautiful vibe-coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

There's been a lot of chip dealmaking by OpenAI recently to create "the biggest joint industrial project in human history":

and today the final shoe drops: as widely rumored and on schedule, after hiring TPU alums from Google, OpenAI is building 10GW of its own ASICs and systems designed specifically for its inference capacity (as Sam says on the OpenAI podcast).

To put this in scale, all of OpenAI has about 2GW of compute today, the majority of it spent on R&D:

and this is 12% of an overall roadmap to 250GW (roughly half the average electricity consumption of the United States)

Greg says ambient agents are a big part of the reason why inference demand will go up a lot:

But I think that we are heading to a world where AI intelligence is able to help humanity make new breakthroughs that just would not be possible otherwise.

And we're going to need just as much compute as possible to power that.

Like one example of something very concrete is that we are in a world now where ChatGPT is changing from something that you talk to interactively to something that can go do work for you behind the scenes.

If you've used features like Pulse: you wake up every morning, and it has some really interesting things that are related to what you're interested in. It's very personalized. And our intent is to turn ChatGPT into something that helps you achieve your goals.

The thing is, we can only release this to the pro tier because that's the amount of compute that we have available. And ideally, everyone would have an agent that's running for them 24-7 behind the scenes, helping them achieve their goals. And so ideally, everyone has their own accelerator, has their own compute power that's just running constantly.

And that means there's 10 billion humans.

We are nowhere near being able to build 10 billion chips.

And so there's a long way to go before we are able to saturate not just the demand, but what humanity really deserves.

Greg says they have been working on their ASIC for 18 months, and explains why they did this in-house:

"There were all sorts of chip startups with novel approaches that were very different from GPUs. And we started giving them a ton of feedback saying, here's where we think things are going. It needs to be models of this shape. And honestly, a lot of them just didn't listen to us, right? And so it's like very frustrating to be in this position where you say we see the direction the future should be going. We have no ability to really influence it besides sort of, you know, just like sort of trying to influence other people's roadmaps. And so by being able to take some of this in-house, we feel like we are able to actually realize that vision."

While nothing yet has been announced with Intel, it is surely not far behind given the clear interest in the American AI stack.

Broadcom's stock jumped 10% (+$150B) on today's news.


AI Twitter Recap

Chips, inference TCO, and training infra


Reasoning RL: hybrid rewards, label-free scaling, and new sequence models


Multimodal models: audio reasoning SOTA and video systems


Open-source training stacks and reproducible recipes


Benchmarks and evaluation advances


Product and platform updates


Top tweets (by engagement)


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. Chinese Open-Model Dominance and LLM Style-Collapse Debate

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Video Generation Models: Wan 2.2 FLF2V (-Ellary-) and Sora Mainstreaming in Spain

2. Unitree G1 V6.0 Humanoid Agility Demo and ChatGPT Simpsons-Style Outputs

3. Minimal-caption Meme/Reaction Images (He's absolutely right / Infinite loop / Hmm)


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.5 Pro Exp

Theme 1. New Models, Frameworks, and APIs Launch into the Stratosphere

Theme 2. Hardware Headaches and Performance Puzzles

Theme 3. Model Quirks, Copyright Clashes, and Critical Vulnerabilities

Theme 4. Developer Tooling Troubles and Community Connections

Theme 5. Decoding the Science Behind Smarter AI


Discord: High level Discord summaries

Perplexity AI Discord


LMArena Discord


OpenAI Discord


LM Studio Discord


Unsloth AI (Daniel Han) Discord


OpenRouter Discord


Cursor Community Discord


GPU MODE Discord


HuggingFace Discord


Eleuther Discord


Modular (Mojo 🔥) Discord


Nous Research AI Discord


Latent Space Discord


Manus.im Discord


Moonshot AI (Kimi K-2) Discord


Yannick Kilcher Discord


DSPy Discord


aider (Paul Gauthier) Discord


tinygrad (George Hotz) Discord


MCP Contributors (Official) Discord


MLOps @Chipro Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Windsurf Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.



Discord: Detailed by-Channel summaries and links

Perplexity AI ▷ #general (1266 messages🔥🔥🔥):

Perplexity Pro vs Gemini Ultra, Perplexity AI music industry acquisition, OpenAI and Google's Approach to AI


Perplexity AI ▷ #pplx-api (8 messages🔥):

Permission Denied Error, Cloudflare WAF protections, Anti-spam detectors


LMArena ▷ #general (1249 messages🔥🔥🔥):

GPT-5, Gemini 3, Claude 4.5, AI Model Performance, Comet Browser


OpenAI ▷ #ai-discussions (1084 messages🔥🔥🔥):

ChatGPT image generation, Sora 2 restrictions, AI and copyright infringement, The agony of Eros, Getting Sora codes


OpenAI ▷ #gpt-4-discussions (37 messages🔥):

MCP dev channel, ChatGPT solves crossword, Sora AI for Android, GPT realtime model training, custom gpt plus vs free


OpenAI ▷ #prompt-engineering (175 messages🔥🔥):

Sora Prompting, Context Poisoning, Psychological Safety, Quantum Superpositioning, Text-to-Video Prompt Tool


OpenAI ▷ #api-discussions (175 messages🔥🔥):

Sora Prompting, Context Poisoning, Psychological Safety in AI, Fnords and Prompt Engineering


OpenAI ▷ #api-projects (32 messages🔥):

Pronoun Debate, Project Sharing Interrupted, Leftist Accusations


LM Studio ▷ #general (736 messages🔥🔥🔥):

macOS Tahoe High GPU Usage, Copilot Integration in VSCode, LM Studio Context Amount Display Issue, NVIDIA K80 GPU Opinions, Vibe Coding


LM Studio ▷ #hardware-discussion (110 messages🔥🔥):

VRAM usage, Qwen3 performance, Flash attention, RAM pricing, Local LLM setup


Unsloth AI (Daniel Han) ▷ #general (415 messages🔥🔥🔥):

Model benchmaxxing techniques, trl.experimental update, Huggingface token 401 error, Duck Reasoning Puzzle, AI specific CVEs


Unsloth AI (Daniel Han) ▷ #introduce-yourself (3 messages):

AI developer introduction, Software Engineer introduction, Android phone finetuning


Unsloth AI (Daniel Han) ▷ #off-topic (132 messages🔥🔥):

GPU Parallelism, LLM self-hosting, GPTs and Tables, GPT Model Sizes and Capabilities, Fine-tuning for specific personas


Unsloth AI (Daniel Han) ▷ #help (209 messages🔥🔥):

AI Code Agent for test case generation, Gemma 1B fine-tuning speed, Qwen3-0.6B OOM issues with Llamafactory, Unsloth DGX Spark compatibility, Tokenizer issues after adding tokens and resizing


Unsloth AI (Daniel Han) ▷ #showcase (4 messages):

Qwen3-8B finetuning, Novel training data, Epoch quantity


Unsloth AI (Daniel Han) ▷ #research (62 messages🔥🔥):

Data Augmentation for gaming, HRM systems, GNN trained, Nemotron Math Human Reasoning


OpenRouter ▷ #app-showcase (2 messages):

AI Roleplay Site, Free Requests, OpenRouter Powered


OpenRouter ▷ #general (606 messages🔥🔥🔥):

Google reallocates servers, OpenRouter AI SDK, Chinese models are lenient, Deepseek 3.1 is censored, LayerFort is a scam


OpenRouter ▷ #new-models (1 messages):

Readybot.io: OpenRouter - New Models


OpenRouter ▷ #discussion (71 messages🔥🔥):

Qwen model releases, Groq's Performance, AI upscaling concerns, Gemini 3 Release, OpenRouter UI/UX feedback


Cursor Community ▷ #general (442 messages🔥🔥🔥):

Terminal tagging in chat, Cursor on mobile, AI coding feedback, Background Agents & costing per prompt, Max mode error


Cursor Community ▷ #background-agents (7 messages):

Linear Integration, Background Agents, Cursor Agent Shutdown, Cursor Not Responding


GPU MODE ▷ #general (35 messages🔥):

Nvidia 50 series profiling, CUDA repo contributions, Image embedding optimization, PTX ISA, Position encodings in LLMs


GPU MODE ▷ #triton (3 messages):

Triton Community Meetup, TLX Updates, Triton + PyTorch Symmetric Memory, Triton Flex Attention in PyTorch, Intra-thread Data Exchange Algorithm


GPU MODE ▷ #cuda (12 messages🔥):

CUDA Core Assignment, mbarrier Usage, DSMEM Synchronization


GPU MODE ▷ #torch (2 messages):

Torch Compiled Model Memory Leak, CUDA Memory Defragmentation


GPU MODE ▷ #announcements (1 messages):

llmq, quantized LLM training, CUDA


GPU MODE ▷ #cool-links (3 messages):

ATLAS LLM Inference, VectorDB in Go, Hybrid Retrieval Methods


GPU MODE ▷ #jobs (2 messages):

GPU Performance Engineer, Reinforcement Learning for Vision-Language Models


GPU MODE ▷ #beginner (26 messages🔥):

axpy.cu compilation error, libwb installation, GPU learning resources, GPU compiler optimizations vs ML


GPU MODE ▷ #jax (1 messages):

JAX, Pallas, GPU compute/comms overlap, NVLINK comms


GPU MODE ▷ #off-topic (3 messages):

ML Systems Breakfast, Stanford ML Meetup


GPU MODE ▷ #irl-meetup (1 messages):

Approval Request, Private Repos Sharing, Kernel Writing Basics


GPU MODE ▷ #rocm (2 messages):

Composable Kernel build failure, Missing header in composable kernel


GPU MODE ▷ #self-promotion (4 messages):

LRU, LFU, C4ML, TensorFlow Optimizers, Blockchains


GPU MODE ▷ #avx (1 messages):

Intel SDE, Intrinsics and immintrin_dbg.h


GPU MODE ▷ #thunderkittens (7 messages):

CUDA toolkit versions for Blackwell, GH200 Hopper machine, CUDA requirements changing, Compiling errors, Narrowing conversion errors


GPU MODE ▷ #reasoning-gym (1 messages):

Weights and Biases (wandb) Logs, GRPO policy loss clipping, Reasoning Gym


GPU MODE ▷ #submissions (130 messages🔥🔥):

MI300x8 Leaderboard Updates, amd-ag-gemm Performance, amd-all2all Performance, amd-gemm-rs Performance


GPU MODE ▷ #status (14 messages🔥):

Runner Timeouts, Deadline Extension Controversy, AMD's Node Limitations


GPU MODE ▷ #tpu (1 messages):

rybchuk: you need to do jax distributed init first


GPU MODE ▷ #factorio-learning-env (2 messages):

Factorio Crime Scene, Game Neglect Consequences


GPU MODE ▷ #amd-competition (105 messages🔥🔥):

Timeout Errors, GPU Memory Access Faults, Submission Queue Overload, Debugging Prints in Submissions, Stream Events for Measuring Kernel Time


GPU MODE ▷ #cutlass (14 messages🔥):

CuTeDSL caching, MoE Group GEMM, Group GEMV, Proton Viewer


GPU MODE ▷ #singularity-systems (4 messages):

picograd, SITP, tinygrad, autodiff, Triton kernels


GPU MODE ▷ #general (9 messages🔥):

VSCode Extension, GPU Mode Website Tutorials, Submitting Kernels, PMPP v2 Problem, Grayscale Submission


GPU MODE ▷ #multi-gpu (25 messages🔥):

VAE Training on Multi-GPU, Serving LLMs for Multiple Users, Nvidia 5090 features


GPU MODE ▷ #opencl-vulkan (5 messages):

Gallium3D compute driver on top of CUDA, Rusticl on Zink on NVK vs NVIDIA Proprietary OpenCL, Vulkan API, VK_KHR_shader_fma


GPU MODE ▷ #penny (1 messages):

vllm oneshot, small buffers, PR 2192


GPU MODE ▷ #llmq (12 messages🔥):

Weird Quantizations, LoRA Training, Model Implementation, LLM.q Talk


GPU MODE ▷ #helion (5 messages):

FLA Benchmark, GDN, Mamba2, PTC Talk, Backward Generation


HuggingFace ▷ #general (238 messages🔥🔥):

Open Source MoE, Fine-tuning Florence, Upscaling Images, LayerFort Spam, Hugging Face Refunds


HuggingFace ▷ #i-made-this (75 messages🔥🔥):

Declarative Web App Management, Serverless Agent Platform, Contrarian Research Model, TensorFlow Optimizers, AI Image Analysis


HuggingFace ▷ #computer-vision (4 messages):

Computer Vision Hangout Slides, AI Image Analysis Tool, GenAI Meetup in San Francisco


HuggingFace ▷ #smol-course (5 messages):

Hugging Face Jobs Errors, SmolAgents Tool Calling Agent Issues, Connecting Database Info to DeepSite, DPO Quiz Errors


HuggingFace ▷ #agents-course (3 messages):

AI Agents Certificates


Eleuther ▷ #general (70 messages🔥🔥):

Neural Theorem Proving Channel, AI Evaluation Strategies, Smaller, More Efficient Models, Smallest Model Definition, GPT-3 API Startups


Eleuther ▷ #research (136 messages🔥🔥):

Scalar RMSProp adaptivity, Anti-Scalar RMSProp, Mamba 3 Architecture, RWKV-7 comparison, Less is More Recursive Reasoning


Modular (Mojo 🔥) ▷ #general (19 messages🔥):

GPU Recompilation, MLIR blobs for drivers, Vulkan Bindings, AI-driven video streaming


Modular (Mojo 🔥) ▷ #announcements (1 messages):

October Community Meeting, FFT implementation in Mojo, MAX backend for PyTorch, Modular's 25.6 release, Unifying GPUs


Modular (Mojo 🔥) ▷ #mojo (136 messages🔥🔥):

ComplexSIMD constructor, FFTW port, Mojo on ARM, LayoutTensors pain points, Mojo tutorials


Modular (Mojo 🔥) ▷ #max (6 messages):

Bazel hackery in Modular tests, Testing acos() Max op


Nous Research AI ▷ #general (133 messages🔥🔥):

vllm predicted outputs, Sam Altman, MCP gateway by docker, AI evaluations, decentralized ai and its security


Nous Research AI ▷ #ask-about-llms (6 messages):

Graph Rag-like approach, Role-play book chunking, Wikipedia scratchpad, Gemini summarizes reply


Nous Research AI ▷ #research-papers (6 messages):

LoRA RL, Self-Adapting LLMs (SEAL), GRPO Algorithm, Weight Updates


Latent Space ▷ #ai-general-chat (120 messages🔥🔥):

Exa Search API v2.0, Unlimited Claude via GLM-4.6 Reverse-Engineering, Raindrop’s AI Agent A/B Testing, Base Models Reasoning Skills, RWKV-8 ROSA Architecture


Latent Space ▷ #private-agents (1 messages):

diogosnows: Appreciated <@1203156838409969675> 🙏


Latent Space ▷ #genmedia-creative-ai (12 messages🔥):

Nano Banana Soul Mark Debate, Nano-Banana Pencil & Ink AI Sketches, LinusEkenstam Nano Banana Ink-Sketch Prompt, AI-generated Watermarks, AI-generated Pencil Sketches


Manus.im Discord ▷ #general (111 messages🔥🔥):

Custom Domain for Manus, Manus Going Rogue, Manus API Validation Error, Manus Webhook Issue, Is Manus Hiring?


Moonshot AI (Kimi K-2) ▷ #general-chat (70 messages🔥🔥):

AI-generated anime, Groq's performance, OAuth integration for Moonshot, Model Benchmarking, Aspen's absence


Yannick Kilcher ▷ #general (46 messages🔥):

Graph Neural Networks, Hyperparameter Tuning, LR Scheduling, Context Windows, Embedding Swapping


Yannick Kilcher ▷ #paper-discussion (3 messages):

Segment Anything 3, New Arxiv Paper


Yannick Kilcher ▷ #agents (7 messages):

LLM Agents Course, Berkeley Webcast Subtitles, Federal Law Requirements


Yannick Kilcher ▷ #ml-news (10 messages🔥):

AI First Authorship Requirement, Copilot Vulnerability, ImageNet image generation, Prompt Injection


DSPy ▷ #show-and-tell (7 messages):

Newsletter service, Optimization of small LLMs (Gemma), DSPy optimizers (Bootstrap fewshot vs GEPA)


DSPy ▷ #papers (2 messages):

New Arxiv papers


DSPy ▷ #general (34 messages🔥):

Multi-Modal Models, Liquid Models, DSPy Boston Meetup, DSPy Bay Area Meetup, DSPy Toronto Meetup


aider (Paul Gauthier) ▷ #general (14 messages🔥):

Aider configuration for default prompt function, Aider Polyglot benchmark LLM evaluation trajectories, Discussion platform for Aider (GitHub discussions, Reddit forum), Exporting Aider settings to a file


aider (Paul Gauthier) ▷ #questions-and-tips (7 messages):

Aider .env file locations, Aider vs other CLI tools, Aider fixing bad code, Auto test config, OpenAI endpoint to ChatGPT


tinygrad (George Hotz) ▷ #general (17 messages🔥):

Python 3.11 Upgrade, TinyMesa CPU, TinyMesa Building on Mac, NVIDIA GPU on Mac, Meeting Cancellation


MCP Contributors (Official) ▷ #general (9 messages🔥):

Proxying REST API, LLM-ready APIs, MCP server packaging formats, MCPB repo, Cloudflare MCP


MCP Contributors (Official) ▷ #general-wg (1 messages):

jzhukovs: Does anyone know if Google AI studio supports MCP? Doesn’t look like it.


MLOps @Chipro ▷ #events (1 messages):

Diffusion Model Paper Reading Group, DDIM Paper Discussion, Diffusion & LLM Bootcamp