Frozen AI News archive

GPT-5 Codex launch and OpenAI's quiet rise in Agentic Coding

**OpenAI** released **GPT-5-Codex**, an agentic coding model optimized for long-running software engineering tasks, with dynamic task-adaptive thinking, multi-hour autonomy, and improved code quality. It achieves 51% accuracy on an unreleased large-refactor benchmark and integrates deeply with developer tools like Xcode. Meanwhile, **Alibaba** launched **Qwen3-Next-80B**, a hybrid MoE model with native long-context support (262k tokens, extensible to 1M+) targeting efficient reasoning and repository-scale code analysis, supported by **Together AI** and **NVIDIA** with CUDA-accelerated attention. The release underscores a broader trend toward hybrid SSM + MoE architectures, which trade dense compute for routed efficiency in both Chinese and US training regimes. Community discussions highlight the importance of variable compute and routing for inference efficiency and quality.
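The routing idea behind that efficiency point can be shown in a few lines: an MoE layer scores all experts per token but only runs the top-k, so per-token compute scales with k rather than the total expert count. The sketch below is illustrative only, not Qwen3-Next's actual router; the expert count and gate logits are made up.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of gate logits.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def top_k_route(gate_logits, k=2):
    """Pick the k highest-scoring experts and renormalize their weights.

    Only the chosen experts execute, so per-token FLOPs scale with k,
    not with the total expert count -- the tradeoff MoE models bank on.
    """
    probs = softmax(gate_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in chosen)
    return [(i, probs[i] / total) for i in chosen]

# One token's gate scores over 8 hypothetical experts:
routes = top_k_route([0.1, 2.0, -1.0, 0.5, 3.0, 0.0, 1.5, -0.5], k=2)
# routes pairs each selected expert index with its renormalized weight
```

Varying k per token (or per request) is one way "variable compute" shows up in practice: easy tokens can get away with fewer active experts.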


Codex is all you need?

AI News for 9/12/2025-9/15/2025. We checked 12 subreddits, 544 Twitters and 23 Discords (192 channels, and 11857 messages) for you. Estimated reading time saved (at 200wpm): 1016 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

Just like the quiet rise of Claude Code we covered in June, today's news wouldn't ordinarily qualify for a title story on its own. But a month of steadily building sentiment around GPT-5 and OpenAI's Codex (an answer to Claude Code, with considerably more breadth) is worth flagging, and today's release from OpenAI gives it extra juice. This is best covered in our sister publication. If you are a heavy Codex user, note also the pitfalls flagged in the Discord section.


AI Twitter Recap

OpenAI’s GPT-5-Codex and the agentic coding race

Qwen3‑Next 80B (A3B MoE), long-context, and the China efficiency push

Tooling for agents: MCP everywhere, Claude Code SDK, and workflow “vibe coding”

RL for reasoning and agents: online RL in product, deep research agents, and new training regimes

Multimodal and computer-use models

Systems and infra (throughput, routing, and deployment)

Top tweets (by engagement, AI/engineering)


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. DIY 8x AMD MI50/MI60 Rig + Open-Source Mobile Agent AndroidWorld #1

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

TO BE COMPLETED


AI Discord Recap

A summary of Summaries of Summaries by gpt-5

1. Agentic Coding Upgrades & Workflows

2. Datasets & Personalizable Speech

3. Model Ecosystem: Mobile, Norms, Deprecations

4. GPU Systems, Attention Kernels & Memory Models

5. Funding & Infra Debates


Discord: High level Discord summaries

Perplexity AI Discord


LMArena Discord


Unsloth AI (Daniel Han) Discord


OpenAI Discord


OpenRouter Discord


Cursor Community Discord


Eleuther Discord


GPU MODE Discord


LM Studio Discord


Moonshot AI (Kimi K-2) Discord


HuggingFace Discord


Yannick Kilcher Discord


Latent Space Discord


Nous Research AI Discord


DSPy Discord


Modular (Mojo 🔥) Discord


aider (Paul Gauthier) Discord


tinygrad (George Hotz) Discord


Manus.im Discord


MCP Contributors (Official) Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Windsurf Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.




Discord: Detailed by-Channel summaries and links

Perplexity AI ▷ #general (1166 messages🔥🔥🔥):

Exporting searches to PDF on iOS, Sonar AI performance, Grok Heavy worth, Perplexity Focus on AI search engines, GPT-5 Release


Perplexity AI ▷ #sharing (23 messages🔥):

Shareable Threads, Referral Links, Collections by Sameer


Perplexity AI ▷ #pplx-api (2 messages):

Sonar API, API Credits


LMArena ▷ #general (862 messages🔥🔥🔥):

MoE Models, RLHF, Grok's censorship, Taiwan censorship, LongCat model


LMArena ▷ #announcements (2 messages):

New Model: Seedream-4-high-res, LMArena User Preferences


Unsloth AI (Daniel Han) ▷ #general (1137 messages🔥🔥🔥):

Qwen3 performance, MLX support for Qwen3, LLama.cpp optimization, MobileLLM non commercial usages, LLM finetuning


Unsloth AI (Daniel Han) ▷ #introduce-yourself (2 messages):

Introductions, Baby Yoda memes


Unsloth AI (Daniel Han) ▷ #off-topic (560 messages🔥🔥🔥):

Google locks down AOSP, vLLM OOM, Qwen3-30B-A3B FP4, CSM Lora FT, LLaMa CPP


Unsloth AI (Daniel Han) ▷ #help (176 messages🔥🔥):

Model Merging with 16-bit Model, Qwen3 Lora Finetuning, Llama3.2 data augmentation, GPT-OSS GRPO native support, GGUF format conversion


Unsloth AI (Daniel Han) ▷ #showcase (49 messages🔥):

Embedding Gemma ONNX Quantization, Phi-4-Reasoning-Plus Unsloth on Replicate, NeuroPilot Education Platform, AI and Focus, OpenHelix Dataset Quality


Unsloth AI (Daniel Han) ▷ #research (29 messages🔥):

Synthetic Data in LLM Training, Gemma 3 Performance, AI Detection Reliability, MetaX C550 GPUs, Spiking Networks vs Transformers


OpenAI ▷ #announcements (2 messages):

Codex, GPT-5-Codex, AMA


OpenAI ▷ #ai-discussions (840 messages🔥🔥🔥):

OAI academy transcript tool, Qwen-code vs Qwen-coder, ChatGPT age calculation, AI and capitalism, AI and class structure


OpenAI ▷ #gpt-4-discussions (20 messages🔥):

Moral Reasoning in LLMs, GPT and Moral Frameworks, GPT builder revenue-share, Custom GPTs for Hugging Face


OpenAI ▷ #prompt-engineering (53 messages🔥):

Workflow use case variation, Prompt engineering using steps and semi-programming language, ElevenLabs conversation agents handle the system prompt, Breaking GPT5, Dynamic context


OpenAI ▷ #api-discussions (53 messages🔥):

Prompt Engineering Workflows, Vector Usage by LLMs, Dynamic System Prompts, Character Limit in Prompt Chat-box, Breaking GPT-5


OpenRouter ▷ #announcements (1 messages):

grok-2-1212, grok-2-vision-1212, grok-3, grok-4, model deprecation


OpenRouter ▷ #app-showcase (2 messages):

Agentic Automation, Model effectiveness, Overclock Work


OpenRouter ▷ #general (808 messages🔥🔥🔥):

Gemini 2.5 Pro Chat Issues, AI for Health Concerns, Skyrim Mod Error 401, Gemini API Free Daily Credits, OpenRouter Charges


OpenRouter ▷ #new-models (1 messages):

Readybot.io: OpenRouter - New Models


OpenRouter ▷ #discussion (16 messages🔥):

Unstable API, OpenRouter vs Alternatives, Providers Claiming OpenRouter Access, LLM Arena Oceanstone Speculation, ChatGPT Usage Privacy Analysis


Cursor Community ▷ #general (483 messages🔥🔥🔥):

Cursor's Linter Errors, GPT-5 Output Changes, Terminal Instances Hanging, Auto Mode Billing, OpenAI Platform UI Changes


Cursor Community ▷ #background-agents (1 messages):

Docker permissions for agent users, Manual VM setup


Eleuther ▷ #general (339 messages🔥🔥):

Low Bit Pythia, TinyStories Warmup, Muon Optimizer, RoPE analysis, MXFP4 Quantization


Eleuther ▷ #research (90 messages🔥🔥):

Gauss Lean code, scaling inference tokens, Fractured Entanglement, Neuron Lifespan


Eleuther ▷ #scaling-laws (2 messages):

New NN Architectures, New Chip Architectures, New MoE Architectures, PyTorch, Novel Infra


Eleuther ▷ #lm-thunderdome (29 messages🔥):

Model Calibration, AI Safety Concerns, Few-Shot Evaluation, BLIMP Benchmark Issue, Verifiable Rewards


GPU MODE ▷ #general (10 messages🔥):

Memory Bandwidth Bounds, CUDA Dynamic Parallelism, Valuable Training Data


GPU MODE ▷ #cuda (26 messages🔥):

PTX SASS Compilation, cuModuleLoadDataEx performance, Flash attention optimization


GPU MODE ▷ #torch (11 messages🔥):

Kernel Registration, Custom Ops, Torch Function Optimization, Ops Fusion for torch.compile


GPU MODE ▷ #cool-links (1 messages):

PTX, CUDA PTX Introduction


GPU MODE ▷ #jobs (5 messages):

AI Infra Startup Hiring, Red Hat AI Hiring, Zig for AI


GPU MODE ▷ #beginner (10 messages🔥):

CUDA, RAPIDS, CUDA-X, Batch Gradient Descent, Nvidia Jetson


GPU MODE ▷ #torchao (1 messages):

autoquant_v2, batch size 1, runtime errors, autotune stage, dtypes


GPU MODE ▷ #rocm (7 messages):

Iris Lecture, Symmetric Memory, RDMA support, iris.load/store, tl.load/store


GPU MODE ▷ #intel (18 messages🔥):

Intel CPU/GPU Optimizations, IPEX Deprecation, SGLang AMX Kernel, PyTorch integration


GPU MODE ▷ #metal (11 messages🔥):

Metal Flash Attention Bridge, Quantised Attention, Metal Command Buffer Timeout


GPU MODE ▷ #self-promotion (9 messages🔥):

LLM Negotiation Protocol, Metal Flash Attention Swift Adapter, Rust Bindings vs cv2, CuTe Partitions analysis, Gated Attention


GPU MODE ▷ #edge (2 messages):

Smallest model above GPT3.5, Quantization, VRAM requirements


GPU MODE ▷ #submissions (77 messages🔥🔥):

MI300x8 Leaderboard, Rank + 1 Trick, AMD Rules Clarification, all2all vs gemm+rs kernels, kernel dev


GPU MODE ▷ #status (11 messages🔥):

MI300x server status, Popcorn-cli timeout issues, Queue overload, Runner downtime, Cluster capacity issues


GPU MODE ▷ #factorio-learning-env (2 messages):

Eval Infra, PR Review


GPU MODE ▷ #amd-competition (55 messages🔥🔥):

Runner Queues and AMD Assistance, amd-gemm-rs Challenge Release, ROCm/iris Integration, PyTorch Version Compatibility, Clarification on amd-all2all


GPU MODE ▷ #cutlass (10 messages🔥):

CuTeDSL swizzle patterns, PTX docs discrepancies, TF32 datatype


GPU MODE ▷ #singularity-systems (11 messages🔥):

ML and Systems Book Design, GPU access limitations, Autograd machinery development, PicoTorch revitalization, Textbook to lower barriers for community


GPU MODE ▷ #general (8 messages🔥):

Kernel Development Path, GPU Mode Kernel Competition, Triton Benchmarks, BioML Trimul Kernel Competition


GPU MODE ▷ #multi-gpu (1 messages):



GPU MODE ▷ #low-bit-training (2 messages):

Video Models, Low-bit-training, GPU mode hackathon


GPU MODE ▷ #irl-accel-hackathon (37 messages🔥):

Multi modal inference, Training Optimisation, Gated DeltaNet, Sparse GNN ideas, Low-bit-training


LM Studio ▷ #general (180 messages🔥🔥):

Playwright MCP issues, Local Wikipedia Access for Small Models, Qwen/Qwen3-Next-80B-A3B-Instruct and llama.cpp, SIA-1: Self Improving AI Agent, lambda stack vs lm studio


LM Studio ▷ #hardware-discussion (113 messages🔥🔥):

KiCad and LLMs for Circuit Design, SBC for Searxng vs Obsidian, GPT-OSS-20B and VRAM Allocation, Nvidia P40 EOL, RTX 5070 and LLM Performance


Moonshot AI (Kimi K-2) ▷ #general-chat (265 messages🔥🔥):

Kimi vs GPT-5, Augment code extension, Kimi K2 Groq, interactive preview for LLM processes, API Keys vs Login Accounts


HuggingFace ▷ #general (115 messages🔥🔥):

HDF5 Python Library, FineWeb pretraining, Hugging Face Spaces storage, Qwen3-Next modeling and Layernorms, Models for open world RPG RP


HuggingFace ▷ #today-im-learning (6 messages):

Agents Course, smol course, MCP course, LoRA finetuning, Transformers architecture


HuggingFace ▷ #cool-finds (2 messages):

HF models, Fine-tuned models


HuggingFace ▷ #i-made-this (13 messages🔥):

Voxtral finetuning, Dialectical Agentic CrossSphere AI, Refrag Efficient LLM Compression, Image to Space


HuggingFace ▷ #computer-vision (2 messages):

Style Transfer, WCT2 Methods, Segmented Images


HuggingFace ▷ #NLP (3 messages):

Qwen2.5-72B fine-tuning, Database for Chat History, Maintaining User Sessions


HuggingFace ▷ #smol-course (9 messages🔥):

Fine-tuning course details, VRAM concerns for smaller models, In-person study group in NYC, Leaderboard evaluation for custom use cases, smol-course


HuggingFace ▷ #agents-course (11 messages🔥):

Agent course introductions, Token setting rookie mistake, Unit one introductions


Yannick Kilcher ▷ #general (85 messages🔥🔥):

Trade Unions and Fascism, LLMs and Bayesian Inference, AI and Topos Theory, Positional Encoding in Transformers, Deep Learning and Turbulent Flow


Yannick Kilcher ▷ #paper-discussion (19 messages🔥):

Spiking Brain-inspired Large Models, Anthropic's Research, OpenAI's Research, Decreasing Work-Related Usage, Noise Among Older Cohorts


Yannick Kilcher ▷ #agents (3 messages):



Yannick Kilcher ▷ #ml-news (15 messages🔥):

MobileLLM-R1-950M Release, AI Alignment, AI Constitutional Assembly, Cloud Providers Profiting, NVIDIA


Latent Space ▷ #ai-general-chat (101 messages🔥🔥):

MBA-ification of Startups, AI texting concierge poke.com, OpenAI Model Spec Update, Naveen Rao leaves Databricks, GPT-5 ‘High New’


Latent Space ▷ #genmedia-creative-ai (10 messages🔥):

Higgsfield $50M raise, Adobe value shift, GenZ AI Founders


Nous Research AI ▷ #general (74 messages🔥🔥):

Nepal Discord Election, MLC-LLM issues, sglang vs vllm, GPT-OSSH4 in claude code, Demis Hassabis


Nous Research AI ▷ #ask-about-llms (1 messages):

Adversarial Idea Presentation, Strength in Weakness


Nous Research AI ▷ #research-papers (5 messages):

OpenAI Economic Research, Anthropic Economic Index, ChatGPT usage growth, AI Friend mapping


Nous Research AI ▷ #interesting-links (2 messages):

DNS Tunneling Chat Client, AI Killing Focus




DSPy ▷ #show-and-tell (3 messages):

fastWorkflow beats Claude Opus 4.1, GEPA API Improvement, Tau Bench retail


DSPy ▷ #general (51 messages🔥):

GEPA for code generation, Manim and DSPy video, Rules as inputs for optimization, MCP Server, Zero Shot Categorization


DSPy ▷ #colbert (1 messages):

Contextual Chunking, ColBERT Models, Late Chunking, MaxSim Algorithm


Modular (Mojo 🔥) ▷ #general (18 messages🔥):

Mojo Package Managers, Binary vs Source Distribution, Pixi and Conda, Apple M1 Compatibility


Modular (Mojo 🔥) ▷ #mojo (33 messages🔥):

InlineList Removal, Small List Optimization, Allocator/Storage API, Mojo LSP Status, Network update


aider (Paul Gauthier) ▷ #general (36 messages🔥):

RepoMap for Aider, Free C# Models, AGI Predictions, LM Studio issues, GPT-5 Codex


aider (Paul Gauthier) ▷ #questions-and-tips (6 messages):

Ollama context window limits not respected, lm studio or llamafile suggestion, --watch-files implementation on Linux, Gemini issues with Aider


aider (Paul Gauthier) ▷ #links (1 messages):

Earning $100k in a week, Telegram scams


tinygrad (George Hotz) ▷ #general (25 messages🔥):

Tensor.assign return value, GEMM TFLOPs measurement, Winograd bounty lock, Rangeify bugs, CUDA 12.0 and sm_35


tinygrad (George Hotz) ▷ #learn-tinygrad (12 messages🔥):

GPU Utilization in tinygrad, VIZ=1 Profiler, NixOS Patch for CUDA, Profiler 404 error


Manus.im Discord ▷ #general (19 messages🔥):

Credits Rollover, Daily Credits Stopped, Clone Website using AI, Subscription Renewal Issues, Knowledge Limit Increase


MCP Contributors (Official) ▷ #general (10 messages🔥):

MCP Servers, Reinforcement Learning, Integration Testing, MCP Server Efficiency, NL Interface


MCP Contributors (Official) ▷ #general-wg (2 messages):

MCP Resource Integration with LLMs, Claude Desktop Automation, Discord Channel Restrictions