not much happened today

**OpenAI** continues small updates to **GPT-5**, introducing "Auto/Fast/Thinking" modes with **196k token context**, **3,000 messages/week**, and dynamic routing to cheaper models for cost efficiency. The **MiniMax AI Agent Challenge** offers **$150,000** in prizes for AI agent development by August 25. The community discusses **GPT-OSS-120B** base model extraction, hosting, and tooling improvements, including multi-tool pipelines and flex-attention. **Anthropic** announces model pairing in **Claude Code** with **Opus 4.1** for planning and **Sonnet 4** for execution, expanding context to **1M tokens** and introducing prompt caching. Key figures include *@sama*, *@jeremyphoward*, *@jxmnop*, and *@_catwu*.

a quiet day

AI News for 8/12/2025-8/13/2025. We checked 12 subreddits, 544 Twitters and 29 Discords (227 channels, and 8451 messages) for you. Estimated reading time saved (at 200wpm): 696 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

Small updates to GPT-5 continue (see Twitter Recap).

Since it's quiet, why not hack on an agent and compete for $150k in cash prizes with our friends at MiniMax (of MiniMax-M1 fame)?


🚀 $150,000 MiniMax AI Agent Challenge — Bring Your A-Game!


AI Twitter Recap

OpenAI GPT-5 product updates, routing economics, and evals

GPT‑OSS: base model extraction, hosting, and low‑level tooling

Anthropic: Opus‑plan/Sonnet‑execute, 1M context, prompt caching, Humanloop

DSPy 3.0 and the rise of prompt/black‑box optimizers

Open models, toolchains, and leaderboards (Qwen, GLM, qqWen, Kimi, Mistral)

Agents, evaluation, and infra debugging

Applied product launches: Perplexity Comet and Finance, plus multimodal video tools

Top tweets (by engagement)


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. Qwen Model Real-World Local Usage Reports

2. gpt-oss-120B Model Benchmarks and Limitations

3. Nano-banana Text-to-Image Model Launch

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. OpenAI GPT-5 & ChatGPT Model Picker and Feature Updates

2. Gemini and Wan 2.2 Model Launches and Usage Insights

3. AI Identity & Privacy: Faceseek and Facial Recognition Debate


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.5 Pro Exp

1. The GPT-5 Saga: New Features, Pricing Debates, and Performance Quirks

2. New Models on the Block: From Open Source Upstarts to Proprietary Powerhouses

3. The Developer's Toolkit: Frameworks, Libraries, and Persistent Memory

4. Under the Hood: GPU Performance, Hardware Bottlenecks, and Low-Level Hacks

5. Community Frontiers: API Woes, Research Debates, and User Frustrations


Discord: High level Discord summaries

Perplexity AI Discord


LMArena Discord


Unsloth AI (Daniel Han) Discord


OpenAI Discord


Cursor Community Discord


HuggingFace Discord


OpenRouter (Alex Atallah) Discord


LM Studio Discord


Latent Space Discord


GPU MODE Discord


Moonshot AI (Kimi K-2) Discord


Nous Research AI Discord


Eleuther Discord


Modular (Mojo 🔥) Discord


Yannic Kilcher Discord


DSPy Discord


Notebook LM Discord


aider (Paul Gauthier) Discord


Cohere Discord


LlamaIndex Discord


Manus.im Discord Discord


tinygrad (George Hotz) Discord


MCP (Glama) Discord


MLOps @Chipro Discord


LLM Agents (Berkeley MOOC) Discord


Nomic.ai (GPT4All) Discord


The Torchtune Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Codeium (Windsurf) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.



Discord: Detailed by-Channel summaries and links

Perplexity AI ▷ #announcements (1 message):

Comet Availability, Perplexity AI, US-Based Users


Perplexity AI ▷ #general (1351 messages🔥🔥🔥):

Comet Browser, Grok 4, AI Generated images, parameters of Models, Gemini vs GPT-5


Perplexity AI ▷ #sharing (4 messages):

Chrome perplex bid, AI/ML weekly, Comet projects, Spotify playlists


Perplexity AI ▷ #pplx-api (1 message):

web_search_options parameters


LMArena ▷ #general (1080 messages🔥🔥🔥):

GPT-5 performance, Nano Banana image model, Grok vs GPT-5, Vitamin D3 dosage, Gemini 3 release


LMArena ▷ #announcements (1 message):

July Contest, Contest Voting, Next Contest


Unsloth AI (Daniel Han) ▷ #general (867 messages🔥🔥🔥):

Local vs OSS models, lora importance, Mistral struggles, GGUF quants


Unsloth AI (Daniel Han) ▷ #introduce-yourself (5 messages):

greetings, channel spamming, server settings


Unsloth AI (Daniel Han) ▷ #off-topic (33 messages🔥):

Old CUDA Drivers, NVIDIA RTX 5090D, Hann Window, T4 ewaste, Base Model Extraction from LoRA


Unsloth AI (Daniel Han) ▷ #help (104 messages🔥🔥):

Llama-3.2-3B-Instruct finetuning error, Qwen3 4B Instruct model support, gradient_accumulation_steps>1, quantization of models, tool call JSON output


Unsloth AI (Daniel Han) ▷ #showcase (1 message):

New Reasoning Dataset, OpenHelix-R-100k


Unsloth AI (Daniel Han) ▷ #research (4 messages):

Transformer Architecture Diagrams, Synthetic Data Generation


OpenAI ▷ #ai-discussions (771 messages🔥🔥🔥):

GPT-5, Codex in ChatGPT, Google Drive in Plus, Legacy Models, GPT-5 limitations


OpenAI ▷ #gpt-4-discussions (14 messages🔥):

GPT-5 Auto Users, GPT-5 Temperature, GPT Chain Reset, Model Switching


OpenAI ▷ #prompt-engineering (22 messages🔥):

GPT command titles, Rule priority for AI, Unique tokens for attention, Positive vs. negative prompts, Permanent memory for ChatGPT


OpenAI ▷ #api-discussions (22 messages🔥):

GPT Prompting for Urgent Rules, Prompting for AI voice, GPT-5 Customization and Permanent Memories, LLM token attention, Positive Prompts over Negative Prompts


Cursor Community ▷ #general (887 messages🔥🔥🔥):

GPT-5 pricing and availability, Cursor pricing changes, 1M context window Claude Sonnet, Using Cursor CLI vs IDE, Alternative AI tools


Cursor Community ▷ #background-agents (6 messages):

Background Agents, Monorepo Setup, Docker Build, API Access


HuggingFace ▷ #general (356 messages🔥🔥):

Llama.cpp and CUDA, Gemini Flash Lite for video, Hugging Face Server Tag, AI Ethics in AMA, Xet and HF Hub Integration


HuggingFace ▷ #today-im-learning (2 messages):

Fastai deep learning course, Train model on diffusers


HuggingFace ▷ #cool-finds (1 message):

tonic_1: https://snwy.substack.com/p/building-a-bigger-qwen-out-of-two


HuggingFace ▷ #i-made-this (11 messages🔥):

Track-Tonic advice, TalkT-pro model, Prioritized Experience Replay, Personal finance dataset, GPT's Byte Pair Encoding


HuggingFace ▷ #smol-course (1 message):

Smolagent code, Code agentic approach


OpenRouter (Alex Atallah) ▷ #app-showcase (3 messages):

The Last RAG (TLRAG), NoChain Orchestrator, Statelessness & Digital Amnesia, Persistent Identity, Token Costs


OpenRouter (Alex Atallah) ▷ #general (262 messages🔥🔥):

Sonnet update, tool use and structured output with open source models, GPT-5 performance, Gemini 3 as a disappointment, OpenRouter Image resizing


OpenRouter (Alex Atallah) ▷ #new-models (1 message):

Readybot.io: OpenRouter - New Models


OpenRouter (Alex Atallah) ▷ #discussion (8 messages🔥):

GPU Rentals, AI TLD issues, Chatroom Caching


LM Studio ▷ #general (166 messages🔥🔥):

Context Length, RTX 6000 Pro, DGX Spark, LM Studio on Lenovo Legion Go, LM Studio and RDP


LM Studio ▷ #hardware-discussion (79 messages🔥🔥):

LMStudio GPU Usage, RTX 3050 Configuration, CUDA vs Vulkan Runtimes, MoE Model Performance, AMD iGPU Optimization


Latent Space ▷ #ai-general-chat (131 messages🔥🔥):

Azure/AWS vs Startups Benchmark Degradation, Fireworks account suspension, Mistral Medium 3.1 Release, GPT-OSS-20B Base Model Extraction, Cobot Beta Launch


GPU MODE ▷ #general (36 messages🔥):

llama.cpp, 0xc0000409 exception, llama_model_load_from_file, CUDA backend, STATUS_STACK_BUFFER_OVERRUN


GPU MODE ▷ #cuda (5 messages):

cuda_fp6.h, cuda_fp4.h, cuda math API, AOT compile Triton kernel, Rust inference engines


GPU MODE ▷ #torch (12 messages🔥):

DTensor, FSDP regressions, autograd issues, full_tensor tracking, linear cross-entropy


GPU MODE ▷ #beginner (5 messages):

CUDA/C++ files, submission bot, vectorsum_v2, github reference kernels


GPU MODE ▷ #triton-puzzles (5 messages):

Triton Puzzle Notebook issues, tritonviz incompatibility


GPU MODE ▷ #tilelang (1 message):

hariprasathvinayagam: <@424952602556497920> no tilelang now focuses on low level optimization


GPU MODE ▷ #arm (4 messages):

GitHub Issue on Pytorch, gh200 bug, Thor, ARCH_NATIVE=1


GPU MODE ▷ #self-promotion (3 messages):

Prioritized Experience Replay with PPO, ProteinBERT Optimization with Triton, Hierarchical Layouts Intuition


GPU MODE ▷ #submissions (1 message):

A100, Leaderboard Results, Trimul Benchmark


GPU MODE ▷ #factorio-learning-env (20 messages🔥):

LuaPlayer Initialization Warning, RCON Client Version, TCP Port Hardcoding in FactorioInstance, FLE's ABC Base Classes, Multiagent and Gym PR


GPU MODE ▷ #cutlass (10 messages🔥):

CuteDSL vs Triton, CUTLASS performance, sgemm_sm80.cu example optimization, block level swizzle


GPU MODE ▷ #singularity-systems (1 message):

Lattices (dataflow solvers), Graphs (control flow graphs and interference graphs), Generic infrastructure implementation


Moonshot AI (Kimi K-2) ▷ #general-chat (96 messages🔥🔥):

GPT UI, GPT-5 Pro worth it?, GPT-5, OpenAI going bankrupt?, Qwen vs GLM


Nous Research AI ▷ #general (33 messages🔥):

GLM-4.5-Air, Unsloth Dynamic 2.0 GGUF quants, Qwen3-30B-A3B-Thinking-2507, Lyria, Unitree Droid


Nous Research AI ▷ #ask-about-llms (4 messages):

LLM repetition, data quality, RLHF to fix repetition


Nous Research AI ▷ #research-papers (26 messages🔥):

Qwen3-4B-Thinking-2507, Jan-v1-4B, Menlo Research, Lucy model, Agentic web search


Nous Research AI ▷ #research-papers (26 messages🔥):

Hermes Impartation, Qwen3 Model, Menlo Research, Lucy Model, Dynamic Task Vector Machine


Eleuther ▷ #announcements (1 message):

Multilingual Representation Learning Workshop, Physical Commonsense Reasoning Benchmark


Eleuther ▷ #general (6 messages):

PINN and GNN, Small <8b English text base model, TinyLlama-1.1B


Eleuther ▷ #research (25 messages🔥):

Fourier Extension of RoPE, VO-RoPE, Learnable Dimensionality


Eleuther ▷ #interpretability-general (12 messages🔥):

RLHF for Auto-Interp, SOAR team RLHF, Delphi Hard Negatives, Reasoning Models for Auto-Interp, Tool Calling for Investigation


Eleuther ▷ #lm-thunderdome (35 messages🔥):

Harness dataset pulling, Belebele dataset subsets, Adding internal tasks


Modular (Mojo 🔥) ▷ #general (1 message):

Mojo-regex optimizations, Apple GPU support


Modular (Mojo 🔥) ▷ #mojo (4 messages):

End-to-end Mojo, IO Model similar to Mojo, Type system features


Modular (Mojo 🔥) ▷ #max (69 messages🔥🔥):

torch.compile backend=MaxCompiler, Apple Metal Integration, Max Graph Optimization, Kyutai Research Lab, ComfyUI


Yannic Kilcher ▷ #general (42 messages🔥):

PSO Guarantees, Francois Chollet AGI Timeline, Yannic AGI Timeline, LLM API Batching, MoE Scheduling


Yannic Kilcher ▷ #ml-news (7 messages):

CANN Support, Matrix Game Engine, Nvidia H20 AI Chip, Skyreels based on WAN, MistralAI


DSPy ▷ #general (35 messages🔥):

DSPy 3.0 Release, MLflow 3.0 Integration, Multi-Modal Support, Reasoning Models


Notebook LM ▷ #use-cases (7 messages):

NotebookLM, Video transcript


Notebook LM ▷ #general (25 messages🔥):

PDF Upload Issues, Discord Spam, Emoji Customization, JSON to DOCX Conversion, Duplicate Content in Sources


aider (Paul Gauthier) ▷ #general (30 messages🔥):

Gemini API issues, Deepinfra provider for Gemini, Mistral 3.1 release, Native tool calling settings


Cohere ▷ #🧵-general-thread (15 messages🔥):

Audio Embeddings, AI workflows in n8n, Web connector in playground


Cohere ▷ #📣-announcements (1 message):

Cohere Labs Scholars Program, ML Research, Information Session


Cohere ▷ #👋-introduce-yourself (3 messages):

AI/LLM Evaluation, AI Policy and Governance


LlamaIndex ▷ #blog (3 messages):

LlamaCloud, AstraDB, SkySQL, Hallucination-free SQL generation, TypeScript SDK


LlamaIndex ▷ #general (12 messages🔥):

Llama Index Self-Hosting Docs, Acquiring a paid license for Llama Index, RAG Dev Problem Map, Missing GPT-5 Model


Manus.im Discord ▷ #general (11 messages🔥):

Manus Wide Research, Raise Tickets for Support, OPPO unlock, Manus Deployment Issues


tinygrad (George Hotz) ▷ #general (5 messages):

FSDP Implementation, Contributing to tinygrad, define_reg Pull Requests


tinygrad (George Hotz) ▷ #learn-tinygrad (3 messages):

Subtensor realization, CUDA_ERROR_UNSUPPORTED_PTX_VERSION, tinygrad CUDA support, Cached kernel issues


MCP (Glama) ▷ #general (2 messages):

Claude Desktop, Bun command


MCP (Glama) ▷ #showcase (4 messages):

Kratos MCP release, AI Agents with MCP book release, MCP Harness usage


MLOps @Chipro ▷ #events (5 messages):

System Prompt Reading, Claude vs. Claude Code prompts, Guardrail approaches, Prompt Engineering


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (2 messages):

Certificate Disapproval, Anonymous Feedback


Nomic.ai (GPT4All) ▷ #general (1 message):

Strix Halo, HP Z2 Mini