Frozen AI News archive

GLM-4.5: Deeper, Headier, & better than Kimi/Qwen/DeepSeek (SOTA China LLM?)

**Z.ai** (Zhipu AI) released the **GLM-4.5-355B-A32B** and **GLM-4.5-Air-106B-A12B** open-weights models, claiming state-of-the-art performance competitive with **Claude 4 Opus**, **Grok 4**, and OpenAI's **o3**. The release emphasizes token efficiency and efficient reinforcement learning training, and it validates the Muon optimizer at scale. **Alibaba Qwen** introduced **Group Sequence Policy Optimization (GSPO)**, a new reinforcement learning algorithm that powers the **Qwen3** model suite and has been integrated into Hugging Face's TRL library. Speculation surrounds the mystery models "summit" and "zenith", which may be **GPT-5** variants based on the **GPT-4.1** architecture. **Qwen3-Coder** shows strong coding benchmark results, rivaling **Claude Sonnet 4** and **Kimi K2**. The rise of powerful Chinese open-source models like **GLM-4.5**, **Wan-2.2**, and **Qwen3-Coder** contrasts with a slowdown from Western labs such as **OpenAI**.

Muon is all you need?

AI News for 7/25/2025-7/28/2025. We checked 9 subreddits, 449 Twitters and 29 Discords (227 channels, and 16798 messages) for you. Estimated reading time saved (at 200wpm): 1388 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

A banner day for Chinese open-weights AI. The generative media types should definitely take a look at Wan 2.2, but most AI Engineers should be apprised of Z.ai's (better known as Zhipu, one of the AI Tigers) GLM-4.5-355B-A32B and GLM-4.5-Air-106B-A12B, released today. They make a VERY strong claim (to be independently verified) of being not only the strongest open-weights model (beating the previous SOTA, Kimi K2) but also highly competitive with, and often better than, heavyweight SOTA models like Claude 4 Opus, Grok 4, and OpenAI's o3.

Beyond the table-stakes benchmarks a model needs to be considered frontier, Z.ai also commendably emphasizes new measurements that matter greatly for agentic use, including token efficiency (perhaps the hardest metric of all).

No paper yet, but the blog post offers some interesting details on architecture choice and efficient RL training. GLM-4.5 is the second large model this month to validate the Muon optimizer at scale.
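For readers unfamiliar with the model names above: they appear to follow the common mixture-of-experts pattern `<total>B-A<active>B`, where the first figure is total parameters and the `A` figure is parameters active per token. The sketch below assumes that reading of the names. `parse_moe_name` and `active_fraction` are hypothetical helpers written for illustration, not part of any Z.ai tooling.

```python
import re


def parse_moe_name(name: str) -> tuple[float, float]:
    """Return (total_billions, active_billions) from a name ending in '<total>B-A<active>B'.

    Hypothetical helper: assumes the suffix convention described in the lead-in.
    """
    match = re.search(r"-(\d+(?:\.\d+)?)B-A(\d+(?:\.\d+)?)B$", name)
    if match is None:
        raise ValueError(f"no '<total>B-A<active>B' suffix in {name!r}")
    return float(match.group(1)), float(match.group(2))


def active_fraction(name: str) -> float:
    """Share of parameters active per token, under the same naming assumption."""
    total, active = parse_moe_name(name)
    return active / total


if __name__ == "__main__":
    for model in ("GLM-4.5-355B-A32B", "GLM-4.5-Air-106B-A12B"):
        total, active = parse_moe_name(model)
        print(f"{model}: {active:g}B of {total:g}B active ({active_fraction(model):.1%})")
```

Under this reading, both models activate only a small share of their weights per token, which is worth keeping in mind when comparing them against dense models.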


AI Twitter Recap

New Model Releases & Performance

AI Agents & Agentic Workflows

Video & Multimodal Generation

Infrastructure, Tooling & Efficiency

New AI Techniques & Research

Industry Trends & Commentary

Humor & Memes


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. GLM-4.5 Announcements, Launches, and Collections

2. Wan 2.2 Open Video Generation Model Releases and Benchmarks

3. Specialized LLM Launches for Niche Applications (UI, Instruct, Edge Devices)

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Wan2.2 Video Model Release, Benchmarks, and Community Tests

2. OpenAI GPT-5 Model Leap, Performance, and Impact Discussions

3. Claude Code, Agents, and Plugin Ecosystem: Community Tools and Rate Limit Policies


AI Discord Recap

A summary of Summaries of Summaries by X.ai Grok-4

Theme 1: Model Mayhem: New Releases Battle for Supremacy

Theme 2: Fine-Tuning Fiascos and Optimizer Overhauls

Theme 3: Agent Antics: Protocols, Payments, and Security Shenanigans

Theme 4: Hardware Havoc: GPUs Grapple with AI Demands

Theme 5: Benchmark Brawls and Evaluation Exposés


Discord: High level Discord summaries

Unsloth AI (Daniel Han) Discord


LMArena Discord


OpenAI Discord


OpenRouter (Alex Atallah) Discord


Moonshot AI (Kimi K-2) Discord


Cursor Community Discord


LM Studio Discord


Eleuther Discord


HuggingFace Discord


Latent Space Discord


Modular (Mojo 🔥) Discord


GPU MODE Discord


Nous Research AI Discord


Yannic Kilcher Discord


Manus.im Discord


DSPy Discord


aider (Paul Gauthier) Discord


Notebook LM Discord


LlamaIndex Discord


MCP (Glama) Discord


tinygrad (George Hotz) Discord


Cohere Discord


LLM Agents (Berkeley MOOC) Discord


Nomic.ai (GPT4All) Discord


Torchtune Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Codeium (Windsurf) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


Discord: Detailed by-Channel summaries and links

Unsloth AI (Daniel Han) ▷ #general (1157 messages🔥🔥🔥):

Liquid LFM2 models, Muting audio in Python, Detecting start and end of audio wave, Doctors recommending more salt and carbs, Side effects of long term AI usage on the brain


Unsloth AI (Daniel Han) ▷ #introduce-yourself (3 messages):

Self-hosting, Homelab, vllm, Ollama / llama.cpp, Schema enforcement


Unsloth AI (Daniel Han) ▷ #off-topic (1 message):

jamessmith1990526: Hi ,Roland


Unsloth AI (Daniel Han) ▷ #help (284 messages🔥🔥):

Gemma 3 fine-tuning errors, SFT Dataset Format, Training Time for Gemma 3 12b, Qwen3 model, GGUF Conversion


Unsloth AI (Daniel Han) ▷ #showcase (10 messages🔥):

Japanese TTS Release, Geminized Qwen3 Model, Gemma 3 4b Finetune with GRPO, Finetuning Gemma 3 4b instruct


Unsloth AI (Daniel Han) ▷ #research (54 messages🔥):

Transformers vs Unsloth, Gemma 3/3n with GRPO, HRM Model, Video-Language Finetuning, LLM Quantization


Unsloth AI (Daniel Han) ▷ #unsloth-bot (141 messages🔥🔥):

Best models under 1B parameters, Qwen model release, Unsloth soft prompt tuning, Model selection for fine-tuning, Training limits on Gemma 3


LMArena ▷ #general (1209 messages🔥🔥🔥):

GPT-5 Speculation, LM Arena Model Testing, Model Evaluation and Benchmarking, Open Source Models and Alternatives, Apple's AI Strategy


LMArena ▷ #announcements (1 message):

GLM-4.5, GLM-4.5 Air


OpenAI ▷ #ai-discussions (838 messages🔥🔥🔥):

Image generation, Color bias in AI, GPT image generation quality, Mind Uploading, AI's role in mental health


OpenAI ▷ #gpt-4-discussions (20 messages🔥):

GPT-4o Coding Performance, Zenith Model, GPT-5 Speculation, GPT @mentions bug


OpenAI ▷ #prompt-engineering (50 messages🔥):

emotional structuring through prompts, clarity vs ambiguity in prompts, core of prompt engineering, anti-sycophancy custom instructions, training the model


OpenAI ▷ #api-discussions (50 messages🔥):

Prompt Engineering, Training the model, Custom Instructions, Model Memories, Blog Post writing with ChatGPT


OpenRouter (Alex Atallah) ▷ #announcements (1 message):

toven: Chutes and Targon are experiencing downtime. Users are reporting a spike in 502s


OpenRouter (Alex Atallah) ▷ #app-showcase (31 messages🔥):

Ramparts security scanner, Model Context Protocol (MCP), Tool interface vulnerabilities, t3.chat sync DB, Cloudflare R2 storage


OpenRouter (Alex Atallah) ▷ #general (1031 messages🔥🔥🔥):

OpenRouter Rate Limits, NSFW Content with Bots, Alternative Models to Deepseek, Slenderman and Creepypasta Bots, Payment Issues on OpenRouter


OpenRouter (Alex Atallah) ▷ #new-models (2 messages):


OpenRouter (Alex Atallah) ▷ #discussion (35 messages🔥):

Wandb Inference vs OpenRouter, Compute Exchange for Spare GPUs, Payment Processing Complaints, Power User Tool for OR APIs, Chutes Pricing and Reliability


Moonshot AI (Kimi K-2) ▷ #general-chat (925 messages🔥🔥🔥):

Kimi K2, Gemini, Claude, Open Source Models, Agentic coding tools


Cursor Community ▷ #general (546 messages🔥🔥🔥):

Cursor Auto Mode, Cursor's new pricing, Claude Code integration with Cursor, Qwen3 Coder vs Claude Sonnet 4, Cursor performance issues


Cursor Community ▷ #background-agents (5 messages):

Background Agents Bugs, Git push fails, Support quality, lack of remote connection


LM Studio ▷ #general (293 messages🔥🔥):

Qwen3-Coder New Models, LM Studio Plugins, LM Studio Updates on Remote Box, GPU recommendation, LLM for career advice


LM Studio ▷ #hardware-discussion (157 messages🔥🔥):

GPU for LLMs, AMD vs Nvidia for AI, Expandable GPU Memory, Laptop Failure Rates, eGPUs over USB4


Eleuther ▷ #general (132 messages🔥🔥):

SOAR program competitiveness, Intent-aware semantic search, ACL conference, Open-source AI in extreme environments


Eleuther ▷ #research (117 messages🔥🔥):

KV Cache Distillation, LLM as a Judge, Credit Assignment with LLMs, NeurIPS Rebuttals, RoPE


Eleuther ▷ #interpretability-general (4 messages):

LLM Security Tutorial, Recommended Reading


Eleuther ▷ #lm-thunderdome (30 messages🔥):

Llama-3 eval harness configuration, SQuAD v1 vs v2, F1 score calculation in SQuAD


Eleuther ▷ #gpt-neox-dev (9 messages🔥):

Two-Level Checkpointing, Async Checkpointing, GPT-NeoX Training Framework, TokenSmith


HuggingFace ▷ #general (179 messages🔥🔥):

GPU Recommendations for FOSS AI, SYCL vs CUDA for AI Development, Qwen3-Thinking model requirements, Integrating HF API with Open WebUI and LiteLLM, ChatGPT Experience Verification


HuggingFace ▷ #cool-finds (2 messages):

Dark Knowledge


HuggingFace ▷ #i-made-this (8 messages🔥):

TinyVision, SamosaGPT, Experimental Ultra Low-Parameter Models, Serverless Agent Platform, Byte-Vision


HuggingFace ▷ #reading-group (4 messages):

Weighted Colored Graphs, Topological Data Analysis, Graph Spectral Theory, Graph Neural Networks


HuggingFace ▷ #computer-vision (8 messages🔥):

Image style dimensionality, Intrinsic dimension, Residual convolution layers, Clip augmentation


HuggingFace ▷ #NLP (8 messages🔥):

Laws of Exponentiation and Logarithms, Intent-aware Semantic Search, Knowledge Graphs for Semantic Search, Graph Database Search Methods


HuggingFace ▷ #smol-course (2 messages):

Google VEO3 costs, GPU costs


HuggingFace ▷ #agents-course (10 messages🔥):

HF tokens, Ollama, Qwen, Gemini, Mistral


Latent Space ▷ #ai-general-chat (189 messages🔥🔥):

Qwen, Meta Superintelligence Labs, Quantization, Huggingface business model, Model Context Protocol


Modular (Mojo 🔥) ▷ #general (48 messages🔥):

Mojo GPU Training Libraries, Nabla vs JAX, Mojo MAX Interface, Mojo vs Rust, Bitnet model in Modular


Modular (Mojo 🔥) ▷ #mojo (114 messages🔥🔥):

Python Interop Nanobind vs Cython, Mojo FFI, Mojo GPU Support, Mojo Compiler MetaProgramming


Modular (Mojo 🔥) ▷ #max (11 messages🔥):

max cli version, Intel GPU support in MAX, OneMKL install regression, PyTorch Dependency in MAX


GPU MODE ▷ #general (33 messages🔥):

Jane Street Hackathon, Multi-AI Agent System, Fractal Renderer, Graph Replay Dispatch, Structured Output with JSON Schema


GPU MODE ▷ #triton (7 messages):

Triton CUDA Errors, PY_SSIZE_T_CLEAN macro bug, Profiling Triton Kernels, GEMM Ping-Pong Schedule


GPU MODE ▷ #cuda (3 messages):

nsight-copilot, nsight-copilot approval times, nsight-copilot claude


GPU MODE ▷ #announcements (1 message):

marksaroufim: starting with Ali Hassani on Neighborhood attention now!


GPU MODE ▷ #cool-links (4 messages):

Inference Optimization, Attention Mechanisms, MQA, GQA, MLA, GTA and GLA


GPU MODE ▷ #jobs (1 message):

Global Hiring for Full-Time Positions, Intern Hiring in the United States


GPU MODE ▷ #beginner (3 messages):

Flash Attention 2 installation, HuggingFace CLI for CI/CD, Optimized Kernels for Hackathon


GPU MODE ▷ #self-promotion (2 messages):

AI Alignment Research Program, High-Level APIs Learning Series


GPU MODE ▷ #general-leaderboard (2 messages):

Submission Errors, Submission ID, Code Samples


GPU MODE ▷ #factorio-learning-env (53 messages🔥):

Blue/Green Docker Servers for Factorio, Factorio Save Files vs FLE Game State, Python vs Bash for Docker Management, Multiplayer Mod Issues, Episode Boundaries and State Resets


GPU MODE ▷ #cutlass (20 messages🔥):

Cutlass Software Pipelining, CuTeDSL vs CuTe C++, CuTeDSL Persistent Kernel Issue, TV-layout visualizer for cute-dsl


GPU MODE ▷ #general (10 messages🔥):

GPU Mode Leaderboard, PMPP problems, AMD Channels


GPU MODE ▷ #multi-gpu (9 messages🔥):

multi-GPU lexicon, DTensor tutorial, splitting a GPU


Nous Research AI ▷ #general (130 messages🔥🔥):

Atropos Updates, Qwen Performance, GPT-5 Speculation, GLM-4.5, MoE Models vs Dense Models


Nous Research AI ▷ #research-papers (3 messages):

Overhyped AI Papers, Mereal Azure


Nous Research AI ▷ #interesting-links (11 messages🔥):

AI Gatekeeping, Philosophical Side Quests in AI, Hyperstim Patch for ChatGPT, Small Model Experiments


Yannic Kilcher ▷ #general (116 messages🔥🔥):

LLM Context Manager, Downvotes in Web3, Neural Network Pruning, Intent-Aware Semantic Search, AI Demonic Narratives


Yannic Kilcher ▷ #paper-discussion (10 messages🔥):

Peer Review, Math heavy papers, Learning PDEs, NAS Transformers


Yannic Kilcher ▷ #agents (1 message):

Amazon Q, Prompt Injection, Security Vulnerabilities


Yannic Kilcher ▷ #ml-news (7 messages):

Youtube pushes shorts, Recommendation algorithm, personalized content


Manus.im Discord ▷ #general (71 messages🔥🔥):

Manus AI, credits consumed, vibe coding challenge, GPT prompts, Manus Fellow in Switzerland


DSPy ▷ #show-and-tell (2 messages):

GEPA: Reflective Prompt Evolution, Prompt Optimization, DSPy Optimizer, LLM Reflection


DSPy ▷ #papers (2 messages):

DSPy optimizers, New Arxiv Paper


DSPy ▷ #general (57 messages🔥🔥):

Context Engineering Definition, MLSys DSPy Talk, Online RL for Personalization, GEPA: Reflective Prompt Evolution, DSPy Optimizer Roadmap


aider (Paul Gauthier) ▷ #general (43 messages🔥):

Aider configuration, Qwen3-Coder pricing and context, AI Code Editor Benchmarks, Aider modes, OpenAI API tokens


aider (Paul Gauthier) ▷ #questions-and-tips (11 messages🔥):

Kimi VL Model, OpenRouter Free Models, Lisp Coding with LLMs, Aider Diff Application Confirmation


Notebook LM ▷ #announcements (1 message):

Featured Notebooks Rollout, NotebookLM Homepage Access


Notebook LM ▷ #use-cases (15 messages🔥):

NotebookLM with academic material, NotebookLM not scraping forum pages, AI Studio Build app workflow, NotebookLM interface, Using NotebookLM to write cover letters and resumes


Notebook LM ▷ #general (31 messages🔥):

NotebookLM Mind Maps for Legal Jargon, Obsidian and NotebookLM Pairing, Uploading PDFs to NotebookLM, Podcast Personalization Issues


LlamaIndex ▷ #announcements (1 message):

FlowMaker, S3 Vector Store, LlamaParse, n8n nodes for LlamaCloud, Gemini Live voice agent


LlamaIndex ▷ #blog (3 messages):

Gemini Integration, Production Agents, Oxylabs Web Scraping, Cost-Effective AI Agents


LlamaIndex ▷ #general (39 messages🔥):

LlamaIndex OpenTelemetry, Intent-aware semantic search, Knowledge Graphs, Property Graph Index


MCP (Glama) ▷ #general (40 messages🔥):

Glama MCP server tool count issue, Automating Javascript/Typescript linting, Agent payments vs human payments, Monetizing Agents, AI App Store


MCP (Glama) ▷ #showcase (3 messages):

fast-agent, Mermaid diagrams, MCP expert, Tone-of-voice, Goose Desktop


tinygrad (George Hotz) ▷ #general (30 messages🔥):

tinygrad meeting notes, GPUCounter issues, Llama3 runs, Disk raw benchmarks, MLPerf regressions


tinygrad (George Hotz) ▷ #learn-tinygrad (1 message):

Tinygrad Kernel Explanation, Understanding George Hotz's Theory Message


Cohere ▷ #🧵-general-thread (16 messages🔥):

command-r-plus deprecation, command-a-03-2025 recommendation, LLM testing, Cohere Guides and API reference, LLMU


Cohere ▷ #🔌-api-discussions (6 messages):

Cohere API Kilo Code Error, Cohere Dashboard Fine-Tuning Failure


Cohere ▷ #👋-introduce-yourself (9 messages🔥):

Recursive Systems, AI Cyber Security, AI Engineering, ML in Robotics, ML research and agentic modelling


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (15 messages🔥):

LLM Agents MOOC certificate, Open source vs Closed source instruction training, Reopening quizzes from previous cohort, Archived quizzes from previous cohort


Nomic.ai (GPT4All) ▷ #general (8 messages🔥):

M1 Max for local projects, Discord mass invite issue, Blockchain and AI/ML specialist introduction, Developer collaboration invitation


Torchtune ▷ #dev (3 messages):

DCP Information Leaks, RL Test Timings, CI Debugging