Frozen AI News archive

Gemini 2.5 Deep Think finally ships

**OpenAI** is rumored to be launching new **GPT-OSS** and **GPT-5** models soon, amid drama over **Anthropic** revoking its access to **Claude**. **Google DeepMind** quietly launched **Gemini 2.5 Deep Think**, a model optimized for parallel thinking, a variant of which achieved gold-medal level at the IMO; it excels in reasoning, coding, and creative tasks. Leaks suggest **OpenAI** is developing a **120B MoE** and a **20B** model with advanced attention mechanisms. Chinese AI companies like **Moonshot AI**, **Alibaba**, and **Zhipu AI** are releasing faster and more capable open models such as **kimi-k2-turbo-preview**, **Qwen3-Coder-Flash**, and **GLM-4.5**, signaling strong momentum and the potential to surpass the U.S. in AI development. One quoted detail, *"The final checkpoint was selected just 5 hours before the IMO problems were released,"* highlights how rapid the development cycles are.


Parallel thinking is all you need.

AI News for 7/31/2025-8/1/2025. We checked 12 subreddits, 544 Twitters and 29 Discords (227 channels, and 7130 messages) for you. Estimated reading time saved (at 200wpm): 614 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

Lots of rumors and leaks about OpenAI's GPT-OSS and GPT-5 models are flying around, which suggests a launch is coming soon. Ahead of this highly anticipated launch, there is some drama around Anthropic revoking OpenAI's Claude access.

In the meantime, GDM is quietly staying above the fray with a clean launch of the Deep Think model (a tuned-down, dumber version of the model that got the IMO Gold a few days ago). It offers some impressive boosts on SOTA benchmarks; notably, its gains over its base model are much larger than o3 pro's.


There's more info on the model card, but not a lot so we can save you the click:

There are also misc videos on Deep Think's parallel thinking, but we (biased) would actually recommend the full keynote from Jack Rae, who led the work on 2.5 Deep Think and even commented on where they are going next:


AI Twitter Recap

Model Releases, Leaks, and Performance

Infrastructure, Efficiency, and Hardware

Agent Tooling, Frameworks, and Development

Company News, Funding, and Strategy

Research, AI Safety, and Datasets

Humor/Memes


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. OpenAI 120B Model Leaks and Speculation

2. Qwen3 Model Launches and Benchmarks

3. DocStrange Open Source Data Extraction Release

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Gemini 2.5 Deep Think Launch and Performance Benchmarks

2. WAN 2.2, Flux Krea, and Current Text-to-Image/Video Model Comparisons

3. OpenAI & AI Industry Model/API Rumors and Announcements


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.5 Flash Preview 05-20

Theme 1. Frontier LLM Developments & Speculation

Theme 2. Open-Source & Local LLM Optimization

Theme 3. AI Coding & Agent Tooling

Theme 4. Hardware & Performance Benchmarking

Theme 5. AI Product Pricing & User Experience


Discord: High level Discord summaries

Perplexity AI Discord


Unsloth AI (Daniel Han) Discord


LMArena Discord


Cursor Community Discord


OpenAI Discord


LM Studio Discord


OpenRouter (Alex Atallah) Discord


Moonshot AI (Kimi K-2) Discord


Nous Research AI Discord


Latent Space Discord


Yannic Kilcher Discord


Notebook LM Discord


Eleuther Discord


aider (Paul Gauthier) Discord


MCP (Glama) Discord


GPU MODE Discord


HuggingFace Discord


Cohere Discord


Manus.im Discord


LlamaIndex Discord


DSPy Discord


Modular (Mojo 🔥) Discord


Torchtune Discord


LLM Agents (Berkeley MOOC) Discord


Codeium (Windsurf) Discord


Nomic.ai (GPT4All) Discord


The tinygrad (George Hotz) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.




Discord: Detailed by-Channel summaries and links

Perplexity AI ▷ #general (1048 messages🔥🔥🔥):

Comet Browser Invites, Image Generation Issues on Perplexity Pro, Free Perplexity Pro for Airtel Subscribers in India, GPT-5 Release Speculation, Model Performance Comparison


Perplexity AI ▷ #sharing (7 messages):

Shareable threads, RAG without embeddings, Trump-Medvedev


Perplexity AI ▷ #pplx-api (14 messages🔥):

search_domain_filter, Moderator Bot Usage, Image Uploading via API


Unsloth AI (Daniel Han) ▷ #general (1099 messages🔥🔥🔥):

GPT-5 speculation, Qwen3 model, Cogito V2, Unsloth GRPO and TRL, H100 and batch sizes


Unsloth AI (Daniel Han) ▷ #introduce-yourself (4 messages):

New member introduction, Community assistance


Unsloth AI (Daniel Han) ▷ #off-topic (74 messages🔥🔥):

VITS checkpoint training insights, On-device VITS system on iOS, Children voices recording, Avocodo and iSTFTNet for audio fidelity, Universal vocoder for Speech LLM


Unsloth AI (Daniel Han) ▷ #help (207 messages🔥🔥):

Circular Import Error, RuntimeError with Merged Model Loading, UV venv performance, Qwen3 tool calling problems, Qwen3-Coder-30B-A3B-Instruct-1M-Q8_0.gguf on vLLM


Unsloth AI (Daniel Han) ▷ #showcase (8 messages🔥):

Unsloth Dynamic Quantization, Qwen3 30B-A3B, Space Invaders refined, Roleplay AI finetuning, Gemini Refusals


Unsloth AI (Daniel Han) ▷ #research (4 messages):

Gemma 3 1B garbage, finetuning project, continuous training of loras


Unsloth AI (Daniel Han) ▷ #unsloth-bot (114 messages🔥🔥):

GRPO Trainer dataset mapping, Chat template cut off, GRPOTrainer config, Sequence dictionary (seq-dict), Unsloth shape dynamically changes


LMArena ▷ #general (968 messages🔥🔥🔥):

Arena Visibility, Leaderboard Tooltips, Personal Info in Datasets, Gemini's Repetitive Tendencies, Gemini 2.5 Deepthink


LMArena ▷ #announcements (1 message):

Veo 3, Image-to-Video, Audio capabilities


Cursor Community ▷ #general (580 messages🔥🔥🔥):

Background agents, Improving cursor setup, Cursor freezing issues, YOLO mode activation, Vibe coding strategy


Cursor Community ▷ #background-agents (1 message):

lintaffy: oh, my ba is still loading for the easy command....


OpenAI ▷ #ai-discussions (410 messages🔥🔥🔥):

Function Calling vs XML, AI Superintelligence Bio-Weapons, Grok4 vs GPT5, Horizon Alpha Performance, Large Context Windows


OpenAI ▷ #gpt-4-discussions (11 messages🔥):

Agent Mode Confusion, ChatGPT Agents vs Regular GPT, GPT-4o auto reasoning, Missing Chat History


OpenAI ▷ #prompt-engineering (1 message):



OpenAI ▷ #api-discussions (1 message):



LM Studio ▷ #general (325 messages🔥🔥):

Image-to-video prompt generation in LM Studio, LM Studio's lack of roadmap, LM Studio's Plugin System, Connecting to LM Studio API from other computers on the network, Qwen3 Coder model support on LM Studio


LM Studio ▷ #hardware-discussion (69 messages🔥🔥):

Nvidia Driver 580.88, Second-hand servers, Partial KV Cache Offload, Mac mini M4 vs RTX 3070, Next-gen GPUs


OpenRouter (Alex Atallah) ▷ #app-showcase (11 messages🔥):

PyrenzAI launch, Personality.gg, OpenRouter PKCE, PyrenzAI feedback


OpenRouter (Alex Atallah) ▷ #general (242 messages🔥🔥):

API Errors, Deepseek r1, Free Models, Horizon Alpha, API Key credit limit


OpenRouter (Alex Atallah) ▷ #new-models (1 message):

Readybot.io: OpenRouter - New Models


OpenRouter (Alex Atallah) ▷ #discussion (23 messages🔥):

Groq OpenBench, Provider Benchmarks, GPQA Evals, Inspect.ai, Prompt Caching for Kimi K2 and GLM 4.5


Moonshot AI (Kimi K-2) ▷ #announcements (2 messages):

Kimi K2 Turbo, Moonshot AI Forum


Moonshot AI (Kimi K-2) ▷ #general-chat (126 messages🔥🔥):

Kimi vs Claude, Kimi K2 Turbo pricing and speed, Using Kimi K2 Turbo in Claude code, Chinese companies video generation, Kimi K2's prompt format similar to ChatGPT


Nous Research AI ▷ #general (110 messages🔥🔥):

Hermes-3 dataset, Unitree R1 robot, OpenAI's Horizon Alpha model, Quantization challenges, SmolLM and Qwen2.5


Nous Research AI ▷ #ask-about-llms (4 messages):

Input Tokens per Second, Prefill, Gemma, Time to First Token


Nous Research AI ▷ #research-papers (3 messages):

OSS Model Training Script, Metaprogramming and DAG->HRM->code automation


Nous Research AI ▷ #interesting-links (5 messages):

AnythingLLM, Neuronpedia, Data Sovereignty


Nous Research AI ▷ #research-papers (3 messages):

OSS model training script, Metaprogramming and DAG->HRM->code automation, Federated cycles between clones in ray nodes


Latent Space ▷ #ai-general-chat (112 messages🔥🔥):

Cline's $32M seed funding, CLI orchestration layer, Subagents and Claude Code Office Hours, Bytedance's Seed Diffusion LLM for Code, Open-License Hybrid Reasoning Models


Latent Space ▷ #ai-announcements (4 messages):

Cline pod writeup, Latent Space Podcast, Open Source Code Agent


Yannic Kilcher ▷ #general (86 messages🔥🔥):

RAG query expansion techniques, Sentence embeddings vs. token embeddings, Cross-encoders for semantic similarity, Knowledge Graphs for information retrieval, LLMs and question-answer co-occurrence


Yannic Kilcher ▷ #paper-discussion (3 messages):

The Cinema AI, Generating Movie Scenes


Yannic Kilcher ▷ #ml-news (4 messages):

NVIDIA Chips, Nintendo Switch


Notebook LM ▷ #use-cases (27 messages🔥):

Audio pause timing in slide changes, Portuguese language support for explainer videos, NotebookLM for personalized podcasts, Canvas infographics from Perplexity Deep Research


Notebook LM ▷ #general (65 messages🔥🔥):

Offline access to NotebookLM studio material, Video overview rollout issues, NotebookLM and Gemini API for custom RAG pipeline, Comet browser extension for NotebookLM, Audio Overviews language and duration limitations


Eleuther ▷ #announcements (1 message):

Attention probes, Linear probes, Overfitting, Optimization issues


Eleuther ▷ #general (11 messages🔥):

LLMs on low-power edge devices offshore, Gemini-2.5-flash biased ranks for gemma responses, OpenAI open source model config, MLA vs MHA generalization


Eleuther ▷ #research (41 messages🔥):

RoPE is near optimal, Weight tying is bad, semantic search and RAG


Eleuther ▷ #scaling-laws (1 message):

EleutherAI Website PR, Tensor Program papers, Yang et al paper


Eleuther ▷ #interpretability-general (5 messages):

HF transformers update, Llama & Qwen residual streams, Attention Probes Work, NIAH datasets


Eleuther ▷ #lm-thunderdome (1 message):



Eleuther ▷ #gpt-neox-dev (14 messages🔥):

MIT Collaboration on LLM Training, Containerization Issues, CUDA Issues, DeepSpeed checkpoint inspection


aider (Paul Gauthier) ▷ #general (61 messages🔥🔥):

Aider Appreciation, SGLang and Qwen Speed, 4090 Mobo and Case, Aider vs Other Tools, Claude Code Context Limits


aider (Paul Gauthier) ▷ #questions-and-tips (10 messages🔥):

Qwen3 30B A3B Coder Benchmarking, LM Studio Usage, llama.cpp server + docker aider benchmark, aider + claude-code max subscription integration, Gemini 2.5 Pro


MCP (Glama) ▷ #general (43 messages🔥):

Security MCP Check Tool, PayMCP Payment Layer, PageRank for MCP Servers, MCP Eval Platforms, Gateway for Agent Tool Search


MCP (Glama) ▷ #showcase (1 message):

JSON MCP Server, LLM Efficiency with JSON, Schema Generation for JSON, Token Savings


GPU MODE ▷ #general (8 messages🔥):

Hylo Programming Language, Value Semantics, Halide, Scala 3/Scala Native, Heterogeneous Programming


GPU MODE ▷ #triton (1 message):

Triton Kernel AI Agent, GEAK benchmarks


GPU MODE ▷ #cuda (4 messages):

Profiling Copilot, __launch_bounds__ fix for register count issue, setmaxnreg ignored due to extern call


GPU MODE ▷ #torch (1 message):

CheckpointPolicy with Custom Kernels, Functorch API


GPU MODE ▷ #cool-links (1 message):

MI300X FP8 benchmarks on AMD, AMD MI300X vs H200 vs B200, FP8 Data Parallel Benchmarks


GPU MODE ▷ #beginner (1 message):

celis1702: thank you both so much for your clear explanations and for sharing these details!


GPU MODE ▷ #jax (2 messages):

JIT function, JAXPR printing, Static arguments


GPU MODE ▷ #irl-meetup (2 messages):

Agreement, Acknowledgement


GPU MODE ▷ #rocm (2 messages):

Profiling llama.cpp with rocprofilerv3, AMD machine for GGUF


GPU MODE ▷ #self-promotion (1 message):

C/ua Hiring, AI Agents Infrastructure, Founding Engineer Roles


GPU MODE ▷ #🍿 (1 message):

tonic_1: really glad i was nosey enough to check this convo out 🙂 super excited about this 🙂


GPU MODE ▷ #factorio-learning-env (7 messages):

README update on Resource vs Prototype, RCON client disconnects, Blueprint VQA pipelines


GPU MODE ▷ #singularity-systems (6 messages):

picocuda compiler, elements graph data structures, scalar compilation, GPU compilation, tinygrad's AMD GPU driver


GPU MODE ▷ #multi-gpu (4 messages):

DTensor, Basic Parallelism Schemas, Shape Rotation, DTensor Problems, Marksaroufim visualizations


HuggingFace ▷ #general (26 messages🔥):

Flux Krea model, Synthetic Datasets with HF jobs, AMD GPU for EM image segmentation, llama.cpp model path, Gemini-2.5-flash bias


HuggingFace ▷ #today-im-learning (2 messages):

Note-taking tools, Remnote


HuggingFace ▷ #cool-finds (2 messages):

AgentUp, Emergence AI, LongMemEval Benchmark


HuggingFace ▷ #i-made-this (3 messages):

smolagents.js, CodeBoarding, Qwen3-30B-A3B-Instruct-2507


HuggingFace ▷ #reading-group (1 message):

cakiki: <@570737726991761409> please don't promote paid content in the server


HuggingFace ▷ #computer-vision (2 messages):

Discriminator Learning Rate, GAN Loss Issues, Debugging GANs


HuggingFace ▷ #agents-course (2 messages):

Llama 4 Access, Qwen Model, DeepSeek-R1


Cohere ▷ #🧵-general-thread (21 messages🔥):

Cohere API context window size discrepancy, HackRx 6.0 AI hackathon Rate Limit, Cohere Enterprise Plan, Cohere website login error, Cohere Support Team introduction


Cohere ▷ #🔌-api-discussions (1 message):

kaludi: Is there something going on with the API? We are getting multiple timeouts for our queries


Cohere ▷ #👋-introduce-yourself (6 messages):

Samsung Biologics AI Architect, AI Developer with LLM Workflows, Dell's Engineering Technologist, Mobile & JS-fullstack AI Application Developer


Manus.im Discord ▷ #general (17 messages🔥):

DM spam, Wide research, Cloudflare issues, Manus AI, Daily refresh credits


LlamaIndex ▷ #blog (2 messages):

LlamaIndex, Novita Labs, Gemini Live


LlamaIndex ▷ #general (13 messages🔥):

Agentic AI Code Assistance, Git-Style Branching for LLM Conversations, LlamaIndex Parsers for PDFs and PPTs, AI+Blockchain for on-chain AI agents


DSPy ▷ #show-and-tell (1 message):

dbreunig: https://www.dbreunig.com/2025/07/31/how-kimi-rl-ed-qualitative-data-to-write-better.html


DSPy ▷ #general (2 messages):

DSpill, Yaron Minsky, Quant Bros


Modular (Mojo 🔥) ▷ #general (2 messages):

Mojo installation issues, GitHub issue reporting, Detailed logs for debugging


Modular (Mojo 🔥) ▷ #mojo (1 message):

Tail Call Elimination, Print/Log Statements, Minimal Examples


Torchtune ▷ #general (3 messages):

OpenAI Model Leak, Mixture of Experts, FP4 weights


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (1 message):

Course Quizzes Availability, Google Forms Reopening


Codeium (Windsurf) ▷ #announcements (1 message):

Qwen3-Coder, Token Speed, US Servers