Frozen AI News archive

OpenAI's IMO Gold model also wins IOI Gold

**OpenAI** announced that its model placed **#6 among human coders** at the IOI, reflecting rapid progress in competitive coding AI over the past two years. The **GPT-5** launch faced significant user backlash over restrictive usage limits and the removal of model selection control, leading to a reversal and limits raised to **3000 requests per week** for Plus users. Confusion around **GPT-5** naming and benchmarking was highlighted, with critiques of methodological issues in comparisons with models like **Claude** and **Gemini**. Performance reviews of **GPT-5** are mixed: **OpenAI** staff claim near-zero hallucinations, while users report confidently stated hallucinations and difficulty steering the model. Benchmarks show **GPT-5 mini** performing well on document understanding, while the full **GPT-5** is seen as expensive and middling. On the Chatbot Arena, **Gemini 2.5 Pro** holds a **67%** winrate against **GPT-5 Thinking**. Prompting and model behavior remain key discussion points.

Special RL is all you need?

AI News for 8/8/2025-8/11/2025. We checked 12 subreddits, 544 Twitters and 29 Discords (227 channels, and 30037 messages) for you. Estimated reading time saved (at 200wpm): 2237 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

We know OAI got the IMO Gold performance last month, so it's crazy that we kind of considered not giving the IOI result the same coverage.

These days, tweets serve as press releases, and so Sheryl Hsu (also of the IMO team) got the honor of announcing that they had placed #6 among human coders:

Ahmed El-Kishky, Jerry Tworek, Noam Brown, and Alex Wei reflected on the rapid progress from just 2 years ago, when these systems could barely do anything in either competitive category. Noam's thread offers the most insight into the scaffolds.

Alex also shared some of the challenging aspects of the test.


AI Twitter Recap

The GPT-5 Launch: Performance, Naming, and User Rebellion

Model & Benchmark Developments

Frameworks, Tooling, and Infrastructure

AI Research & Scientific Breakthroughs

Broader Discourse: AI in Society

Humor/Memes


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. gpt-oss-120b Model Performance and Benchmarks Discussion

2. Innovative LLM Training and Distillation Approaches

3. Ollama Integrations and Community Opinions

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. GPT-5 Benchmarking, Performance, and Community Reactions

2. OpenAI's Competitive Advances and Compute Scaling

3. Innovations and Community Tools for Claude AI


AI Discord Recap

A summary of Summaries of Summaries by gpt-5

1. GPT-5 Rollout, Routers, and Reality Checks

2. New Dev Tooling: CLIs, Agents, and Parallelism

3. Open‑Source Finetuning, Data, and Quantization

4. Multimodal and Long‑Context Experiments


Discord: High level Discord summaries

Perplexity AI Discord


LMArena Discord


OpenAI Discord


Cursor Community Discord


Unsloth AI (Daniel Han) Discord


OpenRouter (Alex Atallah) Discord


LM Studio Discord


Moonshot AI (Kimi K-2) Discord


HuggingFace Discord


Latent Space Discord


Eleuther Discord


Nous Research AI Discord


Modular (Mojo 🔥) Discord


Yannick Kilcher Discord


Notebook LM Discord


GPU MODE Discord


LlamaIndex Discord


aider (Paul Gauthier) Discord


DSPy Discord


Manus.im Discord Discord


Cohere Discord


tinygrad (George Hotz) Discord


Nomic.ai (GPT4All) Discord


MCP (Glama) Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Torchtune Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Codeium (Windsurf) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


Discord: Detailed by-Channel summaries and links

Perplexity AI ▷ #announcements (1 messages):

kesku: https://fixvx.com/perplexity_ai/status/1953537170964459632


Perplexity AI ▷ #general (873 messages🔥🔥🔥):

Gemini AI Video Generation, GPT-5 performance on Perplexity, Comet Browser AI tasks, Accessing Perplexity Pro


Perplexity AI ▷ #sharing (4 messages):

GPT-5 Release, Solar Powered High-Altitude Platform, Gemini Coding


Perplexity AI ▷ #pplx-api (1 messages):

Front-end improvements


LMArena ▷ #general (1436 messages🔥🔥🔥):

GPT-5 Performance, Gemini 2.5 Pro vs GPT-5, Yupp.ai Legitimacy, LM Arena Outage, Claude 4.1 Opus


LMArena ▷ #announcements (3 messages):

Staff AMA, Video Arena, New models, gpt-5-mini-2025-08-07, gpt-5-nano-2025-08-07


OpenAI ▷ #annnouncements (2 messages):

GPT-5, Sam Altman AMA


OpenAI ▷ #ai-discussions (973 messages🔥🔥🔥):

GPT-5, Gemini Flash, Model Routers, Data scrubbing, Local AI


OpenAI ▷ #gpt-4-discussions (75 messages🔥🔥):

GPT-5 rollout and availability, GPT-5 performance and limitations, Firefox data persistence issue, Hosting custom GPTs, AI tools for LinkedIn management


OpenAI ▷ #prompt-engineering (14 messages🔥):

ChatGPT-5, Prompt Engineering, AI Prompt Management Tool, Model Behavior Exploration, LinkedIn Management Service


OpenAI ▷ #api-discussions (14 messages🔥):

ChatGPT-5 Prompt Box Limitations, Prompt Engineering Techniques, AI Prompt Management Tools, Model Behavior Exploration, Alternative tools for large inputs


Cursor Community ▷ #general (841 messages🔥🔥🔥):

GPT-5 Launch, Free GPT-5, GPT-5 Limitations, Cursor CLI, Model Performance Comparison


Cursor Community ▷ #background-agents (8 messages🔥):

PR creation flow issues, Background workers and PR creation, "@cursor fix this issue" magic


Cursor Community ▷ #announcements (1 messages):

Cursor in Terminal


Unsloth AI (Daniel Han) ▷ #general (1016 messages🔥🔥🔥):

GPT-5, Unsloth support for MXFP4, RVC (voice conversion) language specifics, Dataset preparation, GPT-OSS and GGUF


Unsloth AI (Daniel Han) ▷ #introduce-yourself (14 messages🔥):

Model Fine Tuning Costs, Unsloth AI Documentation, Developer Introductions


Unsloth AI (Daniel Han) ▷ #announcements (1 messages):

GPT-OSS, Qwen3-Coder + 2507, Unsloth updates


Unsloth AI (Daniel Han) ▷ #off-topic (15 messages🔥):

LLMs playing board games, GPT-5 performance, Coding with LLMs


Unsloth AI (Daniel Han) ▷ #help (166 messages🔥🔥):

VLLM update fixes, WSL instructions Don't work, GPT-OSS on Tesla T4 is slow, Fine tuning models to write in certain style


Unsloth AI (Daniel Han) ▷ #showcase (1 messages):

loayxz: https://huggingface.co/loay/ArabicOCR-Qwen2.5-VL-7B-Vision


Unsloth AI (Daniel Han) ▷ #research (13 messages🔥):

41M HRM-based Model, Chain-of-Thought Reasoning Mirage, Importance of Datasets, Small Specialized Fine-Tuned Models, Tiny Stories Dataset


OpenRouter (Alex Atallah) ▷ #general (800 messages🔥🔥🔥):

GPT-5 vs GPT-5 Chat, Gemini 3.0 vs GPT-5, Deepseek Switching to Ascend, Horizon Beta Replacement


OpenRouter (Alex Atallah) ▷ #new-models (2 messages):


OpenRouter (Alex Atallah) ▷ #discussion (23 messages🔥):

GPT-5 BYOK, o3, OpenRouter Trusted Partner, generation_time, moderation_latency


LM Studio ▷ #general (281 messages🔥🔥):

YouTube downloader alternatives, Custom AI bot, LM Studio vs. VLLM for parallel requests, GLM-4.5 offloading, Qwen model improvements


LM Studio ▷ #hardware-discussion (74 messages🔥🔥):

Apple M4, HX 370, 5080 FE Availability, PSU for 5080 FE and 3090, RTX 3090 for 120b GPT OSS Model


Moonshot AI (Kimi K-2) ▷ #general-chat (214 messages🔥🔥):

GPT-5, Kimi K2, OpenRouter, Qwen, Model Quantization


HuggingFace ▷ #general (182 messages🔥🔥):

GPT-5 release, GPT-OSS finetuning, Eleven Music, Voice companion pipeline, Automatic video cutter


HuggingFace ▷ #i-made-this (8 messages🔥):

AERIS V4 launch, Modular framework for managing persistent memory, Devlancr - Tinder for Developers, AERIS is schizo


Latent Space ▷ #ai-general-chat (145 messages🔥🔥):

GPT-5, Claude Code, Cursor CLI, Model Deprecation, Nitter Maintenance


Latent Space ▷ #ai-announcements (13 messages🔥):

GPT-5, OpenAI Dominance, Transformer Models, GPT-5 Vision, AI General Intelligence (AGI)


Eleuther ▷ #general (115 messages🔥🔥):

NSP vs Attention, Lower compute requirements for training language models, Memory layer for LLMs, GPT-5 drawing incorrect information in images, AR models combined with diffusion models


Eleuther ▷ #research (13 messages🔥):

FineWeb dataset cleanliness, Pythia's Hidden Activation Dynamics, LM Evaluation Harness Exact Match Issues, Learning Rate Schedule Impact


Nous Research AI ▷ #general (83 messages🔥🔥):

GPT-5 Logic Puzzles and Overfitting, Free GPT-5 API Access, Cheap Colab Alternatives, GLM 4.5 Air Performance and Offloading, Multi-GPU setups for MoE models


Nous Research AI ▷ #ask-about-llms (1 messages):

Claude jailbreak


Nous Research AI ▷ #interesting-links (2 messages):

Mechanistic faithfulness, StreamingLLM


Modular (Mojo 🔥) ▷ #general (49 messages🔥):

Mojo TUI library, Textual Python apps, Mojo's inability to create classes, Rust libraries


Modular (Mojo 🔥) ▷ #mojo (12 messages🔥):

Mojo Compiler Register Warnings, VSCode Mojo Extension Instability, Modular Forum, Minecraft Server Rewrite, Minecraft Protocol in Mojo


Modular (Mojo 🔥) ▷ #max (14 messages🔥):

MaxCompiler, LLMs, kernel fusion, torch.compile(), Transformers


Yannick Kilcher ▷ #general (39 messages🔥):

Twitch Streaming, LinkedIn Blogging, Attention Span, Ocean Sound or Fireplace Sound, Gaussian Distribution


Yannick Kilcher ▷ #paper-discussion (3 messages):

AI Avatar, SDXL, Fast Layers vs Slow Layers, Autodifferentiable Architectures, Gradient Estimation


Yannick Kilcher ▷ #ml-news (31 messages🔥):

LLMs for diagnosis, congress.gov bill, Over the counter cold medicine ineffective, Pharmacists prescribing, Tesla special


Notebook LM ▷ #use-cases (6 messages):

NotebookLM Voice, AI Web Builder Tool, Scratchpad Framework, NotebookLM for Binge Watching


Notebook LM ▷ #general (46 messages🔥):

Notebook thumbnails, Audio Overview Issues, Custom Notebooks, Sensitive Content Research, Audio Issues


GPU MODE ▷ #general (10 messages🔥):

Parameter Scaling, Speculative Decoding, Parallel Programming, ROCm Channel Spam


GPU MODE ▷ #triton (1 messages):

Privacy Team Approval for Registration, Registration Process Update


GPU MODE ▷ #cuda (4 messages):

Machine Level Element Type Distinctions, S8/S16 vs U8/U16 Variants


GPU MODE ▷ #beginner (1 messages):

CUDA kernel debugging, Grid-stride loops


GPU MODE ▷ #metal (2 messages):

Naive Matmul Kernels, Memory Access Patterns, Hardware Coalescing


GPU MODE ▷ #self-promotion (4 messages):

Open Source Voxel Renderer, Rust, WebGPU, Data Streaming, Raytracing


GPU MODE ▷ #hardware (1 messages):

paolovic: thank you!


GPU MODE ▷ #factorio-learning-env (12 messages🔥):

Game Engine Speed, Meeting Reschedule, Player Inventory Transfers, Factorio Native Saves


GPU MODE ▷ #cutlass (7 messages):

CuTe Layouts, Jay Shah's Notes on CuTe Layouts, Layout Algebra Counterexamples


GPU MODE ▷ #singularity-systems (2 messages):

Liveness Analysis, Scalar Compilation Performance, Vector Compilation with Autovectorization and SIMTification


GPU MODE ▷ #multi-gpu (2 messages):

Axolotl, N-D Parallelism, HuggingFace Blog


LlamaIndex ▷ #blog (6 messages):

GPT-5, Agent Maze, Zoom RTMS, ZeroEntropy AI rerankers, Claude citations


LlamaIndex ▷ #general (39 messages🔥):

llama-index upgrade for gpt-5, workflow tools not working, OpenAI SDK issue and workaround, AgentWorkflow error, llama_deploy compatibility


aider (Paul Gauthier) ▷ #general (41 messages🔥):

Horizon vs GPT5 for agentic coding, Aider GPT-5 on Azure, Aider version updates, Dad meme thumbs up, Python 3.13 support


aider (Paul Gauthier) ▷ #questions-and-tips (4 messages):

Cursor alternative design, OpenRouter's GPT5 errors, aider config parsing failures


DSPy ▷ #general (41 messages🔥):

Context7 MCP Server, Claude Code Tooling, DSPy Tool Calling, CrewAI Prompts Optimization with DSPy


Manus.im Discord ▷ #general (14 messages🔥):

Annual Membership Billing Error, Inherit Feature Problems, Login Error, Missing Credits, Manus vs GPT5


Cohere ▷ #🧵-general-thread (4 messages):

command-a-vision-07-2025 timing out, Embed v4 vs v3 for vector search, AI Knowledge Domains


Cohere ▷ #📣-announcements (1 messages):

AI Agent capabilities, Generative AI, Workflow automation, Data security, Compliance


Cohere ▷ #👋-introduce-yourself (6 messages):

New member introductions, Trading systems with RL and AI agents, Transformers and GNNs


Cohere ▷ #🧭-status-feed (1 messages):

Command-a-vision-07-2025, degraded performance, Cohere Status Page


Cohere ▷ #🔬-research (1 messages):

masaru.yamada: Great


tinygrad (George Hotz) ▷ #general (6 messages):

tensor to mathtraits, unit tests failures, github actions


tinygrad (George Hotz) ▷ #learn-tinygrad (1 messages):

ShapeTracker Visualization Tool


Nomic.ai (GPT4All) ▷ #general (6 messages):

GPT-5 Rumors, GPT-OSS-20B-GGUF Installation Issues, GPT4All Update Status, GPT-ASS Critique


MCP (Glama) ▷ #showcase (2 messages):

MCPOmni Connect, OmniAgent, AI agent builder