Frozen AI News archive

OpenAI's gpt-oss 20B and 120B, Claude Opus 4.1, DeepMind Genie 3

**OpenAI** released the **gpt-oss** family, including **gpt-oss-120b** and **gpt-oss-20b**, their first open-weight models since GPT-2, designed for agentic tasks and licensed under **Apache 2.0**. These models use a **Mixture-of-Experts (MoE)** architecture with wide vs. deep design and innovative features like bias units in attention and a unique swiglu variant. The **120B** model was trained with about **2.1 million H100 GPU hours**. Meanwhile, **Anthropic** launched **claude-4.1-opus**, touted as the best coding model currently. **DeepMind** showcased **genie-3**, a realtime world simulation model with minute-long consistency. The releases highlight advances in open-weight models, reasoning capabilities, and world simulation. Key figures like **@sama**, **@rasbt**, and **@SebastienBubeck** provided technical insights and performance evaluations, noting strengths and hallucination risks.

Canonical issue URL

They put the Open back in OpenAI!

AI News for 8/4/2025-8/5/2025. We checked 12 subreddits, 544 Twitters and 29 Discords (227 channels, and 8121 messages) for you. Estimated reading time saved (at 200wpm): 615 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

Day 2 in what is expected to be the most packed AI news week of the year, 3 of the big labs all announced models that would individually have taken title story.

First the most (unintentionally) leaked launch: OpenAI's new open weights GPT-OSS models bring o4-mini class reasoning to your desktop (60GB GPU) and phone (12GB), which you can test in the new gpt-oss playground:

The model card and the research blog are worth a browse. The models also debut the harmony response format (open sourced) which update the old school ChatML with new concepts like message "channels":

On the same day, Anthropic also released Claude 4.1 Opus (blog), which was also leaked. This should be the best coding model in the world.... for now.

Finally, DeepMind's Genie 3 showed off extremely impressive realtime world simulation with navigation and minute-long consistency, but in classic Genie fashion, you'll just have to take their word that the demonstrated videos aren't cherrypicked.


AI Twitter Recap

OpenAI's gpt-oss Open-Weight Model Release

Major Model & Product Releases (Non-OpenAI)

AI Safety, Benchmarking, and Evaluation

Industry News, Tooling, & Broader Implications

Humor/Memes


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. OpenAI GPT-OSS Model Releases, Integrations, and Community Discussion

2. KittenTTS: Ultra-Compact TTS Model Launch

3. Llama.cpp Feature Updates and MoE Offloading

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo, /r/aivideo

1. Google DeepMind Genie 3 Model Release & Benchmarks

2. OpenAI Open Source Model and GPT-OSS Launch

3. Qwen-Image and Open-Source Multimodal Generation Benchmarks


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.5 Pro Exp

Theme 1. OpenAI's GPT-OSS Release Ignites Widespread Debate

Theme 2. New Models from Anthropic, Google, and Others Flood the Market

Theme 3. Developer Ecosystem Tools and Frameworks Evolve

Theme 4. AI Benchmarking and Novel Applications

Theme 5. Hardware Havoc and Performance Tuning


Discord: High level Discord summaries

Perplexity AI Discord


Unsloth AI (Daniel Han) Discord


LMArena Discord


LM Studio Discord


OpenAI Discord


OpenRouter (Alex Atallah) Discord


Latent Space Discord


Nous Research AI Discord


Eleuther Discord


Moonshot AI (Kimi K-2) Discord


Cursor Community Discord


HuggingFace Discord


Yannick Kilcher Discord


Notebook LM Discord


Modular (Mojo đŸ”„) Discord


GPU MODE Discord


MCP (Glama) Discord


aider (Paul Gauthier) Discord


DSPy Discord


LlamaIndex Discord


Manus.im Discord Discord


LLM Agents (Berkeley MOOC) Discord


Torchtune Discord


tinygrad (George Hotz) Discord


Cohere Discord


MLOps @Chipro Discord


Codeium (Windsurf) Discord


Nomic.ai (GPT4All) Discord


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


You are receiving this email because you opted in via our site.

Want to change how you receive these emails? You can unsubscribe from this list.


Discord: Detailed by-Channel summaries and links

Perplexity AI ▷ #announcements (1 messages):

kesku: https://x.com/perplexity_ai/status/1952532113095643185 <@&1105626802732404746>


Perplexity AI ▷ #general (1206 messagesđŸ”„đŸ”„đŸ”„):

Comet Browser, OpenAI OSS Model, Claude 4.1 Opus, Perplexity Search Ranking, GPT-5 Release Speculation


Perplexity AI ▷ #sharing (5 messages):

Youzu AI e-commerce, Room Visualizer, Youzu Lens, Google Genie AI


Perplexity AI ▷ #pplx-api (6 messages):

Sonar API, Perplexity Docs


Unsloth AI (Daniel Han) ▷ #general (1200 messagesđŸ”„đŸ”„đŸ”„):

Unsloth Quantization Requests, Nvidia Nemotron Super 49B 1.5, Diffusion Based Quantization Paper, GPT-OSS Model Analysis, GPU Recommendations for Training


Unsloth AI (Daniel Han) ▷ #introduce-yourself (4 messages):

Software Engineer seeking new opportunities, AI Engineer specializing in voice agents and chatbots


Unsloth AI (Daniel Han) ▷ #off-topic (7 messages):

VITS Male Voice Issues, Dataset size issues with voice models, Speaker Dimension Problems, RVC model


Unsloth AI (Daniel Han) ▷ #help (77 messagesđŸ”„đŸ”„):

GGUF Exporting Issues, TRL Compatibility with Unsloth, GRPO Batching and Chunking, SFTTrainer with Completion Only Loss, Qwen3-Coder Chat Template


Unsloth AI (Daniel Han) ▷ #showcase (3 messages):

Gemma 3, LlamaTale, QuixiAI


Unsloth AI (Daniel Han) ▷ #research (8 messagesđŸ”„):

Prompt variations for fine-tuning, LLM domain specific language framework, Token Decoder Maps GitHub project, Open Evolutionary Agents Blogpost


Unsloth AI (Daniel Han) ▷ #unsloth-bot (82 messagesđŸ”„đŸ”„):

SFTTrainer columns, Llama 3 fine-tuning errors, dtype check steps, LoRA Model generation, OpenAI OSS model


LMArena ▷ #general (1152 messagesđŸ”„đŸ”„đŸ”„):

GLM 4.5 is a beast, Kaggle Game Arena AI chess exhibition tournament, Long context reasoning benchmark, GPT-5 release, OpenAI open source models limitations


LMArena ▷ #announcements (1 messages):

New Models, GPT-OSS, Claude Opus


LM Studio ▷ #announcements (1 messages):

OpenAI gpt-oss models, LM Studio 0.3.21 (b4) update


LM Studio ▷ #general (593 messagesđŸ”„đŸ”„đŸ”„):

LM Studio + Libre Chat Speed, Tailscale IP Setup with LM Studio, Note Taking Tools Integration with LLMs, OpenAI's New GPT-OSS Models, Hardware for Running LLMs


LM Studio ▷ #hardware-discussion (32 messagesđŸ”„):

Page File on Windows, NVMe Offload, CUDA Runtime for Multiple GPUs, Computer for Blender + ComfyUI + LLMs, Storage Device for LLMs


OpenAI ▷ #annnouncements (4 messages):

OpenAI Open Models, Open Model Hackathon, Red Teaming Challenge, Inference Credits for Students


OpenAI ▷ #ai-discussions (286 messagesđŸ”„đŸ”„):

GPT-5 Release Speculation, AI and Education, OpenAI GPT-OSS Model, AI in Art, Local AI Models


OpenAI ▷ #gpt-4-discussions (48 messagesđŸ”„):

Spanish Language Bias in GPT, GPT's Linguistic Training, GPT 5 release, OCR issues with PDFs


OpenAI ▷ #prompt-engineering (1 messages):

GPT Subscription, GPT Subscription Glazing, User Perception of GPT Flattery


OpenAI ▷ #api-discussions (1 messages):

GPTs Agents, GPT Subscriptions


OpenRouter (Alex Atallah) ▷ #announcements (5 messages):

Anthropic Opus 4.1, OpenAI returns to Open Source, GPT-OSS models


OpenRouter (Alex Atallah) ▷ #app-showcase (1 messages):

gardasio: ChatGPT.com https://x.com/Gardasio/status/1952501913586442541


OpenRouter (Alex Atallah) ▷ #general (251 messagesđŸ”„đŸ”„):

Model vs Models Prioritization, Gemini video understanding, Claude providers caching, Qwen-image model, GPTs agents training


OpenRouter (Alex Atallah) ▷ #new-models (3 messages):

``


OpenRouter (Alex Atallah) ▷ #discussion (30 messagesđŸ”„):

LLM Emotional Understanding Benchmarks, EQ Benchmark, Gemma 3 27b, OCR Engine Comparisons, Sonnet Self-Moderation


Latent Space ▷ #ai-general-chat (236 messagesđŸ”„đŸ”„):

Claude Code deleting package-lock.json, Google's LangExtract library, AI Wrappers, Reflection AI fundraising, Kaggle Game Arena


Nous Research AI ▷ #general (220 messagesđŸ”„đŸ”„):

Qwen-Image, Text Encoding, XBai-o4 Scaling, Terminal Benchmarks, Attention sinks


Nous Research AI ▷ #ask-about-llms (2 messages):

Opus Bots, Improved Bot Design


Nous Research AI ▷ #research-papers (1 messages):

OpenAI GPT OSS Model Card


Nous Research AI ▷ #research-papers (1 messages):

GPT-OSS Model Card


Eleuther ▷ #general (14 messagesđŸ”„):

Moderator Logs, Scaling LLMs, YaRN Usage, Algorithmic Monoculture


Eleuther ▷ #research (188 messagesđŸ”„đŸ”„):

Residual rephrasing optimization trick, HRM Stability, Scaling HRMs, Deep Equilibrium Models, OpenAI's GPT OSS 20B and 120B models


Eleuther ▷ #lm-thunderdome (1 messages):

lm-evaluation-harness, API call


Eleuther ▷ #gpt-neox-dev (7 messages):

PP=0 layer naming, TE benchmarking, TE with rmsnorm


Moonshot AI (Kimi K-2) ▷ #general-chat (186 messagesđŸ”„đŸ”„):

K2 vs. O3, Stardew Valley, Kaggle, Game Arena, LLMs playing Chess and Go


Cursor Community ▷ #general (177 messagesđŸ”„đŸ”„):

Claude Sonnet vs Gemini, Cursor PDF support, Cursor's Yearly Subscription, Cursor Freezing Issues, Vercel's v0 on Cursor


Cursor Community ▷ #background-agents (7 messages):

Background agent failure, Request IDs sent via PM, Configure background agents to docker login``


HuggingFace ▷ #announcements (1 messages):

Trackio Experiment Tracking, New OCR Datasets, Transformers Acceleration, HF Jobs for Compute, Faster HF CLI


HuggingFace ▷ #general (113 messagesđŸ”„đŸ”„):

ZeroGPU, 340M t2i model, VS Code GPU acceleration, Soft-bias(es), RAG frameworks


HuggingFace ▷ #today-im-learning (2 messages):

AI Benchmark for LLMs playing Monopoly Deal, Learning Go and DRL


HuggingFace ▷ #cool-finds (2 messages):

DealBench, Qwen Image Model


HuggingFace ▷ #i-made-this (7 messages):

Open Evolutionary Agents, Recursive Thought Processes in AI, Critique of AI Benchmarks, GPT-OSS Multilingual Reasoner Tutorial


HuggingFace ▷ #reading-group (1 messages):

Reading Group Intro, Welcome Newbie


HuggingFace ▷ #NLP (1 messages):

Text Data Processing, Information Extraction


HuggingFace ▷ #smol-course (2 messages):

Inference Providers, Colab Error, Batman Party Music


HuggingFace ▷ #agents-course (8 messagesđŸ”„):

Assignment Submission Issues, Course Starting Point, Course Certificates


Yannick Kilcher ▷ #general (69 messagesđŸ”„đŸ”„):

Voice Cloning on a Budget, Claude Code's Data Leakage Incident, Deepseek-R1 vs Kimi-K2/GLM4.5, Attention Sink Layers, Gemini 2.5 Pro Attends to 1 Hour of Video


Yannick Kilcher ▷ #paper-discussion (13 messagesđŸ”„):

Tiny model does well on ARC-AGI, Cold reading on scilent paper, Deepmind Genie 3, Genie and SIMA papers


Yannick Kilcher ▷ #ml-news (16 messagesđŸ”„):

Windsurf IDE, Genie Model, Claude Opus 4, GPT-OSS, Natively Quantized Models


Notebook LM ▷ #use-cases (9 messagesđŸ”„):

Whisper Transcription, NotebookLM Limits, Customized Prompts


Notebook LM ▷ #general (65 messagesđŸ”„đŸ”„):

Video Overview Rollout, Custom Instructions and Podcast, Image Upload Issues, NotebookLM vs Gemini, Data Privacy in NotebookLM


Modular (Mojo đŸ”„) ▷ #general (53 messagesđŸ”„):

Decorators in Mojo, Zig-style reflection in Mojo, JavaScript Runtime in Mojo


Modular (Mojo đŸ”„) ▷ #announcements (1 messages):

Modular Platform 25.5, Large Scale Batch Inference, Standalone Mojo Conda packages, Open source MAX Graph API, MAX PyTorch integration


Modular (Mojo đŸ”„) ▷ #mojo (8 messagesđŸ”„):

Multiple AI Agents in Mojo, Mojo Code in Custom Frameworks, MAX as a Library


Modular (Mojo đŸ”„) ▷ #max (4 messages):

Modular Compatibility with Intel OSX, Apple Silicon Requirement for macOS, Docker Ubuntu Container Workaround


GPU MODE ▷ #general (21 messagesđŸ”„):

CUDA vs Compute Shaders, MXFP4, OpenAI Model U8 vs FP4, H100 FP4 Support, MoE Module Experts


GPU MODE ▷ #triton (6 messages):

Helion, Compiler Explorer Support for Triton, Triton Puzzles, GPU kernels turning into JavaScript


GPU MODE ▷ #cuda (5 messages):

CUDA programming, Compute shaders, CUDA kernels, cudaLaunchCooperativeKernel


GPU MODE ▷ #announcements (1 messages):

NCCL, Multi GPU programming, GPU communication tools and libraries, Quartet with 4 bit training, PCCL and designing fault tolerant communication primitives


GPU MODE ▷ #cool-links (1 messages):

as_ai: https://deepmind.google/discover/blog/genie-3-a-new-frontier-for-world-models/


GPU MODE ▷ #jobs (1 messages):

NVIDIA, AI, HPC, Solution Architect, Universities


GPU MODE ▷ #jax (1 messages):

``


GPU MODE ▷ #torchao (9 messagesđŸ”„):

NVFP4 scales swizzling, FP4 Training development in TorchAO, MXFP4 training, NVFP4 training, User API for FP4


GPU MODE ▷ #irl-meetup (4 messages):

PyTorch conference, Meetup


GPU MODE ▷ #self-promotion (3 messages):

Langflow on Vast.ai, Tiny TPU on TinyTapeout


GPU MODE ▷ #factorio-learning-env (7 messages):

VQA Dataset, Factorio RCON, JackHopkins RCON


GPU MODE ▷ #cutlass (2 messages):

CuTe Vectorized Store Issues, Memory Coalescing Problems, LDG.E.128 vs STG.E.128 Instructions


GPU MODE ▷ #singularity-systems (2 messages):

Chaitin-Briggs-Click register allocation, picograd and picocuda merge


MCP (Glama) ▷ #general (15 messagesđŸ”„):

MCP for Docs, Standardized Payments for Tools, In-Browser postMessage Transport Proposal, MCP for Embedded Systems, Exposing Prompts via the Web


MCP (Glama) ▷ #showcase (4 messages):

API Keys, AutoGen Chatbot, MCP Servers, YouTube Search


aider (Paul Gauthier) ▷ #general (14 messagesđŸ”„):

Aider, Best Models for Aider, DeepSeek, OpenAI open models, GLM air stable


aider (Paul Gauthier) ▷ #questions-and-tips (2 messages):

Aider Non-Interactive Mode


DSPy ▷ #show-and-tell (6 messages):

Document boundary detection in PDFs, Knowledge graph practitioners meet DSPy, SIMBA Optimizer write-up


DSPy ▷ #general (9 messagesđŸ”„):

GEPA availability in DSPy, Optimized System Prompts for Fine-tuning, Three-Phase Training Approach


LlamaIndex ▷ #announcements (1 messages):

Office Hours


LlamaIndex ▷ #blog (4 messages):

LlamaParse, LlamaCloud, GPT-OSS-120B & GPT-OSS-20B, Document Agents


LlamaIndex ▷ #general (7 messages):

Tracing OpenAI Embedding API calls in LlamaIndex, LlamaExtract P&ID Example, LlamaExtract with Graphs Challenges, Graphiti with LlamaIndex for Knowledge Graph Apps


Manus.im Discord ▷ #general (10 messagesđŸ”„):

Manus is dead, Sub agents as a solution, TradingView Premium, Flutter app guide


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (8 messagesđŸ”„):

Syllabus Changes, LLM Agents vs Advanced LLM Agents, Speaker Differences


Torchtune ▷ #papers (4 messages):

Public Server, Sharing information


tinygrad (George Hotz) ▷ #general (2 messages):

TinyPilot, Codebase Work, Image Analysis


Cohere ▷ #👋-introduce-yourself (2 messages):

AI Voice Agents, GPT-Powered Chatbots, RAG Implementation, Freelance AI Engineer


MLOps @Chipro ▷ #general-ml (1 messages):

Search logs and click-through data, Ranker fine-tuning, Cost implications of data usage