Frozen AI News archive

ChatGPT Agent: new o* model + unified Deep Research browser + Operator computer use + Code Interpreter terminal

**OpenAI** launched the **ChatGPT Agent**, a new advanced AI system capable of browsing the web, coding, analyzing data, and creating reports, marking a significant step towards human-like computer use. Its underlying model, distinct from and superior to **o3**, is considered the first public exposure of what would have been called **o4**, now being merged into **GPTNext**. It features end-to-end reinforcement learning, can operate for extended periods (tested up to 2 hours), and is classified as "High" risk for biological misuse, with safeguards activated. Early benchmarks show mixed results, excelling in some tests like **WebArena** and **BrowseComp** but underperforming on others like **PaperBench**. Key figures involved include **Sam Altman**, **Greg Brockman**, and **Kevin Weil**, with technical insights from **xikun_zhang_** and risk commentary from **KerenGu** and **boazbaraktcs**. The launch sparked speculation that the new model was **GPT-5**, which was confirmed not to be the case.

ChatGPT is all you need.

AI News for 7/16/2025-7/17/2025. We checked 9 subreddits, 449 Twitters and 29 Discords (226 channels, and 9565 messages) for you. Estimated reading time saved (at 200wpm): 703 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

In a very well received, classic OpenAI-style 10am PT livestream, Sama and team launched "ChatGPT agent" with a meme-worthy opener (still wasn't the top meme of today):

The blogpost, system card, system prompt, and the Wired and Every coverage have focused on making slides, spreadsheets, research, and customizability (including scheduled agents). The HLE and FrontierMath benchmarks are of course great, but:

  1. we shouldn't let benchmark fatigue distract from how quickly models and agents are running up these extraordinarily difficult, already superhuman tests,
  2. most people are missing that "the model" referred to in the blogpost is a distinct new model separate from and better than o3 if you look carefully at the labels:

Similar to how Deep Research was the first product to publicly expose the full o3 anywhere, ChatGPT Agent seems to be the first product to publicly expose what would have been called o4, but is now being merged into GPTNext.


AI Twitter Recap

OpenAI ChatGPT Agent Launch

Model Releases, Performance & Benchmarks

AI Tooling, Frameworks & Infrastructure

AI Research, Papers & New Techniques

Companies, Ecosystem & Geopolitics

Humor/Memes


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. Kimi K2 Model Leaderboard Rankings and OpenAI Comparison

2. Mistral Le Chat Feature Announcements and Improvements

3. LocalLlama Community Growth and Milestones

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. OpenAI ChatGPT Agent Release, Features, and Risk Discourse

2. Benchmarks & New Model Performance: ChatGPT Agent, Gemini, and Video/Editing Releases

3. Cultural and Existential AI Debates (Creativity, AGI, AI Impact Memes)


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.5 Pro Exp

Theme 1. The Agent Awakens: OpenAI's ChatGPT Agent Enters the Arena

Theme 2. The Business of AI: Valuations, Acquisitions, and Shutdowns

Theme 3. New Models & Major Updates Shake the Landscape

Theme 4. Under the Hood: The Nitty-Gritty of Model Optimization

Theme 5. Developer Ecosystem: New Tools and Community Tensions


Discord: High level Discord summaries

Perplexity AI Discord


OpenAI Discord


Unsloth AI (Daniel Han) Discord


Cursor Community Discord


LMArena Discord


Latent Space Discord


OpenRouter (Alex Atallah) Discord


Eleuther Discord


LM Studio Discord


HuggingFace Discord


GPU MODE Discord


Modular (Mojo 🔥) Discord


Yannick Kilcher Discord


Manus.im Discord Discord


MCP (Glama) Discord


tinygrad (George Hotz) Discord


Nous Research AI Discord


Notebook LM Discord


LLM Agents (Berkeley MOOC) Discord


Cohere Discord


LlamaIndex Discord


DSPy Discord


Codeium (Windsurf) Discord


MLOps @Chipro Discord


Nomic.ai (GPT4All) Discord


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.



Discord: Detailed by-Channel summaries and links

Perplexity AI ▷ #general (1283 messages🔥🔥🔥):

Airtel Free Perplexity Pro, Perplexity Pro India, Comet Browser invite, New perplexity page, Ai waifus


Perplexity AI ▷ #sharing (2 messages):

CachyOS, Iron Rails and Ideals: Mao Zedong


Perplexity AI ▷ #pplx-api (5 messages):

Perplexity Pro, API access, Sonar models, Prompting, JSON output


OpenAI ▷ #annnouncements (3 messages):

ChatGPT Agent, Deep Research, Operator


OpenAI ▷ #ai-discussions (1172 messages🔥🔥🔥):

Grok app, Chat GPT for desktop, AI overlords, OpenAI's Agent/Operator, Mensa IQ Test


OpenAI ▷ #gpt-4-discussions (4 messages):

GPT Agents, ChatGPT website, LLM models


OpenAI ▷ #prompt-engineering (3 messages):

Reproducibility Elements, Prompt Templates, Model Interfaces and Calls, Tasks and Inputs, Evaluation Metrics


OpenAI ▷ #api-discussions (3 messages):

Reproducibility, Missing Reproducibility Elements, Prompt Templates, Model Interfaces and Calls, Tasks and Inputs


Unsloth AI (Daniel Han) ▷ #general (549 messages🔥🔥🔥):

Model performance within same family vs different families, Kimi model 1.8 bit usability, Swapping model architectures, Fine-tuning LLMs for educational purposes, ERNIE 4.5 MoE models support in llama.cpp


Unsloth AI (Daniel Han) ▷ #off-topic (2 messages):

Small Language Models, Low Compute Power Systems, Data Collection and Processing Jobs, Low Power Distributed Computing


Unsloth AI (Daniel Han) ▷ #help (228 messages🔥🔥):

Blackwell RTX 50 series and xformers, Qwen3-4B-Base training, Smartest model for 15GB VRAM, Unsloth optimizations on big VRAM GPUs, GGUF conversion logic rework


Unsloth AI (Daniel Han) ▷ #showcase (2 messages):

Unsloth fine-tuning, Osmosis-AI models, Model Accuracy on Benchmarks


Unsloth AI (Daniel Han) ▷ #research (6 messages):

LLM Hallucinations, Apple Intelligence, Sycophancy Impact


Unsloth AI (Daniel Han) ▷ #unsloth-bot (20 messages🔥):

Logprobs for tokens, Dataset preparation for Qwen3, Automatic early stopping in Unsloth


Cursor Community ▷ #general (568 messages🔥🔥🔥):

Cursor Pricing, MCP & Claude integration, Agent stuck, KIRO, Auto Model details


Cursor Community ▷ #background-agents (8 messages🔥):

Dockerfile NVM_DIR Issue, Agent stuck in Opening Remote state, Environment not rebuilding


LMArena ▷ #general (559 messages🔥🔥🔥):

DeepSeek Margin, OpenAI Browser Speculation, Kimi K2 coding, OpenAI Image editor API, GPT-5 Hype


Latent Space ▷ #ai-general-chat (195 messages🔥🔥):

ChatGPT Agent, Perplexity's Valuation, Mistral Le Chat, FAL Series C, Real-Time Diffusion Video


Latent Space ▷ #ai-announcements (1 message):

YouTube Video Announcement


Latent Space ▷ #ai-in-action-club (96 messages🔥🔥):

ChatGPT Agent Launch, Benchmarks, Safety Concerns - Biohazards, Bespoke Operator-Mode Training, BBQ Evaluation


OpenRouter (Alex Atallah) ▷ #app-showcase (7 messages):

Kimi K2, GROQ, OpenRouter, Email Builder, FlowDown


OpenRouter (Alex Atallah) ▷ #general (258 messages🔥🔥):

Claude 4 Opus pricing and usage, GPTs Agents Learning, Free Models, Janitor AI and 401 errors, Chutes Free Tier Limits


OpenRouter (Alex Atallah) ▷ #discussion (11 messages🔥):

OpenRouter models in Cursor, Kluster.ai shuts down, AI inference services shutting down


Eleuther ▷ #general (47 messages🔥):

Research Management, ML Paper Writing Advice, Finding Research Mentors, Smallest Benchmark Datasets for LLMs, SOAR Program


Eleuther ▷ #research (79 messages🔥🔥):

latent space initialization for experts, ETHOS model updates, PEER paper discussion, Weight decay perturbation, MLA but for MOE


Eleuther ▷ #interpretability-general (3 messages):

SAE model data discrepancies, nnterp package beta release, Transformer models unified interface, Robust testing system for models, Model validation tests for hooks


Eleuther ▷ #lm-thunderdome (4 messages):

Harness Reproducibility, Dynamic IFEval Suite, bfloat16


Eleuther ▷ #gpt-neox-dev (20 messages🔥):

Transformer Engine setup issues, RoPE_Pct in gpt-neox, Slurm runner in DeeperSpeed, Containerized setup for gpt-neox


LM Studio ▷ #general (78 messages🔥🔥):

Speculative Decoding speed boost, Local Gemma threatening users, LM Studio Open Network Server setup, EOS token definition, MoE Model analysis


LM Studio ▷ #hardware-discussion (68 messages🔥🔥):

LM Studio multi CPU support, AMD Ryzen 9 8945H, 3090 vs 3080Ti Price, NPU use case


HuggingFace ▷ #general (66 messages🔥🔥):

HF repo PR watching, SmolVLM2 blogpost scam, Dataset-viewer API modality, Gender swapping AI, CAD-Editor model released


HuggingFace ▷ #today-im-learning (1 message):

Model Training, 1.5 bit research


HuggingFace ▷ #cool-finds (2 messages):

GPUHammer exploit, LLM Hallucination


HuggingFace ▷ #i-made-this (4 messages):

LunarisCodex LLM, GitChameleon eval benchmark for LLMs, SuccubusBot Text Coherence Model, Flame Audio AI toolkit


HuggingFace ▷ #computer-vision (2 messages):

SmolDocLing finetuning issues, Symmetry-agnostic image similarity models


HuggingFace ▷ #agents-course (2 messages):

HuggingFace Inference API, LLMs Deployed via HF Inference


GPU MODE ▷ #general (12 messages🔥):

shfl_down_sync, reduction intrinsics, warp reduce functions, kernel optimization


GPU MODE ▷ #triton (9 messages🔥):

Triton Autodiff, sm120 GPUs for fp4 ops, tl.constexpr_function decorator, einops package for triton


GPU MODE ▷ #torch (2 messages):

Inductor problems, Blackwell GPU issues


GPU MODE ▷ #algorithms (1 message):

kszysiu2137: Quad tree maybe


GPU MODE ▷ #cool-links (3 messages):

NVIDIA CUDA Kernel Fusion in Python, AMD's response to CUDA, Triton as an alternative to CUDA


GPU MODE ▷ #jobs (1 message):

Storage Engineer, Remote Job


GPU MODE ▷ #beginner (3 messages):

vast.ai, GPU programming opportunities, CUDA speedup, Bioinformatics


GPU MODE ▷ #rocm (1 message):

Compiler behavior, Builtins, asm volatile, llvm.amdgcn.raw.buffer.store.i128


GPU MODE ▷ #submissions (1 message):

A100 Speed


GPU MODE ▷ #hardware (6 messages):

Coreweave GB300 NVL72 Availability, Nvidia Hardware Prioritization, DGX vs HGX, B200 Availability & Liquid Cooling, Voltage Park Solutions Engineer


GPU MODE ▷ #factorio-learning-env (3 messages):

MCTS gym_env integration, Factory rollouts, Visual encoder


GPU MODE ▷ #cutlass (7 messages):

Jetson Orin, Jetson Thor, CuteDSL, tv_layout swaps


GPU MODE ▷ #singularity-systems (2 messages):

Scheduling


Modular (Mojo 🔥) ▷ #general (2 messages):

Greetings


Modular (Mojo 🔥) ▷ #mojo (21 messages🔥):

parameter functions and closures, Q3 Roadmap: Unified @parameter and runtime closures, __copyinit__ for escaping values, DynStringable, merge various known origins


Modular (Mojo 🔥) ▷ #max (18 messages🔥):

PyTorch Custom Ops with MAX Graph, Benchmarking Issues with Max-24.6, CUDA OOM Errors, LTS Release Support


Yannick Kilcher ▷ #general (29 messages🔥):

Zuckerberg AI Talent Acquisition, Chicken Tender Inflation, OpenAI benchmark comparisons, Grok 4 HLE score


Yannick Kilcher ▷ #paper-discussion (2 messages):


Yannick Kilcher ▷ #ml-news (5 messages):

Gaussian Splatting, General Analysis iMessage Stripe Exploit


Manus.im Discord ▷ #general (22 messages🔥):

Manus Alternatives, Manus chat down?, File Zipping Advice, Custom Data Sources in Manus


MCP (Glama) ▷ #general (18 messages🔥):

Anthropic Payment Issues, Domain Name Checking MCP Server, Needle MCP Server Introduction, OAuth vs API Keys for MCPs, Brave's Official MCP Server


MCP (Glama) ▷ #showcase (3 messages):

Vibe Coding Survey, Adaptive RAG MCP Server, Generator Checkpoint, Microsoft NextCoder


tinygrad (George Hotz) ▷ #general (2 messages):

ShapeTracker parameter to ASSIGN UOp


tinygrad (George Hotz) ▷ #learn-tinygrad (18 messages🔥):

tinygrad documentation for beginners, NVIDIA GPU driver issues with tinygrad and WSL2, Muon optimizer in tinygrad, Switching from WSL2 to native Ubuntu


Nous Research AI ▷ #announcements (1 message):

Atropos, RL Environments Framework


Nous Research AI ▷ #general (18 messages🔥):

Proto-agentic XML tag adherence, Hermes Documentation, Open Source Models vs US Models, Ethical Considerations in AI, Learning ML


Nous Research AI ▷ #ask-about-llms (1 message):

Model Context Size, Letta Personas, Model Evaluation


Notebook LM ▷ #use-cases (4 messages):

uBlock browser extension, notepad.exe, NotebookLM folders/subfolders


Notebook LM ▷ #general (14 messages🔥):

Service Unavailable Error, NotebookLM Use Cases, Textbook Integration with NotebookLM, NotebookLM Enterprise & GCP Integration


LLM Agents (Berkeley MOOC) ▷ #hackathon-announcements (1 message):

Agentic AI Summit 2025, LLM Agents MOOC, UC Berkeley, Khosla Ventures, Nvidia


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (8 messages🔥):

Fall Semester Updates, Certificate Declaration Form, Berkeley RDI Newsletter


Cohere ▷ #🧵-general-thread (1 message):

sma.bari.shafin: btw, how will we get the certificates of the Community Summer School?


Cohere ▷ #👋-introduce-yourself (4 messages):

DNNs for Time Series, ML in Data Science Education, ML for Real-World Problems, Interests in ML Domains


LlamaIndex ▷ #blog (2 messages):

Human-in-the-loop agents, LlamaParse one-click table extraction


LlamaIndex ▷ #general (1 message):

beastx2: <@334536717648265216> heyy


DSPy ▷ #general (3 messages):

DSPy creative applications, Lean 4 verification, Story generation, Roleplay prompt optimization


Codeium (Windsurf) ▷ #announcements (2 messages):

Claude Sonnet 4, Discounted Credit Rate, Windsurf Wave 11, Acquisition by Cognition, Voice Mode


MLOps @Chipro ▷ #events (1 message):

AI-Native Data Infrastructure, Task-Specific Data Discovery, Secure Autonomous Access, Production-Scale Performance


Nomic.ai (GPT4All) ▷ #general (1 message):

Web3 and AI, AI agents and multi-agent systems, Automation workflows, NLP apps and chatbots, Voice & speech integration