Frozen AI News archive

OpenAI fires back: GPT-5.1-Codex-Max (API) and GPT 5.1 Pro (ChatGPT)

**OpenAI** released **GPT-5.1-Codex-Max**, featuring compaction-native training, an "Extra High" reasoning mode, and claimed autonomous operation of over 24 hours, with significant performance gains on benchmarks like METR, CTF, and PaperBench. **Google's Gemini 3 Pro** demonstrates strong coding and reasoning capabilities, achieving new state-of-the-art results on SWE-bench Verified and WeirdML, with an estimated model size of 5 to 10 trillion parameters. The AI coding agent ecosystem is rapidly evolving with integrations and tooling improvements from multiple companies. **Sam Altman** highlighted the significant improvements in GPT-5.1-Codex-Max. The news also covers educational offerings like ChatGPT for Teachers and multi-agent workflows involving Gemini 3, GPT-5.1-Codex-Max, and Claude Sonnet 4.5.

I can't keep up anymore

AI News for 11/18/2025-11/19/2025. We checked 12 subreddits, 544 Twitters and 24 Discords (205 channels, and 11113 messages) for you. Estimated reading time saved (at 200wpm): 790 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

Ahead of AIE CODE tomorrow, the coding model refreshes are coming in strong and fast - OpenAI followed yesterday's Gemini 3 drop with an upgraded/updated GPT-5.1-Codex (to be fair, OpenAI did say that this release was preplanned, implying it is not a reaction to Gemini). The automated summary links below from GPT 5.1 are good enough so we aren't touching them, but we would highlight the updated METR Evals which show a HUGE jump in autonomy:

A graph showing the time-horizon of software engineering tasks and how long different AI models can complete 50% of those tasks across various release dates.

as well as extra performance under a new "xhigh" param...

A line graph showing the performance of GPT-5.1-Codex and GPT-5.1-Codex-Max


AI Twitter Recap

Google’s Gemini 3: model capability, safety, IDEs, and UI

OpenAI’s GPT‑5.1‑Codex‑Max and the coding‑agent arms race

Meta’s SAM 3 and SAM 3D

Agent platforms and enterprise adoption

Infra and open-source: MoE, retrieval, and embodied systems

Benchmarks and research to watch

Top tweets (by engagement)


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. Ollama Pricing and Open-Source Debate

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Google Gemini 3 Model Capabilities and Achievements

2. Humorous and Satirical Takes on AI Developments

3. ChatGPT Unusual Behaviors and User Experiences


AI Discord Recap

A summary of Summaries of Summaries by gpt-5.1

1. Gemini 3 And Frontier Models: Benchmarks, Coding, And Quirks

2. New GPU Kernels, Sparsity Tricks, And Communication Primitives

3. Inference, Fine-Tuning, And Evaluation: GPT‑OSS‑20B, Unsloth, And Determinism

4. AI Coding Tooling, IDEs, And Pricing Turbulence

5. New Vision And Agent Systems: SAM 3, Atropos+Tinker, Miles, Agentic Finance


Discord: High level Discord summaries

LMArena Discord


Perplexity AI Discord


BASI Jailbreaking Discord


Unsloth AI (Daniel Han) Discord


Cursor Community Discord


LM Studio Discord


OpenAI Discord


OpenRouter Discord


GPU MODE Discord


Modular (Mojo 🔥) Discord


Nous Research AI Discord


Eleuther Discord


Moonshot AI (Kimi K-2) Discord


HuggingFace Discord


Yannic Kilcher Discord


Latent Space Discord


DSPy Discord


aider (Paul Gauthier) Discord


tinygrad (George Hotz) Discord


Manus.im Discord


Windsurf Discord


MCP Contributors (Official) Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.



Discord: Detailed by-Channel summaries and links

LMArena ▷ #general (1210 messages🔥🔥🔥):

Gemini 3 vs Grok, Gemini 3 limitations, AGI timelines, Nano Banana Pro, Gemini 3 image generation


LMArena ▷ #announcements (1 message):

WebDev Arena Leaderboard, Cogito-v2.1 Model


Perplexity AI ▷ #announcements (1 message):

Perplexity Pro, Perplexity Max, build assets


Perplexity AI ▷ #general (1072 messages🔥🔥🔥):

Gemini 3 Pro, Comet issues, Payout issues, Video analysis, Data privacy


Perplexity AI ▷ #sharing (4 messages):

Virlo AI Case Study, Phonk Guide in French, Shareable Threads


Perplexity AI ▷ #pplx-api (4 messages):

API Billing, n8n Usage


BASI Jailbreaking ▷ #general (1154 messages🔥🔥🔥):

TylerDurdan710's Political Stance, Gemini 3 Pro, Snowden, AI Trading, Nvidia


BASI Jailbreaking ▷ #jailbreaking (504 messages🔥🔥🔥):

GPT Jailbreaking Prompts, Gemini 3.0 Jailbreak, GPT-5.1 and other AI models


BASI Jailbreaking ▷ #redteaming (23 messages🔥):

Bug Bounty Collaborations, Kernel Pseudo-Emulator Jailbreak for Local LLMs, AzureAI Chat Widget Testing, Security of AI Chat Functions


Unsloth AI (Daniel Han) ▷ #general (149 messages🔥🔥):

Google Colab in VS Code, GPT-OSS LoRA support in Unsloth, AWQ quantization, SGLang integration, Minimum VRAM for QLoRA training


Unsloth AI (Daniel Han) ▷ #introduce-yourself (1 message):

User Introductions, Channel Content Policy Clarification


Unsloth AI (Daniel Han) ▷ #off-topic (522 messages🔥🔥🔥):

LoRA, RMVPE, Gemini 3.0, Claude 4.5 Sonnet, Data Quality


Unsloth AI (Daniel Han) ▷ #help (121 messages🔥🔥):

Tool calling in fine-tuned models, Hugging Face model updates, Troubleshooting vLLM errors, Fine-tuning dataset order, Qwen3 VL and Qwen2 VL failing to output bounding boxes


Unsloth AI (Daniel Han) ▷ #research (5 messages):

Deterministic AI


Cursor Community ▷ #general (795 messages🔥🔥🔥):

Cursor Pricing Model, Antigravity IDE, Gemini 3 Pro Performance, Student Program with Cursor, Rollbacks


LM Studio ▷ #general (214 messages🔥🔥):

Web search plugin, Intel Arc A770 Vulkan issues, Portable LM Studio Install, Qwen3-VL-30B-Instruct-1m performance, AMD MI60 GPUs for Inference


LM Studio ▷ #hardware-discussion (296 messages🔥🔥):

GPU pricing, Dell GPU setup issues, Motherboard Build, Solar setup for PC, Vulkan runtime issues


OpenAI ▷ #annnouncements (1 message):

ChatGPT for Teachers, Free access until 2027, Admin controls for schools


OpenAI ▷ #ai-discussions (297 messages🔥🔥):

Gemini 3 Pro vs GPT-5.1, Gemini 3 and Content Filters, Grok Imagine, Assistants to Responses API migration, Potential for AI Mental Illness


OpenAI ▷ #gpt-4-discussions (5 messages):

GPT Photo Upload Errors, ZeroGPT flagging issues, Humanizers, Public GPTs


OpenAI ▷ #prompt-engineering (5 messages):

Migration of Assistants to Responses API, ChatGPT 5.1 Pro, EQ-Bench Creative Writing v3


OpenAI ▷ #api-discussions (5 messages):

Migration of Assistants to Responses API, ChatGPT 5.1 Pro, GPT-5 vs GPT-4.5


OpenRouter ▷ #app-showcase (3 messages):

Heavy AI Model Launch, GPU usage, Model Availability


OpenRouter ▷ #general (239 messages🔥🔥):

Gemini 3, Sherlock Think vs Alpha, Rate Limits Errors with Chutes, Gemini 3 for frontend vs backend, 3D mesh objects with LLMs


OpenRouter ▷ #discussion (11 messages🔥):

Gemini 3, Reasoning Details, OpenAI Max Models, Cogito 2.1, Batching Embeddings


GPU MODE ▷ #general (16 messages🔥):

WGPU, ML Compiler Resources, MLIR, Horace blog, Halide paper


GPU MODE ▷ #triton-gluon (3 messages):

GB200 bring-up, Zero-init vs random-init, Power throttling, Matrix Multiplications


GPU MODE ▷ #cuda (10 messages🔥):

Cute DSL and SM12x, Thor's TMem, Texture Memory benefits, DMMA/HMMA


GPU MODE ▷ #torch (2 messages):

SemiAnalysis post


GPU MODE ▷ #cool-links (1 message):

DGEMM Accuracy, Reduced Precision Tensor Cores, Ozaki Scheme


GPU MODE ▷ #jobs (1 message):

AI in Automotive, Internship Opportunity, Autonomous Tech, Car Safety Automation, Telematics Company


GPU MODE ▷ #beginner (8 messages🔥):

NVIDIA accelerated computing hub course, Thrust library, CCCL (CUDA C++ Core Libraries), Model inference and optimization, Open source repos


GPU MODE ▷ #irl-meetup (2 messages):

Toronto Meetup, TSFM Event


GPU MODE ▷ #self-promotion (23 messages🔥):

MACKO-SpMV, Unstructured Weight Sparsity, GEMV, Koyeb Sandboxes


GPU MODE ▷ #🍿 (4 messages):

TUI kernel submission, CLI feedback, Popcorn CLI naming


GPU MODE ▷ #thunderkittens (1 message):

TK Library, ipc/vmm


GPU MODE ▷ #edge (2 messages):

Qualcomm GPU vs NPU, Qualcomm Vulkan


GPU MODE ▷ #submissions (12 messages🔥):

NVIDIA leaderboard submissions, nvfp4_gemv benchmark


GPU MODE ▷ #cutlass (13 messages🔥):

Thread Value Layouts, CUTE DSL, Blackwell, fabs() function in CUTE DSL


GPU MODE ▷ #singularity-systems (3 messages):

Book Writing Process, Toronto Talk, sitp parts 1 and 2


GPU MODE ▷ #multi-gpu (1 message):

DMA Collectives, ML Communication Offloads, AMD Instinct MI300X GPUs, RCCL communication collectives library


GPU MODE ▷ #helion (13 messages🔥):

Triton bug, Helion workarounds, Helion support for FP8 BMM, Helion inline Triton function


GPU MODE ▷ #nvidia-competition (24 messages🔥):

GitHub Status Down, Gemini 3 Pro Blackwell Confusion, HTML to Markdown Conversion, NVIDIA Documentation Restrictions, PTX Documentation Conversion to Markdown


GPU MODE ▷ #robotics-vla (14 messages🔥):

ManiSkill Internals, Open Source VLA Training Lib, Automated Scene Variation, VLM-Controlled Rollouts, Teleop in Simulation


Modular (Mojo 🔥) ▷ #general (2 messages):

Mojo, MAX framework, PyTorch, Basalt framework


Modular (Mojo 🔥) ▷ #mojo (142 messages🔥🔥):

Arc safety in Mojo, Indirect Origins, Garbage Collection (GC) in Mojo, Custom allocators, Heterogeneous memory and multi-device


Modular (Mojo 🔥) ▷ #max (1 message):

Device Tracing, Perfetto Integration


Nous Research AI ▷ #announcements (1 message):

Atropos RL Environments, Thinking Machines' Tinker Training API


Nous Research AI ▷ #general (95 messages🔥🔥):

Google Antigravity, Gemini 3 single shot realtime raytracer, Open Source, China alternative Strategy, Agentic AI tools for financial traders, DeepMind Gemma 3


Nous Research AI ▷ #ask-about-llms (1 message):

bird0861: persona vector moment


Nous Research AI ▷ #interesting-links (5 messages):

Atropos Tinker, Gemini Training Recipe, Negativity Bias


Eleuther ▷ #general (6 messages):

RE-Bench, Mallas, Alignment, Confidential Computing, Monomorphic Encryption, ML Trojans, JEPAS, long form fictional content


Eleuther ▷ #research (64 messages🔥🔥):

VWN A, B matrices clarification, Linear attention comparisons, Q-learning with algebraic topology, Zeckendorf bit lattices, Inference-time epistemics layer & hallucination reduction


Eleuther ▷ #scaling-laws (3 messages):

Approximate KNN, SETH implications on KNN, Subexponential 3-SAT


Eleuther ▷ #interpretability-general (3 messages):

Sparse MoEs vs Dense Models, SAE based methods, Interpretability based interventions


Eleuther ▷ #lm-thunderdome (7 messages):

Instruction Following Benchmarks, Text to SQL Tasks


Moonshot AI (Kimi K-2) ▷ #general-chat (72 messages🔥🔥):

Gemini 3, DeepSeek V4, GLM 5, Kimi K2 Thinking, AI assisted dataset creation


HuggingFace ▷ #general (56 messages🔥🔥):

Gemini 3, KTOtrainer memory usage, Hugging Face billing issues, Industry standards like MCP, ReLU activation function


HuggingFace ▷ #i-made-this (3 messages):

Fine-tuning OpenAI's reasoning model, TruthAGI.ai launch, pg_ask PostgreSQL extension


HuggingFace ▷ #computer-vision (1 message):

Image Classifier, CNN vs. Vision Transformers, Multi-class and Multi-label image classification


HuggingFace ▷ #NLP (1 message):

NLP, Named Entity Recognition (NER), Multilingual Models, Transformer Models


HuggingFace ▷ #smol-course (2 messages):

Introduction to the smol-course channel


HuggingFace ▷ #agents-course (1 message):

wilecoyotte_77610: Is there a certification for the second unit of the fine-tuning course?


Yannic Kilcher ▷ #general (25 messages🔥):

Grok 4.1 Benchmarks, NeurIPS 2025 Meetup, Gemini 3 Speculation, AI CEO Benchmark, HuggingFace Xet Repository Setup


Yannic Kilcher ▷ #paper-discussion (8 messages🔥):

Yannic Kilcher videos on transformers, SAM 3: Segment Anything with Concepts


Yannic Kilcher ▷ #ml-news (23 messages🔥):

Gemini 3 Benchmarking, OpenAI Bailout, Math Review Dissatisfaction, Segment Anything Model 3, Cogito v2-1 Analysis


Latent Space ▷ #ai-general-chat (47 messages🔥):

Palantir Cost vs Customization, Cursor CLI vs Claude Code, SAM 3, GPT-5.1-Codex-Max, LMSYS Miles Enterprise RL Framework


DSPy ▷ #general (26 messages🔥):

LLMs are non-deterministic, GPT-OSS-20B determinism, DSPy in production community, Anthropic model on Azure via DSPy, DSPy for inference


aider (Paul Gauthier) ▷ #general (20 messages🔥):

Gemini 3 integration with Aider, Aider with Ollama, GPT-5.1 issues with Aider


tinygrad (George Hotz) ▷ #general (8 messages🔥):

Llama 1B benchmark in CI, CPU architecture considerations, Benchmarking torch.compile vs tinygrad, Kernel imports, CuTeDSL


tinygrad (George Hotz) ▷ #learn-tinygrad (2 messages):

tinygrad bug fix, lab troubleshooting


Manus.im Discord ▷ #general (7 messages):

Manus Credit System Changes, TiDB Cloud Access Issues, Gemini 3 Integration with Manus, AI Coding Education


Windsurf ▷ #announcements (2 messages):

Gemini 3 Pro, Windsurf, Software Releases


MCP Contributors (Official) ▷ #general (2 messages):

Image Attachments, Temporary Hiccups