Frozen AI News archive

GPT-5.2 (Instant/Thinking/Pro): 74% on GDPVal, 1.4x cost of GPT 5.1, on 10 Year OpenAI Anniversary

**OpenAI** celebrates its 10-year anniversary with the launch of **GPT-5.2**, featuring significant across-the-board improvements alongside a rare 40% price increase. GPT-5.2 shows strong performance gains in scientific reasoning, knowledge work, and economic value tasks, achieving over **70.9%** human expert parity on **GDPval** tasks and reaching **90.5%** on ARC-AGI-1 with a large efficiency gain. Despite some mixed results in coding benchmarks and vision capabilities, GPT-5.2 is well received as a major update with extended context and tiered reasoning controls. Pricing is set at **$1.75/M input** and **$14/M output** tokens with a 90% cache discount. The update is live in ChatGPT and the API, marking a significant milestone for OpenAI's LLM development.
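To make the quoted pricing concrete, here is a minimal sketch of a per-request cost estimate. The two per-million-token prices and the 90% figure come from this issue; the function name `estimate_cost` and its parameters are hypothetical, and the sketch assumes the cache discount applies only to cached input tokens, which this issue does not spell out.

```python
# Prices quoted in this issue, in USD per million tokens.
INPUT_PRICE_PER_M = 1.75
OUTPUT_PRICE_PER_M = 14.00
# Assumption: the 90% cache discount applies to cached input tokens only.
CACHE_DISCOUNT = 0.90


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Return an estimated request cost in USD (hypothetical helper)."""
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")

    uncached_tokens = input_tokens - cached_input_tokens
    cached_price_per_m = INPUT_PRICE_PER_M * (1 - CACHE_DISCOUNT)

    # Sum the three components, then convert from per-million pricing.
    total = (
        uncached_tokens * INPUT_PRICE_PER_M
        + cached_input_tokens * cached_price_per_m
        + output_tokens * OUTPUT_PRICE_PER_M
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Example: 100k input tokens (80k of them cached) and 5k output tokens.
    cost = estimate_cost(100_000, 5_000, cached_input_tokens=80_000)
    print(f"Estimated cost: ${cost:.4f}")
```

Under these assumptions, output tokens dominate the bill for short prompts, while the cache discount matters most for long, repeated contexts.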


OpenAI is all you need.

AI News for 12/10/2025-12/11/2025. We checked 12 subreddits, 544 Twitters and 24 Discords (205 channels, and 8080 messages) for you. Estimated reading time saved (at 200wpm): 592 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

Today is the 10-year anniversary of OpenAI, and the company celebrated by launching a well-received update, GPT 5.2 (blog, docs, system card). Although it comes with a very rare 40% price increase, it is an across-the-board, sometimes very large, improvement:

Performance comparison of GPT-5.2 Thinking across various benchmarks and tasks, showing significant improvements in scientific reasoning and knowledge work.

We have been complimentary of GDPval before, and the jump to 74.1% on economically valuable tasks is notable: "GPT‑5.2 Thinking produced outputs for GDPval tasks at >11x the speed and <1% the cost of expert professionals, suggesting that when paired with human oversight, GPT‑5.2 can help with professional work."

A bar graph comparing GPT model performance on knowledge work tasks, showing win rates for different GPT versions against expert-level performance.

The new xhigh param in last month's 5.1 Codex Max struggled on SWE-Bench Pro (vs SWE-Bench Verified, as reported in its own blogpost), and now 5.2 Thinking xhigh works again.

Performance comparison of GPT-5.2 models on software engineering benchmarks, showing accuracy improvements across different model variants and output token ranges.

Long-context utilization is another highlight, with many noticing the MRCR improvement:

A line graph comparing GPT-5.2 and GPT-5.1 thinking performance across different input token lengths.

Not everything is perfect: it still gets the number of R's in strawberry wrong; although it makes pretty spreadsheets, the numbers do not pass a simple sanity check; and even the touted vision improvement is acknowledged to be imperfect and surpassed by Gemini 3.

Overall, still a very good reception to probably the last big American LLM update of the year.


AI Twitter Recap

OpenAI’s GPT‑5.2 release: capability, evals, pricing, and integrations

Google’s Interactions API and Gemini Deep Research agent

Agents on devices and developer UX

Search/RAG and inference infra

Quantitative guidance for multi‑agent systems

Ecosystem moves: media, research, hiring

Top tweets (by engagement)


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. Model Context Window Enhancements

2. Live Model Switching in llama.cpp

3. Meta's AI Strategy Satire

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. GPT-5.2 Performance and Criticism

2. AI Model Bugs and Quirks

3. AI Industry Developments and Investments


AI Discord Recap

A summary of Summaries of Summaries by gpt-5.2

1. GPT-5.2 Launch: Benchmarks vs Reality

2. Dev Tooling UX: IDE Agents, MCPs, and Reliability

3. Training & Efficiency: Unsloth Packing, LoRA Reality, and Cheap GPUs

4. Infra & Kernel Land: CUDA 13, ROCm SymMem, and Microsecond Bragging Rights

5. Open Ecosystem Demos & Eval Gotchas: WebGPU Voice, ASR, and Harness Limits


Discord: High level Discord summaries

LMArena Discord


Cursor Community Discord


Perplexity AI Discord


Unsloth AI (Daniel Han) Discord


BASI Jailbreaking Discord


OpenAI Discord


OpenRouter Discord


LM Studio Discord


Eleuther Discord


Nous Research AI Discord


GPU MODE Discord


HuggingFace Discord


Yannick Kilcher Discord


Latent Space Discord


Moonshot AI (Kimi K-2) Discord


Manus.im Discord Discord


tinygrad (George Hotz) Discord


aider (Paul Gauthier) Discord


MCP Contributors (Official) Discord


DSPy Discord


MLOps @Chipro Discord


Windsurf Discord


The Modular (Mojo 🔥) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.




Discord: Detailed by-Channel summaries and links

LMArena ▷ #general (1414 messages🔥🔥🔥):

GPT 5.2 High vs Gemini 3 Pro, GPT-5.2 launch, MovementLabs custom chip, OAI & Disney Partnership, Extra High Model is expensive


LMArena ▷ #announcements (2 messages):

November Code Arena Contest, GPT-5.2, GPT-5.2-high, WebDev Leaderboard


Cursor Community ▷ #general (1018 messages🔥🔥🔥):

Context Compaction and Rewinding, Cursor Re-indexing, Student Account Verification, Deepseek Integration, GPT-5.2 Discussion


Perplexity AI ▷ #announcements (1 messages):

GPT-5.2


Perplexity AI ▷ #general (1070 messages🔥🔥🔥):

Grok 4.20, Perplexity rate limits, GPT 5.2 release and performance, Comet agent limitations, Max plan value


Perplexity AI ▷ #sharing (1 messages):

Substack Notes Sharing, AI models, Fundraising


Perplexity AI ▷ #pplx-api (2 messages):

API Usage, Labs Testing, Online Availability


Unsloth AI (Daniel Han) ▷ #general (449 messages🔥🔥🔥):

GPU requirements for training, Fine-tuning vs Prompting, Analyzing TEDx talks, Unsloth's New Packing Release


Unsloth AI (Daniel Han) ▷ #introduce-yourself (2 messages):

TinyLLMs, MLOPs, Orchestration, Fine-tuning LLMs, Pocketflow


Unsloth AI (Daniel Han) ▷ #off-topic (675 messages🔥🔥🔥):

Data annotation, The AI watermark, DPO data batch size


Unsloth AI (Daniel Han) ▷ #help (22 messages🔥):

LoRA rank effect on LLM performance, Unsloth Transformers v5 support, UnslothGRPOTrainer processor calls, Unsloth dependency conflict resolution, Unsloth multi-GPU training error


Unsloth AI (Daniel Han) ▷ #showcase (7 messages):

Unsloth Embedding Models, PR for Embedding Model Integration, Blogpost Collaboration


Unsloth AI (Daniel Han) ▷ #research (4 messages):

OpenAI new paper, GPT 5.2 release


BASI Jailbreaking ▷ #general (526 messages🔥🔥🔥):

Grok Censorship, Local NSFW Models, Protecting Books from AI Copies, CIRIS Agent Jailbreak, GPT 5.2 Jailbreak


BASI Jailbreaking ▷ #jailbreaking (95 messages🔥🔥):

Azure OpenAI GPT-4o jailbreak, Gemini Pro jailbreak, Deepseek jailbreak prompts, GPT jailbreaks and stability issues, 4chan rulez


BASI Jailbreaking ▷ #redteaming (4 messages):

Introductions, Discord Channel Activity


OpenAI ▷ #annnouncements (3 messages):

Cybersecurity AI, GPT-5.2


OpenAI ▷ #ai-discussions (458 messages🔥🔥🔥):

Mac Studio RAM, character.ai, Sora 2 Pro Plan, AI Weekly Meetings, GPT 5.2 release


OpenAI ▷ #gpt-4-discussions (14 messages🔥):

GPT-5.2, Sora 2 Pro, GitHub Copilot Tool Call Support, GPT-OSS models, Gemini 3 Pro


OpenAI ▷ #prompt-engineering (7 messages):

Prompt Engineering Framework, Rubric Refactoring, Industrial Revolution vs. Neomodernist City, LLM Prompt Structuring


OpenAI ▷ #api-discussions (7 messages):

Prompt Engineering Framework, Rubric Refactoring, Prompt Lessons with LLM


OpenRouter ▷ #announcements (1 messages):

GPT-5.2, Tool Calling, Coding Agents, Long Context Performance, OpenRouter Credits


OpenRouter ▷ #app-showcase (1 messages):

llumen, Deep Research Mode, Image Generation, Cross-Tab Syncing


OpenRouter ▷ #general (410 messages🔥🔥🔥):

DeepSeek caching vs Grok, Qwen models, Gemini 3 Flash, Chutes provider, GPT 5.2 released


OpenRouter ▷ #new-models (3 messages):



OpenRouter ▷ #discussion (49 messages🔥):

GPT-5.2, Robin Model, Garlic model, Mistral new model, Openrouter integration


LM Studio ▷ #general (264 messages🔥🔥):

Chinese LLM download, LM Studio performance, 5090 vs 4070 Ti, Qwen3 coder, Deepseek r2


LM Studio ▷ #hardware-discussion (149 messages🔥🔥):

VL-4B Performance, LFM 8B A1B, Zen 6, Laptop LLM, GPT-OSS


Eleuther ▷ #general (31 messages🔥):

EleutherAI's track record, OLMo-1 model differences, Log and Exp Activation Functions, Synthema and dynamic concepts


Eleuther ▷ #research (183 messages🔥🔥):

ARC-AGI Project, gzip llm, sandwich norms, diffusion models, CFG in LLMs


Eleuther ▷ #lm-thunderdome (1 messages):

HuggingFace Processor, Tokenizer Max Length, gemma3-12b Evaluation


Nous Research AI ▷ #general (93 messages🔥🔥):

405b models, Hugging Face, Unsloth speedup, Hetzner GPU server, GPT 5.2 Release?


Nous Research AI ▷ #ask-about-llms (10 messages🔥):

Nous Nomos and IMO, AI vs Internet Impact, AI Hype and Reality, MoE Urban Legends


GPU MODE ▷ #general (5 messages):

GPU sorting algorithms, Parallel Merge Sort, Sample Sort, Bitonic Sort, Boolean sorting


GPU MODE ▷ #triton-gluon (1 messages):

CUDA 13, Torch, vllm


GPU MODE ▷ #cuda (1 messages):

neurondeep: ive moved that internally


GPU MODE ▷ #torch (3 messages):

torch + cuda 12.9, RTX PRO 6000, pytorch docker images, torch unique_consecutive


GPU MODE ▷ #jobs (2 messages):

Performance Engineer Hiring, High Compensation


GPU MODE ▷ #torchao (1 messages):

walrus_23: Made a little documentation update PR: https://github.com/pytorch/ao/pull/3480


GPU MODE ▷ #rocm (15 messages🔥):

AMD GPU P2P Copies, ROCm Iris, Symmetric Memory, Finegrained Memory, Torch Sym Mem


GPU MODE ▷ #self-promotion (2 messages):

Per Layer Quantization Benchmarks, 4Bit-Forge Project, Building Autonomous AI Agents with Claude Agent SDK


GPU MODE ▷ #gpu模式 (1 messages):

Triton.jit, Flash attention kernel, keyword argument step in triton.jit


GPU MODE ▷ #submissions (18 messages🔥):

nvfp4_gemm leaderboard, NVIDIA performance, Personal best submissions


GPU MODE ▷ #multi-gpu (1 messages):

nsys dumps, collective launch skew, nccl-skew-analyzer


GPU MODE ▷ #helion (2 messages):

Random Number Generation Issue, Helion Issues


GPU MODE ▷ #nvidia-competition (2 messages):

Discord Bot Error, Benchmark Submissions


HuggingFace ▷ #general (43 messages🔥):

TTS models on Hugging Face, AI weekly meetings/conferences/talks, NVidia GeForce 5090 bug report, Lightweight vision transformer (ViT) models, Dataset Viewer error


HuggingFace ▷ #today-im-learning (2 messages):

Generative Models, RAG systems, GANs, VAEs, Transformers


HuggingFace ▷ #i-made-this (5 messages):

WebGPU AI Voice Chat, GLM-ASR Model, Lucy AI Companion App, Superintelligence: Distributed Relational Cognition


Yannick Kilcher ▷ #general (22 messages🔥):

AI spam, RL for learning efficiency, DL theory changes, AI CV spam


Yannick Kilcher ▷ #ml-news (7 messages):

Mistral Vibe, RealVideo by Neoneye, GPT-5.2, Polynoamial Tweet


Latent Space ▷ #ai-general-chat (19 messages🔥):

AI Weekly Meetings, Latent Space Resources, GPT-5 Age Verification, Sam Altman's cryptic Tweet


Latent Space ▷ #genmedia-creative-ai (4 messages):

anvishapai's Twitter Status, X-Ware.v0


Moonshot AI (Kimi K-2) ▷ #general-chat (17 messages🔥):

Qwen Code, Kimi Search Feature, Kimi K2 Free, Mistral subscription, Chinese Century


Manus.im Discord ▷ #general (7 messages):

Free Website for Videos, Manus AI Failure, Incorrect charge for a plan upgrade, WordPress plugin using Manus


tinygrad (George Hotz) ▷ #general (2 messages):

tinygrad AMD support, AMD AI Sphere, tinycorp drivers


tinygrad (George Hotz) ▷ #learn-tinygrad (3 messages):

swizzling for tensor core, amd_uop_matmul style


aider (Paul Gauthier) ▷ #general (4 messages):

Claude Sonnet 3.7 quality degradation, Edit difficulty with larger models


MCP Contributors (Official) ▷ #mcp-dev-summit (3 messages):

MCP Dev Summit, Linux Foundation


MCP Contributors (Official) ▷ #general (1 messages):

hilocalden: Definitely not me 😁 I am just sharing the announcement.


DSPy ▷ #general (3 messages):

DSPy and OpenAI, Custom Adapters, User/Assistant message exchanges


MLOps @Chipro ▷ #events (1 messages):

Diffusion Models, Transformers, Study Group, Free Intro Workshops, Flow Matching


MLOps @Chipro ▷ #general-ml (1 messages):

AI API Integration Platform, Model Aggregation, Developer Discounts


Windsurf ▷ #announcements (2 messages):

Windsurf 1.12.41 Release, Windsurf 1.12.160 Release, Windsurf MCP Management UI, Windsurf GitHub/GitLab MCP Fixes, Windsurf Diff Zones Improvements