
not much happened today

**vLLM 0.12.0** introduces DeepSeek support, GPU Model Runner V2, and quantization improvements with PyTorch 2.9.0 and CUDA 12.9. **NVIDIA** launches CUDA Tile IR and cuTile Python for advanced GPU tensor operations targeting Blackwell GPUs. **Hugging Face** releases Transformers v5 RC with an any-to-any multimodal pipeline supporting models like **Gemma3n** and **Qwen3-Omni**. On the agent side, **LangChain** adds content moderation and cost tracking, **Together AI** and **Meta AI** collaborate on RL for long-horizon workflows, and **SonarSource** integrates static analysis into AI codegen. Economic insights from **OpenRouter** highlight coding as a key AI application, with reasoning models surpassing 50% of usage and the market splitting between premium and open models. Additionally, **Kling Video 2.6** debuts native audio capabilities, and **Runway Gen-4.5**, **Qwen3-TTS**, and **Gemini 3 Pro** advance multimodality.

a quiet end to NeurIPS.

AI News for 12/4/2025-12/5/2025. We checked 12 subreddits, 544 Twitters and 24 Discords (205 channels and 10387 messages) for you. Estimated reading time saved (at 200wpm): 681 minutes. Our new website is now up with full metadata search and a beautiful vibe-coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

Enjoy the new AIE CODE videos rolling out all weekend!


AI Twitter Recap

Reasoning/coding models and inference infra: vLLM 0.12.0, NVIDIA CUDA Tile, Transformers v5, and agent ops

Kling 2.6 native audio, Runway Gen‑4.5, Qwen3‑TTS, and Gemini 3 Pro multimodality

Evals, leaderboards, and agent operations in the wild

Open models, datasets, and tooling

NeurIPS and community highlights

Top tweets (by engagement)

Image generation and editing: FLUX.2 [dev] and LongCat‑Image‑Edit


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. AI in Sports Analytics

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. AI Usage in Workplaces

2. Image Generation and Animation Tools

3. Humorous and Creative Illustrations


AI Discord Recap

A summary of Summaries of Summaries by gpt-5.1

1. Next‑Gen GPU Software: CUDA 13.1, cuTile, and Verified Sparse Attention

2. LLM Benchmarks, Usage Telemetry, and Emerging Model Contenders

3. Tool-Oriented and Cost‑Aware Agent Architectures

4. Hardware Shifts: From TinyCorp GPU Bricks to Legacy NVIDIA Obsolescence

5. Training, Quantization, and Small‑Model Alternatives


Discord: High level Discord summaries

BASI Jailbreaking Discord


LMArena Discord


Perplexity AI Discord


Cursor Community Discord


Unsloth AI (Daniel Han) Discord


LM Studio Discord


OpenAI Discord


OpenRouter Discord


Eleuther Discord


GPU MODE Discord


Latent Space Discord


Modular (Mojo 🔥) Discord


Yannic Kilcher Discord


HuggingFace Discord


Moonshot AI (Kimi K-2) Discord


MCP Contributors (Official) Discord


aider (Paul Gauthier) Discord


DSPy Discord


tinygrad (George Hotz) Discord


Manus.im Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Windsurf Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.



Discord: Detailed by-Channel summaries and links

BASI Jailbreaking ▷ #general (1290 messages🔥🔥🔥):

LMStudio Hub Presets, GPTs & zombies, YouTube Premium ad blockers, Modern Evolutionary Synthesis, Gemini Ethical Skills


BASI Jailbreaking ▷ #jailbreaking (274 messages🔥🔥):

Gemini 3 Pro Jailbreak, Nano Banana Pro jailbreak, DeepSeek Jailbreak, Claude Jailbreak Frustrations, GPT-5.1 Restrictions


BASI Jailbreaking ▷ #redteaming (6 messages):

ZapGPT2 Jailbreaking, SMTP Server Acquisition, Red Team Experience Post-Graduation, Open Source Red Teaming Tools


LMArena ▷ #general (1568 messages🔥🔥🔥):

Hollywood and AI Art, Sora's Limitations, Perchance Unrestricted, AI Generated Images and Realism, Gemini vs other LLMs


LMArena ▷ #announcements (2 messages):

New Models, Contest Reminder


Perplexity AI ▷ #general (1246 messages🔥🔥🔥):

Cloudflare Outage, Perplexity Pro Limits, Gemini Deep Research Comparison, O3 Pro Disappearance, Gemini x CR7 Feature


Perplexity AI ▷ #pplx-api (4 messages):

Rate Limit Increase for Search API


Cursor Community ▷ #general (998 messages🔥🔥🔥):

Sequoia OS, RAM Usage, Cursor performance degradation, GPT-5 Codex Max vs Opus 4.5, Cursor agent review


Unsloth AI (Daniel Han) ▷ #general (373 messages🔥🔥):

MacOS Docker bug, Gemini 3 Pro hot take, Nvidia's open source contributions, NYC Hackathons, Claude Pro's Language Prowess


Unsloth AI (Daniel Han) ▷ #introduce-yourself (1 message):

AI Product Development, Systematic problem solving, Tenant Law Service


Unsloth AI (Daniel Han) ▷ #off-topic (273 messages🔥🔥):

Human vs AI content, GPU and RAM prices, RP (Role Play) business, Gemini music discovery, monitor reviews


Unsloth AI (Daniel Han) ▷ #help (41 messages🔥):

Unsloth installation issues, WSL2 setup for Windows, Gradient Accumulation Speed Tradeoff, Ollama compatibility, GGUF Quantization and export scripts


Unsloth AI (Daniel Han) ▷ #research (2 messages):

arXiv endorsement, EleutherAI


LM Studio ▷ #general (166 messages🔥🔥):

AI finetuning AI, Speculative token trading, Gemini 3.0 promoting LM Studio, Qwen 3, Trainer Plugin


LM Studio ▷ #hardware-discussion (367 messages🔥🔥):

Triple GPU bugginess, Thermaltake AIO failures, Nvidia Driver support, MI50 quirks, Thunderbolt/USB PCIe adapters for GPUs


OpenAI ▷ #ai-discussions (319 messages🔥🔥):

Gemini 3 vs Opus 4.5 SWE-Bench, Google's Spending on Gemini 3, Gemini 3 Glazing, ChatGPT's Leanings, GPT-5.1 and bug finding


OpenAI ▷ #gpt-4-discussions (5 messages):

ChatGPT chat history, Cross-chat memory, Model awareness, Long chat management


OpenAI ▷ #prompt-engineering (7 messages):

AI ecosystem directionality, GPT-5.1 posture persistence, Gemini style stability, Isekai engine prompt


OpenAI ▷ #api-discussions (7 messages):

AI ecosystem directionality, Prompt engineering, Posture Persistence Experiment (GPT-5.1 vs Claude vs Gemini), Long-horizon style persistence, Gemini's style and posture


OpenRouter ▷ #announcements (3 messages):

State of AI report, LLM insights on OpenRouter, FLUX.2 chat with Robin Rombach


OpenRouter ▷ #general (238 messages🔥🔥):

Claude CODEX MAX vs OPUS, finish_reason null meaning, OpenAI API data deletion, Roleplay statistics on OpenRouter, Qwen 4B uptime issues


OpenRouter ▷ #new-models (1 message):

Readybot.io: OpenRouter - New Models


OpenRouter ▷ #discussion (54 messages🔥):

LLM-generated announcements, AI 'Charlie' confusion, Vitest recommendation, Image model comparisons, Chatroom unreliability


Eleuther ▷ #general (43 messages🔥):

Small LM Training, Benchmark Recommendations, HuggingFace LM Training Playbook, Ultra Small Google Model, LoRA with Regret


Eleuther ▷ #research (54 messages🔥):

Attention Sinks, Adam vs Signed Momentum, Gated Attention, synthetic dataset, neural race and generalization


Eleuther ▷ #interpretability-general (7 messages):

4D physics engine, General AI vs LLMs, Signal analysis approach to AI, Air gapped AI system, Feature manifolds in CNNs


GPU MODE ▷ #general (4 messages):

Async RL MLsys papers, Factorio learning environment, Paperclip maximization study


GPU MODE ▷ #cuda (17 messages🔥):

cuTile release, tileIR and PTX relationship, CUDA programming guide rewrite, cuTile's mxfp/nvfp support, TileIR vs Triton IR


GPU MODE ▷ #cool-links (7 messages):

Sparse Attention Adoption, VATTENTION: Verified Sparse Attention, CUDA-L2 performance


GPU MODE ▷ #pmpp-book (6 messages):

PMPP Book, Wen-mei autograph, GTC next year, CUDA reading


GPU MODE ▷ #rocm (4 messages):

Strix Halo laptop, RDNA 3.5 vs RDNA 4, CDNA 4 architecture, HIPKittens kernels


GPU MODE ▷ #metal (1 message):

smexy3: Which inference framework is best if you want to use multiple connected Mac Studios?


GPU MODE ▷ #self-promotion (4 messages):

Quantization of Large Language Models, MoE-Quant, GPTQ, CUDA 13.1, CUDA Tile


GPU MODE ▷ #submissions (11 messages🔥):

NVIDIA leaderboard updates, nvfp4_gemm performance improvements, vectoradd_v2 leaderboard entry


GPU MODE ▷ #factorio-learning-env (3 messages):

NeurIPS, LFG


GPU MODE ▷ #teenygrad (2 messages):

Peephole Optimization, Movement Opcodes, CUDA/HIP Runtimes, LazyBuffer


GPU MODE ▷ #general (2 messages):

Kernel Development, achievement


GPU MODE ▷ #nvidia-competition (13 messages🔥):

RL Cheating, Blackwell GPU Access, Modal for Development, Subprocess Communication with Shared Memory, B200 GPU for Benchmarks


Latent Space ▷ #ai-general-chat (73 messages🔥🔥):

SQL Injections by Claude, Vibe Coding, Tanstack AI, Limitless acquired by Meta, Qwen 1.5-110B MoE Parity


Modular (Mojo 🔥) ▷ #general (1 message):

inaarawalji_23: going live today 🙂


Modular (Mojo 🔥) ▷ #announcements (2 messages):

MAX Framework, Model API Updates, Modular Meetup


Modular (Mojo 🔥) ▷ #mojo (41 messages🔥):

Gemini 3 Mojo Understanding, Mojo stdlib Proposal, Mojo GPU Setup, Mojo Lifetimes Bug, Mojo Open Source Release


Yannic Kilcher ▷ #general (31 messages🔥):

DeepSeek Technical, Linear Control Theory, AI Competition Catastrophe, Robustness vs Performance in Control, Unknown Dynamics in Robotics


Yannic Kilcher ▷ #ml-news (9 messages🔥):

Bezos AI company, private video, arcprize withdrawn


HuggingFace ▷ #general (24 messages🔥):

DeepSeek v3.2 Transformers Implementation, Z Image Censorship, Hugging Face Space CPU Quota, Small LLM for Roblox, AI-Generated Music YouTube Channel


HuggingFace ▷ #cool-finds (1 message):

Inefficiency of Large Language Models, HRM and TRM models as alternatives, Compute Cost of LLMs, Environmental Impact of LLMs, Rising Costs Due to LLMs


HuggingFace ▷ #i-made-this (3 messages):

Anthropic Programmatic Tool Calling, Universal Programmatic Tool Calling, Model Agnostic Tool Orchestrator, Rhai Scripts for LLMs, Token Reduction in LLMs


HuggingFace ▷ #smol-course (1 message):

sky.moo: https://huggingface.co/blog/hf-skills-training


HuggingFace ▷ #agents-course (2 messages):

Agent Course Certificate


Moonshot AI (Kimi K-2) ▷ #general-chat (24 messages🔥):

Kimi for Coding Access, corporate policy reasons, LM Playground, 4x K2 turbo limit


MCP Contributors (Official) ▷ #general-wg (9 messages🔥):

MCP Tokens, Tokenization, tiktoken, Claude 3 tokenizer


aider (Paul Gauthier) ▷ #general (7 messages):

Ollama Timeout Errors, Claude Sonnet 4.5 Downgrade, Workflow Automation Engineer Introduction


aider (Paul Gauthier) ▷ #questions-and-tips (1 message):

aider on local LLMs, aider on Android, Cross-device coding with aider


DSPy ▷ #show-and-tell (1 message):

justanotheratom: https://x.com/realsanketp/status/1996978356227920345?s=20


DSPy ▷ #general (4 messages):

DSPy support to Claude agents, GRPO algorithm in DSPy, Multi-turn conversations


tinygrad (George Hotz) ▷ #general (4 messages):

FSDP in tinygrad bounty, USBGPU on Raspberry Pi, USB transactions


tinygrad (George Hotz) ▷ #learn-tinygrad (1 message):

struct.unpack GPU Implementation, tinygrad GPU Unpacking


Manus.im Discord ▷ #general (1 message):

Workflow Automation, LLM Integration, RAG Pipelines, AI Content Detection, Image AI