Frozen AI News archive

not much happened today

**Moonshot AI's Kimi K2 Thinking** AMA revealed a hybrid attention stack using **KDA + NoPE MLA** outperforming full MLA + RoPE, with the **Muon optimizer** scaling to ~1T parameters and native **INT4** QAT for cost-efficient inference. K2 Thinking ranks highly on **LisanBench** and **LM Arena Text** leaderboards, offering low-cost INT4 serving and strong performance in Math, Coding, and Creative Writing. It supports heavy agentic tool use, with up to 300 tool requests per run, and Moonshot recommends the official API for reliable long-trace inference. **Meta AI** released the **Omnilingual ASR** suite covering 1600+ languages including 500 underserved, plus a 7B wav2vec 2.0 model and ASR corpus. Additionally, the **Gelato-30B-A3B** model for computer grounding in GUI manipulation agents outperforms larger VLMs, targeting immediate agent gains. Qwen's image-edit LoRAs and light-restoration app were also highlighted.

a quiet day

AI News for 11/7/2025-11/10/2025. We checked 12 subreddits, 544 Twitters and 23 Discords (201 channels, and 12566 messages) for you. Estimated reading time saved (at 200wpm): 1015 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

The Kimi K2 AMA is getting a lot of buzz.


AI Twitter Recap

Moonshot AI’s Kimi K2 Thinking: AMA takeaways, evals, INT4 design, and upcoming vision

Speech and Computer-Use Models: Meta’s Omnilingual ASR and Gelato-30B-A3B

Data and Pretraining: Synthetic data, curriculum, and eval design

Scaling Infra: GPUs, kernels, and giga‑scale data centers

Agents, auth, and evaluation tooling

Top tweets (by engagement)


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. Strix Halo Networking Performance Analysis

2. Qwen3-VL OCR Capabilities and Comparisons

3. BERT Chatbot with dLLM

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. China's AI Advancements and Rivalry

2. Humorous AI Critiques and Memes

3. AI in Politics and Economics


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.5 Pro Exp

The Kimi K2 Uprising and Anticipation for the Next Generation

Kernel Wizards and Hardware Hackers Push Performance Limits

Developer Platforms Suffer Death by a Thousand Cuts

Taming Model Quirks, From Censorship to Continual Learning

Open Source Projects Power Forward with New Tools and Workflows


Discord: High level Discord summaries

LMArena Discord


Perplexity AI Discord


LM Studio Discord


Cursor Community Discord


HuggingFace Discord


GPU MODE Discord


OpenRouter Discord


OpenAI Discord


Unsloth AI (Daniel Han) Discord


Nous Research AI Discord


Moonshot AI (Kimi K-2) Discord


Modular (Mojo 🔥) Discord


Yannick Kilcher Discord


Latent Space Discord


Eleuther Discord


tinygrad (George Hotz) Discord


DSPy Discord


aider (Paul Gauthier) Discord


MCP Contributors (Official) Discord


Manus.im Discord Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Windsurf Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


Discord: Detailed by-Channel summaries and links

LMArena ▷ #general (1385 messages🔥🔥🔥):

Sora 2 Pro, OpenAI Rules, Billionaire Thief, Gemini 3, Nano Banana 2


LMArena ▷ #announcements (3 messages):

Image Edit Leaderboard, Abstract Art Contest Winner, Text Leaderboard Update, Kimi-k2-thinking model


Perplexity AI ▷ #general (1132 messages🔥🔥🔥):

Comet Browser Issues, Ad Blocking on YouTube, Perplexity Referral Program Troubles, Context Window Limits, Perplexity Pro value


Perplexity AI ▷ #sharing (3 messages):

The Orbits Debut Single, Shareable Threads on Perplexity AI


Perplexity AI ▷ #pplx-api (5 messages):

Perplexity Pro credits, API key generation, Credits rollover


LM Studio ▷ #general (497 messages🔥🔥🔥):

Gemma caching, Qwen VL working, Writing-style LLM


LM Studio ▷ #hardware-discussion (662 messages🔥🔥🔥):

3090 Performance, GPU Cooling, Multi-GPU Setups, AMD vs Nvidia, LLM Performance


Cursor Community ▷ #general (892 messages🔥🔥🔥):

Sonnet 4.5 Pricing, Composer-1 Issues, Cursor Student Verification, OpenRouter Integration, Cursor Crashing


Cursor Community ▷ #background-agents (5 messages):

Auto model with cloud agents, Environment.json dependencies, Composer-1 suggestion


HuggingFace ▷ #general (714 messages🔥🔥🔥):

Ambidextrous AI minds, Reasoning Traces for LLMs, Language Compression with AI, Systems for AI, Hugging Face Spaces


HuggingFace ▷ #today-im-learning (1 messages):

Attention, Self-attention, Masked self-attention, Multi-head attention, Position encoding


HuggingFace ▷ #cool-finds (2 messages):

ComfyUI Workflows, Open Source Voice AI


HuggingFace ▷ #i-made-this (24 messages🔥):

MU/TH/ER demo update, Qwen 3 1.7b quant 4 fp16, Kokoro82M for TTS, FRAI on Product Hunt, Open source AI interface for Rust coding


HuggingFace ▷ #NLP (1 messages):

PII anonymisation, LLM agents, DataTune tool


HuggingFace ▷ #smol-course (1 messages):

smol-course, SFT unit, Markdown instructions


HuggingFace ▷ #agents-course (22 messages🔥):

Agents Course Prerequisites, Unit 4 Assessment Issues, API File Access Problems


GPU MODE ▷ #general (25 messages🔥):

group norm vs instance norm vs layer norm, Nvidia Promotion Codes, CUDA from scratch in python, POSITS for number system


GPU MODE ▷ #triton-gluon (14 messages🔥):

Triton and Gluon kernel writing, Autodiff for Triton kernels, Efficient backward kernels generation, Shared memory size in Triton


GPU MODE ▷ #cuda (12 messages🔥):

INT8xINT8 GEMM CUDA kernels, Nsight copilot crashing, MMA vs WGMA performance, ldmatrix performance, Ampere GEMM Tricks


GPU MODE ▷ #torch (19 messages🔥):

PyTorch Numerics, MPS environment variables, GPU acceleration


GPU MODE ▷ #cool-links (13 messages🔥):

Consumer Blackwell, Data-center Blackwell, Microbenchmarking, GB200


GPU MODE ▷ #jobs (3 messages):

ScienceCorp openings, Mercor contract roles, Amazon MLE positions


GPU MODE ▷ #beginner (19 messages🔥):

one-letter variables, kernel readability, SYCL, CUDA courses, accelerated-computing.academy


GPU MODE ▷ #torchao (3 messages):

Quantization Libraries, Float8 Weight, GEMM Kernel, CUDA OOM


GPU MODE ▷ #intel (3 messages):

Intel GPU Memory Bank Conflicts, CuTe Swizzling on Intel GPUs, Gen Architecture L1$/SLM Banking


GPU MODE ▷ #self-promotion (7 messages):

CUTLASS learning, Matmuls/GEMMs hacking, Simon Boehm blog post reproduction, fp16 and bf16 kernels, Tensorcores, WMMA, Swizzling, Pipelining, and Autotuning


GPU MODE ▷ #🍿 (1 messages):

tbert3971: This is great, is there any way I can help with your effort?


GPU MODE ▷ #reasoning-gym (5 messages):

wandb logs, VERL


GPU MODE ▷ #submissions (114 messages🔥🔥):

grayscale_v2 leaderboard, vectoradd_v2 leaderboard, vectorsum_v2 leaderboard, histogram_v2 leaderboard, nvfp4_gemv leaderboard


GPU MODE ▷ #status (2 messages):

nvfp4_gemv, Profiling Traces


GPU MODE ▷ #hardware (16 messages🔥):

DGX Spark vs Strix Halo, A100 Performance, TechPowerUp Specs Inaccuracy


GPU MODE ▷ #cutlass (19 messages🔥):

cutedsl gotchas, dynamic vs static values in cutedsl, constexpr values in cute.jit(), tiled MMA in cutedsl


GPU MODE ▷ #singularity-systems (2 messages):

picograd commits, tinygrad abstractions


GPU MODE ▷ #general (12 messages🔥):

Inline CUDA, Triton, Popcorn CLI, VecAdd_V2 and FP4, CuTe DSL


GPU MODE ▷ #multi-gpu (12 messages🔥):

multi-node communication performance, NVSHMEM, LLM inference, nvshmem4py, low-latency communication kernels


GPU MODE ▷ #helion (70 messages🔥🔥):

Helion vs Triton Performance, Attention Kernel Performance, Subtiling Autotuning, Persistent Kernels, CUDA Graphs


GPU MODE ▷ #nvidia-competition (366 messages🔥🔥):

L2 kernel, measurement noise, burn in, Cutlass upgrade, CUDA versions


GPU MODE ▷ #xpfactory-vla (11 messages🔥):

Vision Language Action models (VLAs), Robotic foundation models, Data flywheels, VLAs and LRMs, LIBERO & RoboTwin


OpenRouter ▷ #announcements (1 messages):

Kimi K2, Crashloop, Issue Resolution


OpenRouter ▷ #app-showcase (7 messages):

Orchid AI Assistant, Release Date Estimation, The nature of work


OpenRouter ▷ #general (569 messages🔥🔥🔥):

OpenRouter video support, Polaris Alpha mini model, OpenAI adult content handling, Kimi K2 leaderboard rankings, Gemini 2.5 token usage


OpenRouter ▷ #new-models (2 messages):



OpenRouter ▷ #discussion (29 messages🔥):

OpenRouter Model Node on n8n, OR Show Technical Segment, GPT-4 Regression, Chatroom Memory Setting, Automated Capabilities Scanning


OpenAI ▷ #ai-discussions (515 messages🔥🔥🔥):

Sora nerfed, GPT-5.1 release, AI Censorship, Gemini 3 vs GPT 5, OpenAI


OpenAI ▷ #gpt-4-discussions (20 messages🔥):

GPT memory mixing, Image upload errors, GPT-5 Rerouting, File Creation Failures, Email Task Freaking


OpenAI ▷ #prompt-engineering (26 messages🔥):

Instagram carousel images, Video Enhancement, Prompt Engineering Courses, Assistants API deprecation, System prompt control


OpenAI ▷ #api-discussions (26 messages🔥):

Image generation with ChatGPT, Video enhancement, Prompt engineering courses, Assistants API deprecation, System prompt control


Unsloth AI (Daniel Han) ▷ #general (296 messages🔥🔥):

AgentRL with Qwen, UD Quants vs Non-UD, Muon Optimizer Support, Granite 4.0 4-bit Model Issues, Kimi K2 Thinking GGUF


Unsloth AI (Daniel Han) ▷ #introduce-yourself (7 messages):

Introductions, AI Engineer, Data Scientist, Full-Stack Developer, AI generated profiles


Unsloth AI (Daniel Han) ▷ #off-topic (99 messages🔥🔥):

GDDR7 Pricing Impact, Levenshtein Distance, Data Refinement Issues, Gemini Flirting Bug, Training vs Inference


Unsloth AI (Daniel Han) ▷ #help (92 messages🔥🔥):

GGUF in vllm, Hyperparameter tuning methods, Kimi K2 GGUF reasoning tokens, Quantization scripts for Kimi K2, Unsloth dynamic quantization


Unsloth AI (Daniel Han) ▷ #showcase (1 messages):

Qwen 3 4b, Unsloth


Unsloth AI (Daniel Han) ▷ #research (15 messages🔥):

PII Detection, Roblox Filters, Llama3 Benchmarks, Code Evaluation Harness


Nous Research AI ▷ #general (341 messages🔥🔥):

Kimi vs ChatGPT, Deepseek, GLM Pricing, Vulkan ML Library, Hermes Optimus Project


Nous Research AI ▷ #ask-about-llms (10 messages🔥):

AI Hallucinations, Coding Agents, wh-falsify Repo, Civil Disagreement


Nous Research AI ▷ #research-papers (3 messages):

Nested Learning, Continual Learning


Moonshot AI (Kimi K-2) ▷ #general-chat (269 messages🔥🔥):

Kimi K2 model vs GLM 4.6, Unsloth team issue with Kimi-K2-Thinking model, Kimi for coding limitations, Kimi CLI reviews, Student discount for Kimi


Modular (Mojo 🔥) ▷ #general (169 messages🔥🔥):

Mojo vs Rust error handling, Modular's business model, Mojo package ecosystem growth, Mojo's appeal to Python and Rust developers, Mojo's future language paradigms


Modular (Mojo 🔥) ▷ #mojo (42 messages🔥):

libnuma and gigantic pages, Mojo for HPC, User-defined dialects, variant vs rust enum, Rust stdlib in Rust


Modular (Mojo 🔥) ▷ #max (1 messages):

Modular's data handling on GPU, PCIe Bottleneck Discussion


Yannick Kilcher ▷ #general (151 messages🔥🔥):

Qwen3-VL, Ollama, Extropic, Political tensions in open source projects, LLM coding issues


Yannick Kilcher ▷ #paper-discussion (12 messages🔥):

Nested Learning, Continual Learning, TreeQuest, camera-ready NeurIPS


Yannick Kilcher ▷ #agents (40 messages🔥):

HRM/TRM/RWKV, self-steering programs, Adaptive Resonance Theory (ART), DDVFA (Distributed Dual Vigilance Fuzzy ART)


Yannick Kilcher ▷ #ml-news (4 messages):

Compute per Country, Post-Industrial Roman Republic


Latent Space ▷ #ai-general-chat (77 messages🔥🔥):

Sequoia move, Terminal Bench 2.0, Kimi K2 vs GPT-5, EdgeTAM for iPhone 15, Nested Learning by Google


Eleuther ▷ #general (28 messages🔥):

Weights and Biases (WandB), Weave Evals, NeurIPS, GPU Cloud Providers, ML bio event at NeurIPS


Eleuther ▷ #research (28 messages🔥):

QAT vs PTQ, Overfitting Autoencoders, Straight Through Estimator, Noise Injection for Transformers


Eleuther ▷ #interpretability-general (8 messages🔥):

Anthropic Mechanistic Interpretability, SAE Issues, Nonlinear Feature Relationships in LLMs, Reading group launch


tinygrad (George Hotz) ▷ #general (13 messages🔥):

4090 vs 5090, Mutation testing, pyproject.toml switch, Hatch vs Setuptools, Custom backward function with custom kernel


tinygrad (George Hotz) ▷ #learn-tinygrad (49 messages🔥):

UOps.after Restrictions, CUDA Reduction in tinygrad, Tensor.from_blob on MPS Devices, Style Transfer in tinygrad


DSPy ▷ #show-and-tell (1 messages):

DSPy Planner, Multi-Agent Tool, Orchestrator


DSPy ▷ #general (52 messages🔥):

DSPy Optimization Issues, TOON Adapter for DSPy, Agent CLI Support with DSPy, DSPy Success Stories, Feedback Text for DSPy Optimization


aider (Paul Gauthier) ▷ #general (17 messages🔥):

Kimi Model Feedback, aider vs agentic coders, aider-ce branch, MoonshotAI Kimi K2


aider (Paul Gauthier) ▷ #questions-and-tips (11 messages🔥):

Prefill Latency, Chunking JSON for Aider, Summarizing JSON for Aider, Token Limits, Figma Designs


MCP Contributors (Official) ▷ #general (9 messages🔥):

2025-11-25 Spec Release, SDK Changes and Review for SEP-1330, Agent Access to Slack and Gsuite APIs, MCP Client Interception of PII Data, Web Summit in Lisbon


Manus.im Discord ▷ #general (5 messages):

VEO3 connection issues, Subscription cancellation due to pricing, Expert engineer introduction