Frozen AI News archive

not much happened today

**Kimi-K2 Reasoner** has been integrated into **vLLM** and will soon be supported by **SGLang**, featuring a massive **1.2 trillion parameter MoE** configuration. **Perplexity AI** released research on cloud-portable trillion-parameter MoE kernels optimized for **AWS EFA**, with potential integration into **vLLM**. **IBM's vLLM** team formalized hybrid dense and sparse expert models, supporting models like **Qwen3-Next**, **Nemotron Nano 2**, and **Granite 4.0**. **Kimi-K2** reportedly scores **77% on GPQA Diamond**, outperforming **GPT-4.5** at 71.4%, though this is unverified.

a quiet day.

AI News for 11/4/2025-11/5/2025. We checked 12 subreddits, 544 Twitters and 23 Discords (200 channels and 6597 messages) for you. Estimated reading time saved (at 200 wpm): 566 minutes. Our new website is now up with full metadata search and a beautiful vibe-coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

Gemini 3 and GPT 5.x can't come soon enough...


AI Twitter Recap

Kimi-K2 lands in open inference stacks; Perplexity unlocks trillion-param MoE kernels

Agent systems, MCP, and coding stacks get more “production”

Multimodal and video: subject consistency, real-time generation, controllability

Research and training notes

Ecosystem and platform moves

Top tweets (by engagement)

Notes and miscellany


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. Qwen Model Usability Issues

2. Local AI Hardware Setup Insights

3. Anticipation for GLM 4.6 AIR Release

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. XPENG Humanoid Robot Developments

2. Gemini 3 and Google AI Integrations

3. AI Art and Film Innovations


AI Discord Recap

A summary of Summaries of Summaries by gpt-5

1. Developer Tooling & Compute Launches

2. New Benchmarks, Datasets & Safety Models

3. GPU Kernel Engineering: FP8, Bandwidth & Fixes

4. API Reliability & Model Routing Woes

5. Ecosystem Moves, Cloud Costs & Hiring


Discord: High level Discord summaries

LM Studio Discord


LMArena Discord


Perplexity AI Discord


Cursor Community Discord


Unsloth AI (Daniel Han) Discord


GPU MODE Discord


HuggingFace Discord


OpenAI Discord


Nous Research AI Discord


tinygrad (George Hotz) Discord


Yannic Kilcher Discord


DSPy Discord


Moonshot AI (Kimi K-2) Discord


aider (Paul Gauthier) Discord


MCP Contributors (Official) Discord


Eleuther Discord


Manus.im Discord Discord


Windsurf Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.



Discord: Detailed by-Channel summaries and links

LM Studio ▷ #announcements (1 message):

VLM OCR, Flash Attention, lms runtime command, MiniMax-M2 tool calling, macOS 26 compatibility


LM Studio ▷ #general (427 messages🔥🔥🔥):

Local AI Coding Models, Renting Servers for Personal Models, Disabling Menu Bar in AppImage, Intel vs AMD for LLM Use, Combining RTX 5090 and 5070Ti


LM Studio ▷ #hardware-discussion (911 messages🔥🔥🔥):

GPU Cooling Solutions, PCIE Bifurcation Discussions, Overclocking Motherboards, RAM configurations, GPU configurations


LMArena ▷ #general (1121 messages🔥🔥🔥):

OpenAI deals, Google AI future, LMArena trust, Sora 2 credits, Claude 3.5 sonnet


LMArena ▷ #announcements (1 message):

Arena Expert Leaderboard, Occupational Leaderboards, Expert Prompt Dataset


Perplexity AI ▷ #general (947 messages🔥🔥🔥):

Comet referral program, Perplexity Pro chat history, Student subscription on active operator sub, Ethernet cable recommendations, Airtel subscription offer


Perplexity AI ▷ #sharing (2 messages):

Spotify Song Sharing, Shareable Threads


Perplexity AI ▷ #pplx-api (3 messages):

Perplexity Pro, Tool calling errors with sonar-pro, Tool calling with Perplexity API


Cursor Community ▷ #general (554 messages🔥🔥🔥):

Tailwind 4, Nuxt 4, Phantom wallets, Exodus wallets, Specstory extension


Cursor Community ▷ #background-agents (9 messages🔥):

Mobile Web UI Crashing, Background Agents Use Case, Bug with images in prompts, Diff Rendering Improvements, API Endpoints for Cloud Agent


Unsloth AI (Daniel Han) ▷ #general (203 messages🔥🔥):

DeepSeek-OCR notebook, Q4 Ks-XL vs IQ4, TRL notebooks vs Unsloth notebooks, fine tune a cross encoder or embedding model, Qwen3VL-30B-A3B on 16GB vram


Unsloth AI (Daniel Han) ▷ #introduce-yourself (4 messages):

Blockchain, AI, Trust in Code, Consensus Mechanisms


Unsloth AI (Daniel Han) ▷ #off-topic (196 messages🔥🔥):

SFT as RL, Quantum Randomness, Manual Data Entry, Nvidia Blackwell Pro 4500 vs 5090, ECC Memory Value


Unsloth AI (Daniel Han) ▷ #help (79 messages🔥🔥):

MiniMax M2 local inference, GPT-OSS-20B training with Unsloth + REINFORCE, System prompt usage during finetuning, Multilingual reranker with GGUF and llama.cpp, Granite 4.0 Hybrid 4-bit conversion issues


Unsloth AI (Daniel Han) ▷ #research (9 messages🔥):

Roblox PII Classifier, Open Sourcing Safety Tools, PII Dataset Access


GPU MODE ▷ #general (52 messages🔥):

Youtube lectures, B770 GPU, 3080 modding, Auto Vectorization Compiler Blogpost


GPU MODE ▷ #triton-gluon (1 message):

Community Meetup, Timezones, Meeting Details


GPU MODE ▷ #cuda (5 messages):

Memory-Bound Matmuls, SM count impact on Matmul latency, Saturating Memory Bandwidth on GPUs, Little's Law relevance to GPU performance


GPU MODE ▷ #torch (19 messages🔥):

torch.compile CUDA graph recapture, torch.compile grouped_mm UserWarning, vLLM pytorch dependencies, Float8 tensors limitations, custom kernel opcheck failure


GPU MODE ▷ #cool-links (1 message):

marksaroufim: https://xillybus.com/tutorials/pci-express-tlp-pcie-primer-tutorial-guide-1


GPU MODE ▷ #jobs (1 message):

Mixlayer, AI inference platform, Rust, CUDA, founding engineer


GPU MODE ▷ #beginner (16 messages🔥):

RL bug on accumulator type fixed at fp32, Practice Deshourading, Hackathon, PyTorch/vllm on AMD AI PCs


GPU MODE ▷ #jax-pallas-mosaic (1 message):

Mosaic-GPU, all-gather-matmul


GPU MODE ▷ #torchao (7 messages):

OSS Contribution, fbgemm kernels, fp8 weight-only pattern & torch.compile


GPU MODE ▷ #self-promotion (3 messages):

Disaggregated Inference Retrospective, Symbolica AI Hackathon


GPU MODE ▷ #submissions (11 messages🔥):

vectoradd_v2, grayscale_v2, vectorsum_v2, A100, B200


GPU MODE ▷ #factorio-learning-env (5 messages):

Factorio RCON, Factorio Hidden Settings, Factorio multiplayer server


GPU MODE ▷ #amd-competition (30 messages🔥):

Node allocation, Solution sharing, Competition submissions visibility, Ranking issues


GPU MODE ▷ #cutlass (4 messages):

CuTeDSL Resources, CuTe copy threads, CuTeDSL Sum Reduction Kernel


GPU MODE ▷ #mojo (4 messages):

Mojo GPU Puzzles, Layout API in Mojo, Mojo version compatibility


GPU MODE ▷ #singularity-systems (8 messages🔥):

picograd commits, fuzzing against np, torch, and tinygrad, pedagogical progression, kernels vs compilers


GPU MODE ▷ #general (3 messages):

CUDA, Triton


GPU MODE ▷ #low-bit-training (11 messages🔥):

DeepSeek FP8, Cutlass FP8 GEMM, FP8 Blockwise Training, Benchmarking Scripts, Blockwise Quantization Kernels


GPU MODE ▷ #opencl-vulkan (6 messages):

GLSL Vulkan Compute Shaders, clspv, Clvk, Slang shading language


GPU MODE ▷ #cluster-management (1 message):

Ansible Scripts, Configuration Management


GPU MODE ▷ #helion (4 messages):

inline_triton, Helion Compiler, atomic_cas, output_like=None, Helion API


GPU MODE ▷ #nvidia-competition (85 messages🔥🔥):

NVFP4 Contest, Mojo Support, Blackwell GPUs, TileLang and CuTeDSL, CUDA learning


GPU MODE ▷ #hf-kernels (3 messages):

Xenova.com, HF Kernels Updates


HuggingFace ▷ #announcements (1 message):

Sentence Transformers joins Hugging Face, huggingface_hub v1.0, LeRobot v0.4.0, Cleaner Collection URLs, Inference Providers Usage Breakdown


HuggingFace ▷ #general (189 messages🔥🔥):

Home Lab AI Setup, Vex-Math-L1-100K Dataset, LLMs in Stock Data Prediction


HuggingFace ▷ #today-im-learning (3 messages):

Job Application Automation Project, BERT Model Training, SetFit Contrastive Classifier, ArXiv Gatekeeping


HuggingFace ▷ #i-made-this (18 messages🔥):

Model on 40-50b parameters, PDF2TXT parser, DearDiary.jl, NexusAI Professional Suite v1.0


HuggingFace ▷ #NLP (1 message):

NLP Data Cleaning


HuggingFace ▷ #agents-course (7 messages):

File retrieval issues, Study group formation


OpenAI ▷ #annnouncements (3 messages):

Sora App Android, IndQA Benchmark, Interrupt Long Queries


OpenAI ▷ #ai-discussions (164 messages🔥🔥):

Sora 2 code, OpenAI's photo generator, Custom GPT photo upload issues, Sora offline?, Models acting weird


OpenAI ▷ #gpt-4-discussions (13 messages🔥):

Thinking model degraded, Building ChatGPT apps, OpenAI model comparisons


OpenAI ▷ #prompt-engineering (14 messages🔥):

Prompt Engineering Jobs, GPT Pro Research, Prompt Engineering Tips, Sora 2 Nerf


OpenAI ▷ #api-discussions (14 messages🔥):

Prompt Engineering Job Market, Prompting GPT Pro for Research, Tips for New Prompt Engineers, Sora 2 Nerf


Nous Research AI ▷ #general (56 messages🔥🔥):

Anthropic's closed-source approach, Piracy for Media Preservation, IMO Gold AI, llama.cpp contribution


Nous Research AI ▷ #ask-about-llms (15 messages🔥):

Gestural Interfaces, Repligate's Loom, Attention is All You Need, Vision Pitching


Nous Research AI ▷ #interesting-links (1 message):

real.azure: https://github.com/ggml-org/llama.cpp/discussions/16938


tinygrad (George Hotz) ▷ #announcements (1 message):

tinybox pro v2, 5090, rackable workstation


tinygrad (George Hotz) ▷ #general (68 messages🔥🔥):

VK_KHR_buffer_device_address GLSL extension, Tinybox Pro V2, AMD vs Nvidia, GLSL Renderer Implementations, Tensor Cores on M1


Yannic Kilcher ▷ #general (12 messages🔥):

OU Processes Limitations, Itô Diffusions Universality, Paper Dumps Overload, Trending AI Papers


Yannic Kilcher ▷ #paper-discussion (21 messages🔥):

Anthropic's crosscoder, circuit tracing research, feature evolution during pre-training, leakages from shared chats, latent reasoning


Yannic Kilcher ▷ #agents (2 messages):

RWKV, HRM/TRM, Context Windows, State Representations


Yannic Kilcher ▷ #ml-news (11 messages🔥):

Concentration of power, Erosion of democratic systems, Copyright lawsuit, Getty vs Stability


DSPy ▷ #general (35 messages🔥):

Pause and resume optimization runs, Accessing the LLM in a DSPy module, Handling Rate Limits with Fallback LLMs, Conversation History Management in DSPy, Pydantic OutputField Deserialization


DSPy ▷ #examples (5 messages):

Synthetic Data Use, Eval Metric, GEPA, Glossary Building


Moonshot AI (Kimi K-2) ▷ #general-chat (24 messages🔥):

Kimi iOS app, interleaved thinking model, Kimi CLI, 401 Error


aider (Paul Gauthier) ▷ #general (8 messages🔥):

Model testing configurations, Using Perplexity API, aider-ce documentation


aider (Paul Gauthier) ▷ #questions-and-tips (9 messages🔥):

TDD with Aider, Memory limitations with Ollama, Claude Code's /compact command, scalafix-rules summarization


MCP Contributors (Official) ▷ #general (17 messages🔥):

IETF 124, MCP discussion in IETF, AI Scrapers


Eleuther ▷ #research (4 messages):

IFEval Scores, Latent Reasoning


Eleuther ▷ #interpretability-general (12 messages🔥):

Concept Detection System, Equivalent Linear Mappings of LLMs, Tangent Model Composition, Jacobian for LLM Interpretability


Manus.im Discord ▷ #general (16 messages🔥):

Text to video tools, Webscraping Twitter/X, Manus Support, Host services for Manus apps, Publishing problems on Manus


Windsurf ▷ #announcements (1 message):

Codemaps, SWE-1.5, Sonnet 4.5