Frozen AI News archive

Kimi K2‑0905 and Qwen3‑Max preview: two 1T open weights models launched

**Moonshot AI** updated their **Kimi K2-0905** open model with doubled context length to **256k tokens**, improved coding and tool-calling, and integration with agent scaffolds. **Alibaba** released **Qwen 3 Max**, a **1 trillion parameter** model with agent-oriented behavior, available via **Qwen Chat**, **Alibaba Cloud API**, and **OpenRouter**. The community highlights China's dominance in open models and debates around meaningful evaluation methods for code agents, emphasizing long-horizon and domain-specific evals. Influential voices like **@swyx** and **@karpathy** discuss the importance of practical evals and discriminator models for ranking outputs.

Open models are all you need?

AI News for 9/5/2025-9/6/2025. We checked 12 subreddits, 544 Twitters and 22 Discords (186 channels, and 3961 messages) for you. Estimated reading time saved (at 200wpm): 324 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

We last commented on Kimi K2 in July, when it was the largest SOTA open model released. Today Moonshot AI updated the model weights again and released new benchmarks in their paper.

The big new entrant, though, is Qwen 3 Max, the first 1T-param model in the Qwen family, which unsurprisingly beats its smaller siblings. They declined to release hparams, calling it simply "Max" instead, yet the model weights still seem likely to be released in short order, so it's unclear why they are breaking their own MoE naming scheme.

China is overwhelmingly winning the open model war, it seems.


AI Twitter Recap

China’s long‑context coding surge: Kimi K2‑0905 and Qwen3‑Max preview

Evals, agents, and what to measure

Inference and post‑training advances

GPU stacks, kernels, and platforms

OpenAI ecosystem: ChatGPT branching, Responses API, and Codex

Embeddings and retrieval move on‑device (and hit limits)

Top tweets (by engagement)


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. Kimi K2-0905 and Qwen 3 Max Launches + Early Demos

2. Open-Source LLMs: GPT-OSS 20B Home Server & Weekly Release Roundup

3. AI/LLM Race Discourse and Meme Reactions

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. OpenAI-Broadcom Chips, Google Veo/Nano Banana, Nunchaku v1.0 Releases

2. AI Robotics: Figure Home Chores and RAI Robomoto

3. AI Society: Inequality, Layoffs, Deepfakes, and Accessibility


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.5 Pro Exp

1. The AI Arms Race: New Models and Hardware Heat Up

2. Geopolitical Jitters and Corporate Policy Shake-Ups

3. The Developer's Dilemma: Choosing and Tuning the Right Tools

4. Under the Hood: The Guts of GPU Programming and Performance

5. User Blues: Platform Instability and UX Woes Create Headaches


Discord: High-level Discord summaries

LMArena Discord


Perplexity AI Discord


Unsloth AI (Daniel Han) Discord


LM Studio Discord


Cursor Community Discord


OpenRouter Discord


OpenAI Discord


GPU MODE Discord


Latent Space Discord


DSPy Discord


Moonshot AI (Kimi K-2) Discord


Nous Research AI Discord


Modular (Mojo 🔥) Discord


Eleuther Discord


HuggingFace Discord


tinygrad (George Hotz) Discord


aider (Paul Gauthier) Discord


Yannic Kilcher Discord


Manus.im Discord Discord


LLM Agents (Berkeley MOOC) Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Windsurf Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.



Discord: Detailed by-Channel summaries and links

LMArena ▷ #general (1100 messages🔥🔥🔥):

Image generation issues, Video arena bot down, Login requirements, Rate limits, Account data loss


LMArena ▷ #announcements (3 messages):

Video Arena Discord Bot, User Login, Rate Limits


Perplexity AI ▷ #announcements (1 message):

iOS App Redesign, Comet Access for Students, Comet Shortcuts, Voice Assistant in Comet, GPT-5 Thinking for Pro Users


Perplexity AI ▷ #general (823 messages🔥🔥🔥):

Grok 4 struggles, Qwen 3 Max, Comet Browser, Gemini 2.5 Pro, AI Model Parameter Size


Perplexity AI ▷ #sharing (3 messages):

AMD Zen 6 CPUs, Omarchy Linux, Shareable Threads


Perplexity AI ▷ #pplx-api (4 messages):

API 500 Errors, Playground issues, Outage reporting


Unsloth AI (Daniel Han) ▷ #general (574 messages🔥🔥🔥):

Postgres with pgvector vs. Qdrant, Local Sonnet, DGX Spark vs DGX Station, Qwen 3 Max Evaluation


Unsloth AI (Daniel Han) ▷ #introduce-yourself (3 messages):

Unsloth AI, GPT-OSS, Google Colab T4, Runtime Error


Unsloth AI (Daniel Han) ▷ #off-topic (164 messages🔥🔥):

Super cards release updates, GLM 4.5 Air usable tps, Rover Mows Grass, Deepseek & Qwen Tokenizers are Interchangeable, Mini Kimi K2 MoE models


Unsloth AI (Daniel Han) ▷ #help (40 messages🔥):

Training vs Inference, GPT-OSS finetuning issues, Gemma-3 finetuning errors, Tokenizer Impact on Finetuning, GRPO Support for Gemma3 with vLLM


Unsloth AI (Daniel Han) ▷ #showcase (2 messages):

Glazer model, GPT-4's personality, Ollama, HuggingFace


Unsloth AI (Daniel Han) ▷ #research (11 messages🔥):

Latent Features, Hermes NLP, Financial AI


LM Studio ▷ #general (97 messages🔥🔥):

GPU power draw concerns, Lora Training, Realistic roleplaying model, LM Studio local network setup, Consumer priced Exoskeleton


LM Studio ▷ #hardware-discussion (139 messages🔥🔥):

Frame Generation, Nvidia 5000 series, ATX 3.1 standard, CPU offload vs GPU, Mi50 VRAM quirk


Cursor Community ▷ #general (154 messages🔥🔥):

GPT-5 vs Sonnet 4, Codex CLI vs Cursor Code, Claude Code, Gemini 2.5 Pro, Cursor Pricing


OpenRouter ▷ #announcements (1 message):

Qwen3-Max, RAG, Tool calling


OpenRouter ▷ #app-showcase (1 message):

tomlucidor: Finds https://github.com/Lapis0x0/obsidian-next-composer


OpenRouter ▷ #general (126 messages🔥🔥):

OpenRouter Crypto Scam, Anthropic's Geopolitical Concerns, API Key Issues, BYOK Fees, Token Limits and Output Truncation


OpenRouter ▷ #new-models (1 message):

Readybot.io: OpenRouter - New Models


OpenRouter ▷ #discussion (12 messages🔥):

Benchmark Increase, Real World Performance vs. Benchmarks, OpenRouter API usage


OpenAI ▷ #ai-discussions (84 messages🔥🔥):

Multi-Agent Orchestration, Token Efficiency, Gemini 2.5 Pro, Good Luck Token Waste, Carbon Footprint of AGI


OpenAI ▷ #prompt-engineering (10 messages🔥):

Discord chat to Markdown, Prompt engineering lessons, Hierarchical prompting, Abstraction in prompts, ML format matching


OpenAI ▷ #api-discussions (10 messages🔥):

Discord Chat to Markdown, Prompt Engineering Lessons, Hierarchical Communication in Prompts, Abstraction in Prompts, Reinforcement in Prompts


GPU MODE ▷ #general (3 messages):

Anthropic's new policy, Kernel creation solutions


GPU MODE ▷ #triton (2 messages):

Triton, CUDA, GPU, PMPP Book


GPU MODE ▷ #cuda (14 messages🔥):

Barnes-Hut performance, CUDA, Morton code sorting, Octree construction, Memory access optimization


GPU MODE ▷ #torch (13 messages🔥):

fp8 matrix multiplication, tensor cores accumulator, Runtime Triggered Module Loading, vLLM profiling


GPU MODE ▷ #algorithms (6 messages):

FlashAttention, FA1, FA2, FA3, FA4


GPU MODE ▷ #beginner (4 messages):

Model optimization roadmap, Sparse convolution in ONNX Runtime, BEV fusion model


GPU MODE ▷ #irl-meetup (1 message):

apaz: Now in NYC if anyone wants to meet up


GPU MODE ▷ #rocm (8 messages🔥):

rocSHMEM, ROCm-aware open MPI, HIP kernels, ROCm/iris


GPU MODE ▷ #webgpu (2 messages):

:catgirl5: emoji usage, thinking hard emoji


GPU MODE ▷ #self-promotion (1 message):

GPU L2 Cache, Ampere Architecture, CUDA Project Structure, Persistent Memory Accesses


GPU MODE ▷ #reasoning-gym (1 message):

Contributions Welcome, Prototype Sharing, Pull Requests


GPU MODE ▷ #submissions (1 message):

MI300x8, amd-all2all leaderboard


GPU MODE ▷ #factorio-learning-env (6 messages):

Factorio Crafting Tool, FLE installation issues, Prototype Recipe Retrieval


GPU MODE ▷ #amd-competition (9 messages🔥):

CLI Tool vs Online Submission, ROCshmem Template, Web Version Organization, Online Testing Env Triton Support, Prize Registration Reminder


GPU MODE ▷ #singularity-systems (1 message):

cuBLAS, ROCm, cuDNN, MIOpen


GPU MODE ▷ #general (21 messages🔥):

Pickling Errors, Serialization Issues, NaNs in Triton Kernels, Benchmarking Discrepancies


Latent Space ▷ #ai-general-chat (58 messages🔥🔥):

OpenAI Custom AI Chip, Mercor $10B Pre-emptive Offers, Augment (Augie) $85M Series A, OpenAI Responses API, Hugging Face FineVision Dataset


Latent Space ▷ #ai-announcements (4 messages):

AI Engineer CODE Summit 2025, NYC AI Event


Latent Space ▷ #genmedia-creative-ai (21 messages🔥):

Nano Banana, AI Girlfriend, AI Design Masterclass, Nvidia Cosmos DiffusionRenderer


DSPy ▷ #general (78 messages🔥🔥):

Voice Agents with DSPy, GEPA Optimization for Prompts, Multi-Turn Conversations, Groq for Inference, RAG vs Fine-tuning


Moonshot AI (Kimi K-2) ▷ #general-chat (75 messages🔥🔥):

Kimi K2 API Credits Giveaway, Anthropic API Integration, Kimi K2 Turbo Preview, Kimi K2 Model Performance, Kimi Starter Subscription


Nous Research AI ▷ #general (65 messages🔥🔥):

real-time video AI, Spiking Neural Networks, cameras (image sensors) that are a bit closer to how the human eye works, Meta wristband reads body electrical signals to control smart glasses, Hermes's unique behavior in the husky holdem benchmark


Nous Research AI ▷ #interesting-links (7 messages):

Micro-LLM Experiments, SLM Agents by NVIDIA, Hermes Agent Size


Modular (Mojo 🔥) ▷ #mojo (60 messages🔥🔥):

Zig's async IO, Mojo's type system, MLIR, Vectorization of Loops, Compiler Customization


Eleuther ▷ #general (46 messages🔥):

MasterCard Fraud Prevention AI, Obscenity Rule Enforcement, Brand Risk Mitigation, AI-induced psychosis, Semantic Drift


Eleuther ▷ #research (6 messages):

GRPO Baseline, SFT + KL regularization


HuggingFace ▷ #general (37 messages🔥):

Reward Function Weighting in RL, Anthropic Policy on Jurisdiction Control, Causal Model Training with Attention Bias, Tokenizer and Attention Bias Implementation, RAG Applications with Limited Context Size


HuggingFace ▷ #today-im-learning (2 messages):



HuggingFace ▷ #i-made-this (1 message):

Enron Email Dataset Parser, Structured Parquet Files, Email Analysis


HuggingFace ▷ #computer-vision (1 message):

FastVLM


HuggingFace ▷ #smol-course (5 messages):

smol-course, GitHub Readme


HuggingFace ▷ #agents-course (2 messages):

agents course, greetings


tinygrad (George Hotz) ▷ #general (8 messages🔥):

Digital Ocean MI300X errors, Z3 version issues, Kernel removal project


aider (Paul Gauthier) ▷ #general (5 messages):

Warp Code, Aider's strengths, Aider success stories


aider (Paul Gauthier) ▷ #questions-and-tips (2 messages):

Coding Agent Refactoring, Aider's Code Validation, TreeSitter Validator


Yannic Kilcher ▷ #general (4 messages):

Baselines in Papers, LoRA Training


Yannic Kilcher ▷ #ml-news (1 message):

erkinalp: https://www.all-hands.dev/blog/the-path-to-openhands-v1


Manus.im Discord ▷ #general (3 messages):

AI Politeness, Scientific Evidence for AI Politeness