Frozen AI News archive

Cognition's $10b Series C; Smol AI updates

**Cognition** raised **$400M** at a **$10.2B** valuation to advance AI coding agents, with **swyx** joining the company. **Vercel** launched an OSS coding platform using a tuned **GPT-5** agent loop. The **Kimi K2-0905** model achieved top coding eval scores and improved agentic capabilities with doubled context length. **Alibaba** released **Qwen3-ASR**, a multilingual transcription model with robust noise handling. **Meta** introduced Set Block Decoding for 3-5× faster decoding without architectural changes. Innovations in KV cache compression and quantization were highlighted, including **AutoRound** in SGLang and **QuTLASS v0.1.0** for Blackwell GPUs. Algorithmic benchmarking tools like **AlgoPerf v0.6** were updated for efficiency.

A special update for Smol AI readers.

AI News for 9/5/2025-9/8/2025. We checked 12 subreddits, 544 Twitters and 22 Discords (187 channels, and 12661 messages) for you. Estimated reading time saved (at 200wpm): 1069 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

As leaked in July, Cognition's $10b round was announced today. We also announced that I (swyx) will be joining Cognition in a yet-to-be-defined capacity, while AI Engineer and Latent Space remain independent. AINews will keep going as a personal project, with some conversations ongoing about a stable future for it.


AI Twitter Recap

Coding Agents and Tooling Momentum

Model and Inference Advances

Multimodal Generation, Video, and “Vibe Coding”

Agents, Post-Training RL, and Evaluation Practice

Robotics and Embodied AI

Benchmarks, Leaderboards, and Enterprise

Top tweets (by engagement)


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. Open-source LLM Launches: K2 Think and TildeOpen 30B Multilingual

2. Local/Offline LLM Use on Personal Hardware (Dual RTX 6000 + M3 Mac)

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. AlterEgo wearable, Gemini 'Upload Any File', and Qwen Edit LORA launches

2. AI societal impacts: Anguilla .ai windfall, Hinton inequality warning, Grok Imagine adult-content gap

3. ChatGPT regression and investor-driven guardrails debate


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.5 Pro Exp

Theme 1: New Models & Their Quirks

Theme 2: GPU Hardware & Performance Optimization

Theme 3: AI Agents & Development Tools in the Trenches

Theme 4: The AI Ecosystem: Legal Precedents and Content Crises

Theme 5: Cutting-Edge Research and Technical Deep Dives


Discord: High level Discord summaries

Unsloth AI (Daniel Han) Discord


LMArena Discord


Perplexity AI Discord


LM Studio Discord


Cursor Community Discord


GPU MODE Discord


Nous Research AI Discord


OpenAI Discord


HuggingFace Discord


Eleuther Discord


Latent Space Discord


Moonshot AI (Kimi K-2) Discord


DSPy Discord


Modular (Mojo 🔥) Discord


Yannick Kilcher Discord


aider (Paul Gauthier) Discord


tinygrad (George Hotz) Discord


Manus.im Discord


MLOps @Chipro Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Windsurf Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.



Discord: Detailed by-Channel summaries and links

Unsloth AI (Daniel Han) ▷ #general (1352 messages🔥🔥🔥):

Grok 2.5, Colab's new UI, Qwen3, LoRA parameters


Unsloth AI (Daniel Han) ▷ #off-topic (391 messages🔥🔥):

RAG implementation, Used 4090 Availability, Nvidia RTX 5090, FineWeb Dataset


Unsloth AI (Daniel Han) ▷ #help (486 messages🔥🔥🔥):

Gemma 3 fast_inference, Llama.cpp convert_hf_to_gguf.py ValueError, datasets for fine-tuning, VRAM issues pip 8.10 vs 9.1, continue training from a certain checkpoint


Unsloth AI (Daniel Han) ▷ #showcase (46 messages🔥):

Psychological implications of LLMs, Open Source Release, AI Therapist, OpenAI Chart Crime, Medical Reasoning Model


Unsloth AI (Daniel Han) ▷ #research (41 messages🔥):

Vision Models, GRaPE Mini Beta, VoRA, RSLoRA


LMArena ▷ #general (980 messages🔥🔥🔥):

Amazon Titan, Llama 4, Superintelligence lab, GPT5 vs Claude Opus 4.1 Thinking, AI Therapists


LMArena ▷ #announcements (6 messages):

Video Arena Discord Bot, User Login & Rate Limits, New Model Update, Multi-Turn for Image Edit


Perplexity AI ▷ #general (1174 messages🔥🔥🔥):

Comet Browser, Qwen-3MAX is a Reasoning Model?, DeepSeek, New XAI models Dusk and Sky, Grok co-founder


Perplexity AI ▷ #sharing (7 messages):

Shareable Threads, Perplexity Browser Claim


Perplexity AI ▷ #pplx-api (3 messages):

sonar-pro model issues


LM Studio ▷ #general (226 messages🔥🔥):

Suno4.5PLUS song generation, LLM Translation, GPT-OSS-20B Quantization, CodexCLI with LMStudio, LMStudio Performance Issues


LM Studio ▷ #hardware-discussion (393 messages🔥🔥):

Used 3090s, 5090 melting fears, Prompt processing bottlenecks, Copilot vs Local Models for agentic coding, Copilot quality swap with Cursor


Cursor Community ▷ #general (503 messages🔥🔥🔥):

Debug PR masterpiece, Figma loading tools fix, Whitespace-only changes with Cursor agent, Qwen model request, Broken Windows Subsystem for Linux (WSL)


GPU MODE ▷ #general (26 messages🔥):

Triton Implementation of Attention Mechanisms, Flashdecoding Parallelism Strategy, NVIDIA's Tilus vs Triton, Jane Street Hackathon Overhears


GPU MODE ▷ #triton (9 messages🔥):

Triton for CUDA/GPU Noobs, Hopper Optimizations in Triton (wgmma, tma), FA3 Performance on Hopper, Compile Triton Kernel to CUDA TTIR/TTGIR on non-CUDA machine


GPU MODE ▷ #cuda (4 messages):

Barnes Hut Implementations, Memory access best practices, L1 Cache Efficiency, Buffer Load Optimization


GPU MODE ▷ #torch (13 messages🔥):

Decoder Layer Slowdown, RMSNorm Performance, torch._grouped_mm Documentation, ONNX Limitations


GPU MODE ▷ #cool-links (2 messages):

Shared Memory Bank Conflicts, LDS Instruction Latency


GPU MODE ▷ #beginner (2 messages):

CUDA Tensors, CUDA Models, GPU Acceleration


GPU MODE ▷ #pmpp-book (6 messages):

Device Shared Declaration, PMPP 4th Edition Worth it?, PMPP Edition Diffs


GPU MODE ▷ #off-topic (7 messages):

Noodle dish, TV Input Latency Experiment, Hidden gem anime, Fine-grained classification benchmark blog post


GPU MODE ▷ #irl-meetup (7 messages):

NYC Meetup, Hackathon Venue, Registration Confirmation


GPU MODE ▷ #triton-puzzles (20 messages🔥):

Triton Import Error, Numpy Version Issue, Colab Session Restart


GPU MODE ▷ #rocm (4 messages):

mpi4py issues, iris, ROCm, pytorch


GPU MODE ▷ #lecture-qa (1 message):

RDNA3 MatMul, seb-v's talk


GPU MODE ▷ #webgpu (3 messages):

Dawn Support, WGVK Compilation


GPU MODE ▷ #metal (2 messages):

Metal Documentation, simdgroup matmul


GPU MODE ▷ #self-promotion (6 messages):

Model Serving Newsletter, Outlier Experiments, RegicideOS Testers, CuTe Swizzling, Tiny Diffusion Models


GPU MODE ▷ #🍿 (1 message):

Custom OP Backend, DirectoryBackend Refactor, DSL Addition


GPU MODE ▷ #submissions (84 messages🔥🔥):

MI300x8 Performance, amd-all2all Leaderboard


GPU MODE ▷ #ppc (1 message):

verspasian: <#1198358627594023014>


GPU MODE ▷ #factorio-learning-env (13 messages🔥):

FLE Repo, Call for help, Open World Scenario Error


GPU MODE ▷ #amd-competition (23 messages🔥):

AMD Registration Confirmation, Workflow File Dependencies, Triton Support, HIP Template, Team System


GPU MODE ▷ #cutlass (1 message):

SM90 generator, levels based instantiation level, cmake flags


GPU MODE ▷ #singularity-systems (8 messages🔥):

pytorch backends, tinygrad's runtime, GPT2 Training


GPU MODE ▷ #general (6 messages):

vectoradd leaderboard, kernel implementations, AMD GPU Mode competition


GPU MODE ▷ #multi-gpu (14 messages🔥):

MPI dtypes, Heterogeneous computing, NCCL GPU communication, NVSHMEM on GitHub


GPU MODE ▷ #low-bit-training (12 messages🔥):

Triton Kernel Launch Overhead, Model Quantization Survey, Torch Compile with BFloat16


GPU MODE ▷ #jane-street-hackathon (65 messages🔥🔥):

Triton hacking, GPU kernels with CUDA, VS Code SSH instance, Torch Compile Flags, Continuous Batching


Nous Research AI ▷ #general (320 messages🔥🔥):

Deepmind and Huawei B. Neural Network progress, US AI regulation impact on open models, Unsloth fixes LoRA training, Huawei GPU, Hermes 4 model


Nous Research AI ▷ #ask-about-llms (14 messages🔥):

Hermes Censorship, Uncensoring difficulties, HR values in US models, OpenAI internal models


Nous Research AI ▷ #research-papers (2 messages):

BOS token limitations, Fine-tuning from EOS, Crumb essence-3b-v1 Model


Nous Research AI ▷ #interesting-links (1 message):

real.azure: nice!

I already find the new Qwen3-4B to be super impressive.


Nous Research AI ▷ #research-papers (2 messages):

BOS Token Accumulation, EOS Finetuning, Crumb's Essence-3b-v1


OpenAI ▷ #ai-discussions (266 messages🔥🔥):

AI video generator free plan strategy, Perplexity Pro plan offer, GPT-5 vs 2.5 Pro, Grok as rogue AI, Setting up inbox filters with GPT agent


OpenAI ▷ #gpt-4-discussions (12 messages🔥):

Gemini Pro vs GPT-5, GPT-5 channel archived, GPT Regression, GPT Freezing


OpenAI ▷ #prompt-engineering (19 messages🔥):

GPT-5 rollout, Model instruction following, Web search API tips, Automotive logo design, SVG logos


OpenAI ▷ #api-discussions (19 messages🔥):

GPT-4o Performance, Steering API for web search, Model Changes, Logo Design with AI, SVG logos


HuggingFace ▷ #announcements (1 message):

HF Hub Milestones, Trackio Features, Claude Image Generation, CUDA Kernel Guide, ZeroGPU Speed Improvements


HuggingFace ▷ #general (254 messages🔥🔥):

GPU script issues, Medgemma inference, abliterated model fine-tuning, Cohere research scholar program, Visual similarity scoring


HuggingFace ▷ #today-im-learning (3 messages):

Causality Handbook, Robotics SOTA, SmolVLA


HuggingFace ▷ #cool-finds (1 message):

Software Development Roadmaps, Developer Roadmaps, github.com


HuggingFace ▷ #i-made-this (15 messages🔥):

DINOv3 for satellite imagery, Pathlint for code cleanup, Gemma3 from scratch, BwETAFv3 CLMs, Medical reasoning GPT


HuggingFace ▷ #reading-group (2 messages):

LLM Hallucinations, OpenAI Paper, Confidence Slider for LLMs, Dataset Recommendations


HuggingFace ▷ #core-announcements (1 message):

H100s, Hopper-series GPUs, Flash Attention 3, Diffusers, ZeroGPU


HuggingFace ▷ #computer-vision (3 messages):

Dynamic Autoencoders, Image Padding, GANs Stability, Traditional Mud Emulation


HuggingFace ▷ #NLP (3 messages):

AI Training Costs, AI Agents in Production, Data Anonymization, Datatune


HuggingFace ▷ #smol-course (5 messages):

smol-course materials, AI/ML roadmap, smol course registration


HuggingFace ▷ #agents-course (8 messages🔥):

Introductions, Real World AI Agent project


Eleuther ▷ #general (126 messages🔥🔥):

AI-induced psychosis, Semantic drift, LLMs sycophancy, Logical Reasoning in LLMs, AI content in Google Search


Eleuther ▷ #research (62 messages🔥🔥):

Information Theory and Power Laws for Language Models and Stochastic Processes Paper Criticism, Compressing KV cache, Redundant functional motifs in neural networks, New 3T dataset from PDFs, Eval framework from Aleph Alpha


Eleuther ▷ #lm-thunderdome (1 message):

Calibration Scores, LM Eval Harness


Eleuther ▷ #gpt-neox-dev (26 messages🔥):

QK Norm, RoPE vs NoPE, Gradient magnitudes, Pythia head size


Latent Space ▷ #ai-general-chat (124 messages🔥🔥):

Dot App Shutdown, Hashbrown v0.3 Release, Anthropic Copyright Settlement, Codex Team Podcast, AI Evals Debate


Latent Space ▷ #ai-announcements (12 messages🔥):

AI Engineer CODE Summit 2025, FAL AI Valuation, Latent Space Podcast


Latent Space ▷ #genmedia-creative-ai (29 messages🔥):

Image Model Comparisons, Nano Banana Model, Veo 3 Pricing, Hybrid AI Animation, Banana Straightener


Moonshot AI (Kimi K-2) ▷ #general-chat (127 messages🔥🔥):

Kimi K2 Research Uses, Kimi K1.5 vs Kimi K2, American AI vs Chinese AI, Perplexity User Base, EQ Bench accuracy


DSPy ▷ #show-and-tell (5 messages):

JTBD Validator, DSPy for Business Validation, Multi-Agent Systems with DSPy and GEPA, DSPy Weekly Newsletter, AI Agents Play Taboo


DSPy ▷ #papers (1 message):

single-label classification, named entity recognition


DSPy ▷ #general (91 messages🔥🔥):

VibeVoice Repo, Async Speedup, Nano Banana Hackathon, Data Hygiene + Eval Reflexivity, DSPy Project Structure


Modular (Mojo 🔥) ▷ #general (17 messages🔥):

Apple GPU, Mojo Use Cases Beyond AI, Ray Tracing in Mojo, Community Meeting


Modular (Mojo 🔥) ▷ #mojo (39 messages🔥):

AMD GPU Issue, ROCm Version, Tier 3 GPU Support, EmberJson Explicit Copies, Dict API Improvements


Modular (Mojo 🔥) ▷ #max (31 messages🔥):

SDK Cache TensorValue, ModuleV3, max.tensor symbolic dimensions, Model Serialization with Pickle, MAX model format


Yannick Kilcher ▷ #general (48 messages🔥):

Low Rank Updates vs Replacement, Sparsity and Quantization in LLMs, Distillation and Model Complexity, Codex IDE and Code Generation, Arxiv Paper without Empirical Results


Yannick Kilcher ▷ #agents (6 messages):

RAG agent resources, Langchain alternatives, While loops for agents


Yannick Kilcher ▷ #ml-news (13 messages🔥):

In-Memory Computing, OpenAI Jobs Platform, ASML Investing in LLMs, Custom Pre-trained Models, Mistral's Profitability


aider (Paul Gauthier) ▷ #general (33 messages🔥):

Aider Success Stories, Aider vs Fully Agentic Tools, Token Efficiency with Codex, GPT-5 with Aider, Fast and Effective Models for Web Dev


aider (Paul Gauthier) ▷ #questions-and-tips (5 messages):

Aider Code Generation Percentage, Aider's Safety Mechanisms, Reasoning Effort and Edit Actions, Confirmation System in Aider, Linting Configuration


tinygrad (George Hotz) ▷ #general (20 messages🔥):

Kernel Removal Project, Digital Ocean Issues, ShapeTracker Bounty Removal, Tinygrad Community Bounties, Meeting #87 Topics


tinygrad (George Hotz) ▷ #learn-tinygrad (2 messages):

graph_rewrite_map(), Tensor vs MathTrait


Manus.im Discord ▷ #general (20 messages🔥):

Manus bugging out, Manus API key, New MCP and API connectors, Flowith invitation, Politeness to AIs