Frozen AI News archive

DeepSeek V3.1: 840B token continued pretrain, beating Claude 4 Sonnet at 11% of its cost

**DeepSeek** released **DeepSeek V3.1**, a quietly rolled-out open model with a **128K context window** and improvements in **token efficiency**, coding, and agentic benchmarks. **ByteDance** launched the permissive **Seed-OSS 36B** model on Hugging Face, noted for long-context and reasoning capabilities. **Zhipu AI** introduced **ComputerRL**, a reinforcement learning framework for computer-use agents, achieving strong benchmark results. In developer tooling, **GitHub Copilot** expanded globally, **Microsoft VS Code** integrated **Gemini 2.5 Pro** and updated **GPT-5** agent prompts, and **Anthropic** launched **Claude Code** seats with spend controls. Open-source fine-tuning advances include **Together AI** adding SFT for **gpt-oss-120B/20B** and **Baseten** enabling multinode 120B training with Truss CLI. The community noted mixed performance and ongoing post-training adjustments for DeepSeek V3.1.

sorry for the late post, deepseek's official post was quite late

AI News for 8/19/2025-8/20/2025. We checked 12 subreddits, 544 Twitters and 29 Discords (229 channels, and 6600 messages) for you. Estimated reading time saved (at 200wpm): 517 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

As discussed yesterday, DeepSeek followed up their characteristically quiet model release with a remarkably low-key tweet and blog post that carried their official messaging and evals:

The standard knowledge benchmark bumps are incremental:

but there are important improvements in coding and agentic benchmarks that make it more useful for agents.

However, the major story may be even more subtle: token efficiency improvements!

The Reddit dissection of DSV3.1 is particularly strong, so just scroll on down.


AI Twitter Recap

China’s open models and agents: DeepSeek V3.1, ByteDance Seed‑OSS 36B, Zhipu’s ComputerRL

Coding agents and developer tooling

Agent training and RL: scaling recipes that matter

Benchmarks, evaluation quality, and systems scaling

Vision and multimodal editing: Qwen Image Edit takes the crown

Product velocity and usage: Perplexity scale-up, Claude Code in orgs, GPT‑5 UX split, Google’s AI phones

Top tweets (by engagement)


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. DeepSeek V3.1 Updates, Efficiency and Head-to-Head Benchmarks

2. New Open-Source Model Launches: IBM/NASA Surya and ByteDance Seed-OSS-36B

3. Indie Open-Source Innovations: Mobile AndroidWorld Agent and TimeCapsuleLLM (1800s London)

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Unitree and Boston Dynamics Humanoid Robot Updates

2. Image Edit Model Benchmarks and Workflows (Qwen, WAN 2.2, Image Edit Arena)

3. Veo-3 AI Video Generation Demos and Guides


AI Discord Recap

A summary of Summaries of Summaries by X.ai Grok-4

Theme 1. Model Mayhem: Releases and Rivalries Rock Leaderboards

Theme 2. Fine-Tuning Frenzy: GRPO and Datasets Drive Tweaks

Theme 3. Hardware Havoc: GPUs Battle for AI Supremacy

Theme 4. Tooling Turmoil: APIs and Agents Evolve Amid Bugs

Theme 5. Industry Intrigue: Valuations, Talent Wars, and AI Returns


Discord: High level Discord summaries

Perplexity AI Discord


Unsloth AI (Daniel Han) Discord


LMArena Discord


Cursor Community Discord


LM Studio Discord


OpenRouter (Alex Atallah) Discord


OpenAI Discord


GPU MODE Discord


Latent Space Discord


Nous Research AI Discord


HuggingFace Discord


Moonshot AI (Kimi K-2) Discord


Yannic Kilcher Discord


LlamaIndex Discord


MCP (Glama) Discord


aider (Paul Gauthier) Discord


Modular (Mojo 🔥) Discord


Notebook LM Discord


tinygrad (George Hotz) Discord


DSPy Discord


Cohere Discord


Manus.im Discord Discord


LLM Agents (Berkeley MOOC) Discord


MLOps @Chipro Discord


Nomic.ai (GPT4All) Discord


Gorilla LLM (Berkeley Function Calling) Discord


The Torchtune Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Codeium (Windsurf) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.



Discord: Detailed by-Channel summaries and links

Perplexity AI ▷ #general (1232 messages🔥🔥🔥):

amazon.in not working, comet invites, GPTs Agents, OpenAI's sidebars, GPT Go Plan


Perplexity AI ▷ #sharing (4 messages):

Shareable Threads, Perplexity AI Newsletter, Sorion Unicode Tool


Perplexity AI ▷ #pplx-api (5 messages):

Perplexity API Status, API Groups Deletion


Unsloth AI (Daniel Han) ▷ #general (1028 messages🔥🔥🔥):

Unsloth App Updated to 3.1, GRPO Applied to Llama Model, VRAM Issues with Qwen3-4b-Instruct, Dataset Tools and Workflows, Blackwell RTX 50 Series and Unsloth Guide


Unsloth AI (Daniel Han) ▷ #introduce-yourself (1 messages):

Discord showing gaming handle, Privacy Concerns


Unsloth AI (Daniel Han) ▷ #off-topic (29 messages🔥):

ASUS ROG Matrix GeForce RTX 5090 30th Anniversary Limited Edition, CUDA out of memory


Unsloth AI (Daniel Han) ▷ #help (77 messages🔥🔥):

Gemma 3 270m CPT, Runpod setup, GPU requirements, DeepSeek V3.1 Quantization, Transformers version issue


Unsloth AI (Daniel Han) ▷ #showcase (18 messages🔥):

Gemma 3 4b finetune, Un-sycophantic BERT models, Swahili Gemma 1B, OpenHelix dataset


Unsloth AI (Daniel Han) ▷ #research (17 messages🔥):

L40S vs A100 Inference, GRPO for Llama, Qwen3-4B Finetuning with Unsloth


LMArena ▷ #general (1025 messages🔥🔥🔥):

Nano Banana Launch, Gemini 2.5 Pro vs GPT-5, DeepSeek v3.1 Issues, Image upload issues on LMArena


LMArena ▷ #announcements (2 messages):

Qwen-Image-Edit, Image Edit Leaderboard, LMArena


Cursor Community ▷ #general (859 messages🔥🔥🔥):

Sonic model, Token usage, Multi-agent setup, Code Quality


Cursor Community ▷ #background-agents (14 messages🔥):

API Key Authorization Issues, System Timeout Issues, NPM Package Pull Failure, Background Agent Functionality Issues, Git Configuration Problems


Cursor Community ▷ #announcements (1 messages):

New stealth model in Cursor, Partnered Model


LM Studio ▷ #general (305 messages🔥🔥):

Qwen3 BF16, GPT-OSS 20B, CUDA setup, Nvidia vs. AMD, AgentMode


LM Studio ▷ #hardware-discussion (78 messages🔥🔥):

Bolt Graphics AI roadmap, 3090 vs alternatives, AMD mi50 32gb, Qwen3-30ba3b on mi50, 1M context Q3-30B


OpenRouter (Alex Atallah) ▷ #announcements (1 messages):

Activity Analytics API, Allowed Models API, OpenRouter Developer APIs


OpenRouter (Alex Atallah) ▷ #general (211 messages🔥🔥):

OpenWebUI Memory Feature, GPT-5 Context Issues, Deepseek v3.1 Availability on OpenRouter, Stealth Model Speculation (Grok-4 Code), Free Model Options on OpenRouter


OpenRouter (Alex Atallah) ▷ #new-models (1 messages):

Readybot.io: OpenRouter - New Models


OpenRouter (Alex Atallah) ▷ #discussion (12 messages🔥):

LLMs Formatting Output, AFR Chanticleer AI Report, Google Gemini Models, OpenAI-standard complex content format, tool calling flows


OpenAI ▷ #ai-discussions (148 messages🔥🔥):

Gemini Storybook mode, AI Bots paying each other, Decentralized AI BOINC-style project, AI Moderation


OpenAI ▷ #gpt-4-discussions (15 messages🔥):

GPT Custom Actions, Vanished GPT, GPT5 Conversations, AI Agents and Workflows, AGI Arms Race


OpenAI ▷ #prompt-engineering (4 messages):

SCRIBE Prompt, Audio Steganography in Prompts, Model Interpretation of Prompts, Prompt Deconstruction, Impact of Language on Model Output


OpenAI ▷ #api-discussions (4 messages):

SCRIBE prompt analysis, Model understanding of complex prompts, Impact of language on model output, Prompt evaluation techniques


GPU MODE ▷ #general (84 messages🔥🔥):

Hackathon Invites, CUDA Kernel Optimization, Alienware R15, Sesh Bot Discord Calendar Sync


GPU MODE ▷ #cuda (2 messages):

cudaMemcpyAsync with late-bound addresses, NCU profiling issues with live register data, Kernel register pressure analysis


GPU MODE ▷ #jobs (6 messages):

SemiAnalysis job posting, New Grad Engineer, Performance engineering, CI/CD pipelines, LinkedIn tracking links


GPU MODE ▷ #beginner (6 messages):

CUDA setup on Ubuntu, AI companies and databases like ClickHouse, Embedding speeds of infinity server vs sglang


GPU MODE ▷ #youtube-recordings (1 messages):

stoicsmm: hi everyone


GPU MODE ▷ #torchao (1 messages):

topsy1581: Hi, does the TorchAO support grouped gemm for MXFP8 x MXFP8, or MXFP4 x MXFP4?


GPU MODE ▷ #off-topic (4 messages):

Gouda content, Mods are asleep


GPU MODE ▷ #irl-meetup (1 messages):

veer6174: Anyone here in Bangkok?


GPU MODE ▷ #triton-puzzles (3 messages):

Triton-Puzzles issues, torch 2.5.0 installation, numpy downgrade


GPU MODE ▷ #self-promotion (1 messages):

hariprasathvinayagam: try it


GPU MODE ▷ #🍿 (1 messages):

veer6174: Anyone setup emacs to edit google collab?


GPU MODE ▷ #reasoning-gym (1 messages):

SkyRL, ReasoningGym Integration


GPU MODE ▷ #submissions (2 messages):

A100 Leaderboard, MI300 Leaderboard


GPU MODE ▷ #factorio-learning-env (8 messages🔥):

Factorio Mods, FLE, Registry.py, Friday Meeting


GPU MODE ▷ #cutlass (1 messages):

Arithmetic types, TensorSSA objects, cute.full_like, wrapping logic


GPU MODE ▷ #singularity-systems (1 messages):

j4orz: updates to the book. prelims and appendices.


GPU MODE ▷ #multi-gpu (5 messages):

NCCL, ND-parallelism, GPU Parallelism Abstraction


Latent Space ▷ #ai-general-chat (65 messages🔥🔥):

xAI Talent Exodus, Anthropic Claude TOS violation, Internally Deployed Engineer, OpenAI Valuation, Responses API


Latent Space ▷ #genmedia-creative-ai (8 messages🔥):

PhotoAI orchestrating AI models, Wonda AI agent launch, AI for video generation


Nous Research AI ▷ #general (63 messages🔥🔥):

Deepseek thinking efficiency eval, GLM 4.5 V, Z.ai OS, xAI terrifies, DeepSeek V3.1 Base Discussions


Nous Research AI ▷ #ask-about-llms (3 messages):

Custom OpenAI endpoints


Nous Research AI ▷ #research-papers (3 messages):

Token Efficiency Study, AutoThink Evaluation


Nous Research AI ▷ #interesting-links (1 messages):

Open Source AI, Alignment Lab


HuggingFace ▷ #announcements (1 messages):

Ultra-Scale Playbook book, TEI v1.8.0 release, GLM4.5V transformers support, Google Gemma 3 270M, SAM2 in HF transformers


HuggingFace ▷ #general (51 messages🔥):

Affordable voice assistant, distutils.ccompiler error, transformers.js script, HF Team contact, Humor Genome Project


HuggingFace ▷ #i-made-this (1 messages):

On-Device Android App, LFM2-350M Model, Mobile AI, HuggingFace Models


HuggingFace ▷ #computer-vision (2 messages):

Jax Image Modeling, Vision Transformers, CLIP, SigLIP, DINOv2/v3


HuggingFace ▷ #smol-course (3 messages):

llama.cpp documentation


Moonshot AI (Kimi K-2) ▷ #general-chat (57 messages🔥🔥):

Deepseek vs Kimi K2, Moonshot AI merch, AI Gold Rush, Scam Alert


Yannic Kilcher ▷ #general (29 messages🔥):

LSTM vs Transformers, Bias Variance Tradeoff, Fast Inference for Sales Forecasting, Mamba Vision Optimization, ARC-AGI 1


Yannic Kilcher ▷ #paper-discussion (12 messages🔥):

VLM Chart Understanding Dataset, VLM Struggle Discussion, Personality GAN, Jester personality type, AI welfare


Yannic Kilcher ▷ #ml-news (5 messages):

AI Model Prompt Generation, Internal AGI, Yann LeCun's position at FAIR, Zuckerberg threads post


LlamaIndex ▷ #blog (2 messages):

StackAI, LlamaCloud, custom retrievers, generic vector search, domain-specific context


LlamaIndex ▷ #general (42 messages🔥):

Email Agent System Prompts, LlamaParse Extraction Errors, Terminating Running Workflows, Spreadsheet Agent Beta Release, Sync React Agents vs Async


LlamaIndex ▷ #ai-discussion (1 messages):

Self-hostable Knowledge Base, Qdrant Integration, Company-Wide Knowledge Base for AI


MCP (Glama) ▷ #general (37 messages🔥):

MCP Web App for Claude, Input Token Optimization with Claude 3.5 Sonnet, Self-Signed Certificate Error in Inspector, Aspire Inspector Configuration, MCP Server Information


MCP (Glama) ▷ #showcase (4 messages):

AI Agents as Insider Threats, MCP Server Vulnerabilities, Agentic Project Management (APM), Cloudship AI Station, MCPresso CLI for Server Development


aider (Paul Gauthier) ▷ #general (25 messages🔥):

Qwen3-Coder Performance, Aider and Tool Calling, Gemini 2.5 Pro Issues, Git Index Version Error, Blockchain Developer Availability


aider (Paul Gauthier) ▷ #questions-and-tips (10 messages🔥):

LiteLLM verbosity, Aider workflow, Model Aliases, Program output, Polyglot Benchmark


Modular (Mojo 🔥) ▷ #mojo (25 messages🔥):

GPU crashing issues, Synchronization barriers in Mojo, GPU P2P enabling, Mojo documentation and learning resources, Memory alignment in Mojo


Modular (Mojo 🔥) ▷ #max (5 messages):

Max + Modal Integration, Torch Max Backend, TextGenerationPipeline


Notebook LM ▷ #use-cases (9 messages🔥):

Spotify Podcast, Proto-Germanic AI Translation, Discord Moderation Needed, NotebookLM for Tabletop RPGs, GEMS sitrep


Notebook LM ▷ #general (20 messages🔥):

Youtube links import, Mobile App Offline Capability, Audio overview customization, NLM and PDF Images, Notebook sharing statistics


tinygrad (George Hotz) ▷ #general (13 messages🔥):

Hiring for Tests, CI Speed, Process Replay Multiprocessing, Linux (nv) vs Linux (ptx) Performance, Overworld Constant Folding


DSPy ▷ #general (10 messages🔥):

TIL cost is returned even if it's cached, optimiser which does a form of cross-validation, extract prompts from the optimized program


Cohere ▷ #👋-introduce-yourself (4 messages):

Introductions


Manus.im Discord ▷ #general (3 messages):

Manus credits, Backups, Provider switch


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (2 messages):

Next Cohort, Cohort signups


MLOps @Chipro ▷ #events (1 messages):

Functional Python for AI/ML, Persistent Memoization, Deterministic Parallelism, DataPhoenix


Nomic.ai (GPT4All) ▷ #general (1 messages):

Blockchain Development, DEXs, Trading Bots, Smart Contracts, DApp Frontends


Gorilla LLM (Berkeley Function Calling) ▷ #discussion (1 messages):

PelicanVLM-72B-Instruct, BFCL Tool Evaluation