Frozen AI News archive

Cohere Command A Reasoning beats GPT-OSS-120B and DeepSeek R1 0528

**Cohere's Command A Reasoning** model outperforms GPT-OSS in open deep research capabilities, emphasizing agentic use cases for 2025. **DeepSeek-V3.1** introduces a hybrid reasoning architecture that toggles between reasoning and non-reasoning modes and is optimized for agentic workflows and coding, with extensive long-context pretraining (~630B tokens for 32k context, ~209B for 128k), FP8 training, and a large MoE design (~37B parameters activated per token). Benchmarks show competitive performance, with notable improvements on SWE-Bench and other reasoning tasks. DeepSeek API pricing is $0.56/M input tokens and $1.68/M output tokens, and the model enjoys rapid ecosystem integration, including HF weights, INT4 quantization by Intel, and vLLM reasoning toggles. Community feedback highlights the hybrid design's pragmatic approach to agent and software engineering workflows, though some note the lack of tool use in reasoning mode.
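To make the quoted API rates concrete, here is a rough sketch of a per-request cost estimate. Only the two rates come from the summary above; the function name and the example token counts are hypothetical, and the sketch ignores any caching discounts or other pricing tiers, which the summary does not cover.

```python
# Rates quoted in the summary above (USD per million tokens on the DeepSeek API).
INPUT_USD_PER_M = 0.56
OUTPUT_USD_PER_M = 1.68


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the quoted flat rates."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical agent turn: 200k tokens of context in, 8k tokens out.
example = estimate_cost_usd(200_000, 8_000)
print(f"${example:.4f}")
```

At these rates, output tokens cost three times as much as input tokens, so long reasoning traces are likely to dominate the bill for short prompts.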


A new SOTA open model.

AI News for 8/20/2025-8/21/2025. We checked 12 subreddits, 544 Twitters and 29 Discords (229 channels, and 7429 messages) for you. Estimated reading time saved (at 200wpm): 605 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

We last checked in on Cohere's Command A in March and then again last week with their $7B series D, but we didn't think we'd be talking about Cohere again so soon. Command A Reasoning puts GPT-OSS to shame, according to Cohere's own evals.

Importantly for the killer agentic use case of 2025, it is a very decent open deep research model.


AI Twitter Recap

DeepSeek V3.1: hybrid reasoning release, agent focus, and early results

Cohere’s Command A Reasoning and other new reasoning models

Google AI: Gemini efficiency paper, agentic Search, Veo access, and Gov platform

Reasoning, RL, and evals: new methods and benchmarks

Systems and tooling: APIs, serving, and dev infra

Research highlights (vision, multimodal, 3D, embodied)

Top tweets (by engagement)


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. DeepSeek V3.1: Anthropic API compatibility + thinking-mode benchmarks

2. Model releases/ports: DeepSeek-V3.1 HF card and Kimi-VL-A3B-Thinking GGUF (llama.cpp PR #15458)

3. Efficiency & scaling: 1–8 bit quantization guide, 100k H100 under-scaling, and 160GB VRAM local build

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo


AI Discord Recap

A summary of Summaries of Summaries by X.ai Grok-4

Theme 1. DeepSeek V3.1 Drops with Mixed Vibes

Theme 2. ByteDance's Seed-OSS Models Stir Buzz

Theme 3. Hardware Hurdles and Benchmarks Heat Up

Theme 4. Training Tricks and Datasets Dominate

Theme 5. Industry Shifts and Safety Snags


Discord: High level Discord summaries

LMArena Discord


Unsloth AI (Daniel Han) Discord


OpenRouter (Alex Atallah) Discord


Cursor Community Discord


LM Studio Discord


OpenAI Discord


Eleuther Discord


Latent Space Discord


GPU MODE Discord


Yannic Kilcher Discord


HuggingFace Discord


Notebook LM Discord


Nous Research AI Discord


Moonshot AI (Kimi K-2) Discord


aider (Paul Gauthier) Discord


DSPy Discord


Cohere Discord


MCP (Glama) Discord


Modular (Mojo 🔥) Discord


LlamaIndex Discord


Manus.im Discord Discord


tinygrad (George Hotz) Discord


Nomic.ai (GPT4All) Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Torchtune Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Codeium (Windsurf) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.




Discord: Detailed by-Channel summaries and links

LMArena ▷ #general (951 messages🔥🔥🔥):

nano-banana model, Video Arena problems, DeepSeek V3.1, Gemini 3


LMArena ▷ #announcements (2 messages):

Video Arena Bot, Deepseek v3.1, LMArena Models


Unsloth AI (Daniel Han) ▷ #general (887 messages🔥🔥🔥):

ByteDance Seed Model, GRPO Training, DeepSeek V3.1 Quants, Nvidia's GPUs and Pricing, GLM-4.5 Cline Integration


Unsloth AI (Daniel Han) ▷ #introduce-yourself (1 message):

.zackmorris: Hello


Unsloth AI (Daniel Han) ▷ #off-topic (27 messages🔥):

GRPO 20mb alloc fail, ChatGPT's deep research, Grok-4, Repetition penalty, RAG


Unsloth AI (Daniel Han) ▷ #help (101 messages🔥🔥):

Retinal Photo Training Strategies, GPT-OSS 20B Deployment on Sagemaker, Unsloth Zoo Issues, GGUF Loading with Unsloth, Gemma 3 Vision Encoder Training Loss


Unsloth AI (Daniel Han) ▷ #showcase (11 messages🔥):

WildChat-4M-English-Semantic-Deduplicated dataset, Behemoth-R1-123B-v2 model, GPU Rich Flex


Unsloth AI (Daniel Han) ▷ #research (7 messages):

Qwen3-4B finetuning, TTS with Gemini 270m, Mixture Models, JetMoE, BAM


OpenRouter (Alex Atallah) ▷ #announcements (1 message):

Cloudflare outage, Generations API stability


OpenRouter (Alex Atallah) ▷ #app-showcase (4 messages):

OpenRouter Cost Dashboard, Average Request Size, Gemini Input Token Calculation


OpenRouter (Alex Atallah) ▷ #general (528 messages🔥🔥🔥):

Deepseek pricing, OpenRouter rate limits, Gemini banning, Using OpenRouter with RAG systems, 4.6T parameter model


OpenRouter (Alex Atallah) ▷ #new-models (3 messages):



OpenRouter (Alex Atallah) ▷ #discussion (16 messages🔥):

Qwen3 coder 480b, DeepSeek v3 0324, Zero return from generative AI, Google Gemini 400 Error, Cohere reasoning model


Cursor Community ▷ #general (432 messages🔥🔥🔥):

Claude Cache Reads, Sonic Model origin, Open Sourcing Agentwise, Cursor API costs with Auto agent, DeepSeek V3.1


Cursor Community ▷ #background-agents (11 messages🔥):

Agent Auditing, MySQL Installation in Background Agents, Background Task Errors, Remote IDE connection to Background Agent


LM Studio ▷ #general (141 messages🔥🔥):

CUDA Errors with 4070 TI Super, LM Studio multi-GPU performance, SerpAPI integration with LM Studio, GPT-OSS Performance, Model parameter configuration for VRAM usage


LM Studio ▷ #hardware-discussion (54 messages🔥):

Z390 Designare vs Threadripper/Epyc, Qwen3-30B-A3B-Instruct-2507-GGUF Benchmarks, Model M Buckling Spring Keyboards, GGUF vs MLX on Apple M4 Max, Running GPT-OSS-20b on Apple M1


OpenAI ▷ #ai-discussions (167 messages🔥🔥):

Machine-to-Machine Economies, AI safeguards, Decentralized AI projects, Few-shot examples for Large Prompts, GPT-5's Direct Responses


OpenAI ▷ #gpt-4-discussions (9 messages🔥):

GPT-4 projects UI files, AI court legal case, Android app development with GPT, Token usage for uploaded content, GPT server issues


OpenAI ▷ #prompt-engineering (6 messages):

AI Quiz generation, GPT models quitting


OpenAI ▷ #api-discussions (6 messages):

AI Generated Quizzes, GPT-5 Random Quitting, Plausible Response Options, LLM Stochasticity


Eleuther ▷ #general (96 messages🔥🔥):

PileT5-XL embeddings as instructions, Networks that process in latent space, Multimodal generative models, image editing models, Latent space editing


Eleuther ▷ #research (54 messages🔥):

SSL objectives, Medical event pretraining, Noise-data trajectories, ByteDance's Prover, Unfriendly Activation Steering


Eleuther ▷ #scaling-laws (1 message):

Model Overtraining, Token Repetition in Models


Eleuther ▷ #interpretability-general (11 messages🔥):

Qwen3 Training, Weight lifting from llama series, Head isolation


Eleuther ▷ #gpt-neox-dev (2 messages):

Muon Support, Slurm Script for NeoX Job with Docker


Latent Space ▷ #ai-general-chat (83 messages🔥🔥):

Meta AI Reorg, GPT-5-pro truncation, Bank Teller Rotations Inspired Dropout, Meta AI Hiring Freeze, ByteDance Seed-OSS LLMs


Latent Space ▷ #genmedia-creative-ai (13 messages🔥):

Wonda AI, Billionaires Fight Club, Qwen Image Editing


GPU MODE ▷ #general (25 messages🔥):

Hackathon start time, ChatGPT CUDA lies, Hackathon prerequisites, Single huge epoch vs multiple smaller epochs, CUDA vs Triton


GPU MODE ▷ #triton (1 message):

Triton, AMD, NVIDIA, GPU, Data Layout


GPU MODE ▷ #cuda (10 messages🔥):

CUDA deployment, CudaWrangler, Dynamic Linking


GPU MODE ▷ #torch (1 message):

PyTorch Contributor Awards 2025, Recognizing Innovation in PyTorch


GPU MODE ▷ #beginner (1 message):

honeyspoon: how bad is the infinity server for embedding speeds compared to something like sglang


GPU MODE ▷ #off-topic (1 message):

snektron: I prefer Stolwijker


GPU MODE ▷ #rocm (11 messages🔥):

AMD GPU debugger, rocGDB, SPIRV parser, libspirv


GPU MODE ▷ #metal (2 messages):

C=AB matmul, ALU utilization, buffer read bandwidth, float4x4 matmul, float4 / metal::dot kernel


GPU MODE ▷ #reasoning-gym (1 message):

miserlou1241: Very cool!


GPU MODE ▷ #general-leaderboard (12 messages🔥):

torch.compile errors, local evaluation issues


GPU MODE ▷ #submissions (11 messages🔥):

Trimul Leaderboard Updates, B200 Performance, H100 Performance, MI300 Performance


GPU MODE ▷ #factorio-learning-env (3 messages):

Opus 4.1, Steel Plate Production, Task Emphasis, Red Science Production


GPU MODE ▷ #cutlass (3 messages):

ND Layouts, colex


GPU MODE ▷ #multi-gpu (10 messages🔥):

Infiniband at home, Distributed training library, NCCL backend, IBGDA requirements


Yannic Kilcher ▷ #general (33 messages🔥):

Infinite Memory, Arxiv paper guide, LLMs for Legal Field, HRM Models Analysis, Message Passing Approaches


Yannic Kilcher ▷ #paper-discussion (46 messages🔥):

Personality GAN, AI Welfare, Genome Conscious?, Super Weight, LLM Preferences


Yannic Kilcher ▷ #ml-news (8 messages🔥):

Yann LeCun's position at FAIR, Thermodynamic computing chip, AI Slurs, Energy Efficiency in AI


HuggingFace ▷ #general (67 messages🔥🔥):

max_steps confusion, levelbot space visits, model hallucination at high tokens, Pro version payment issues, root mean square norm quantization error


HuggingFace ▷ #i-made-this (3 messages):

AgentX Trading Platform, Language Diffusion Models, Local AI Workspace PDF Reader


HuggingFace ▷ #NLP (1 message):

Hugging Face Learn course, 422 Error


HuggingFace ▷ #agents-course (4 messages):

Hugging Face Certificates, Agents vs MCP Course, Agent tool, LLM tasks


Notebook LM ▷ #use-cases (19 messages🔥):

Gems for podcast generation, NotebookLM podcast length, Customizing NotebookLM podcasts, Analyzing Terms of Use and Privacy Policies, South Park episode on Terms and Conditions


Notebook LM ▷ #general (51 messages🔥):

Video Length Limits, Study guide on android app, Audio Language Change, Public Sharing Issue, Notebook LM API


Nous Research AI ▷ #general (65 messages🔥🔥):

Base Model Release, Ideal 30B Model, FA2 and Context, Qwen Scaling, Importance Matrix Calibration Datasets


Moonshot AI (Kimi K-2) ▷ #general-chat (47 messages🔥):

DeepSeek V3.1, R-Zero LLM Training Method, Energy availability in China vs US, Kimi K2 combined with Better image gen than gpt 5


aider (Paul Gauthier) ▷ #general (36 messages🔥):

Gemini 2.5 Pro Failure, Qwen CLI Charging, GPT-5 Benchmarks, DeepSeek v3.1 Pricing, OpenRouter Think Mode


aider (Paul Gauthier) ▷ #questions-and-tips (3 messages):

aider stdout issue, polyglot benchmark on llama cpp


aider (Paul Gauthier) ▷ #links (1 message):

end4749: <@293486003245809664> spam? ^


DSPy ▷ #show-and-tell (1 message):

marimo notebooks, Graph RAG with DSPy, DSPy modules optimization


DSPy ▷ #papers (5 messages):

IBM AutoPDL paper, DSPy code readability, Justification of work


DSPy ▷ #general (28 messages🔥):

dspy.GEPA version, finetuning dspy descriptions, saving optimized programs, context length for GEPA, KPMG onboarding


Cohere ▷ #🧵-general-thread (13 messages🔥):

Citation issues with command-a-03-2025, Guaranteed citations, command-a-reasoning release, RAG with Langchain, Cohere vs Qwen3-coder 30B


Cohere ▷ #📣-announcements (1 message):

Command A Reasoning Model, Enterprise AI, Agentic AI Platform


Cohere ▷ #🔌-api-discussions (4 messages):

Cohere Embed-v4 on Azure AI Foundry, Cohere Python Library Document Object


Cohere ▷ #👋-introduce-yourself (7 messages):

MLE Research, Independent Interpretability Research, AI Innovation and Value Creation, Enterprise Workflows


MCP (Glama) ▷ #general (12 messages🔥):

C# client library, MCP server's instructions field, MCP servers, generate_test_prompt.md, GitHub


MCP (Glama) ▷ #showcase (10 messages🔥):

Web-curl, MCP-Boss, MCP Explained Video, SWAG-MCP, MCP Routing


Modular (Mojo 🔥) ▷ #general (2 messages):

Modverse #50, Custom Server Tag


Modular (Mojo 🔥) ▷ #mojo (10 messages🔥):

kgen and pop documentation, MLIR dialects, pop.union alignment bug, Github issue 5202


Modular (Mojo 🔥) ▷ #max (7 messages):

TextGenerationPipeline 'execute' method, Custom inference loops for retrieving logits, Language allocators and OOM handling


LlamaIndex ▷ #blog (2 messages):

Enterprise document AI, vibe-llama


LlamaIndex ▷ #general (13 messages🔥):

HuggingFace CrossEncoder Duplication, Agent creation project, AI Safety Survey


Manus.im Discord ▷ #general (13 messages🔥):

Credits Purchase, Tickets Issues, Contest Rigging Accusations, Free Daily Credits, Referral Credits


tinygrad (George Hotz) ▷ #general (7 messages):

Overworld const folding, View(const) refactor, UPat cvar and UPat.const_like redefinition, RANGEIFY=1 Impact, base removal


Nomic.ai (GPT4All) ▷ #general (3 messages):

GPT4ALL Enterprise vs Free, Model Selection for LocalDocs