Frozen AI News archive

lots of little things happened this week

**Anthropic** introduced a novel 'think' tool enhancing instruction adherence and multi-step problem solving in agents, with combined reasoning and tool use demonstrated by **Claude**. **NVIDIA**'s **Llama-3.3-Nemotron-Super-49B-v1** ranked #14 on LMArena, noted for strong math reasoning and a 15M post-training dataset. **Sakana AI** launched a Sudoku-based reasoning benchmark to advance AI problem-solving capabilities. **Meta AI** released **SWEET-RL**, a reinforcement learning algorithm improving long-horizon multi-turn tasks by 6%, and introduced **CollaborativeAgentBench**, a benchmark for collaborative LLM agents working with humans on programming and design tasks. **Percy Liang** relaunched the **HELM** benchmark with 5 challenging datasets evaluating 22 top language models.

Canonical issue URL

AI News for 3/20/2025-3/21/2025. We checked 7 subreddits, 433 Twitters and 29 Discords (227 channels, and 3009 messages) for you. Estimated reading time saved (at 200wpm): 318 minutes. You can now tag @smol_ai for AINews discussions!

all this and more in the Twitter/Reddit/Discord recaps. We hope to ship the weekly AINews this weekend.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

Models and Benchmarks

Language Model Development and Releases

AI Applications and Tools

AI Community and Events

Optimization and Training

Humor


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. SpatialLM: LLM for 3D Scene Understanding

Theme 2. Qwen 3: Modular AI Model Developments

Theme 3. Docker's Competitive Leap: LLM in Containers

Theme 4. Gemma 3, Mistral 24B, and QwQ 32B: Performance Comparison

Theme 5. ByteDance's InfiniteYou: Identity-Preserving Image Model

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding

Theme 1. 5 Second Flux Innovation: Nunchaku, InfiniteYou, and Step-Video-TI2V

Theme 2. Text-to-Video AI Advancements: From Open-Source Initiatives

Theme 3. Critique of LLM Evaluation Methods: Simplification & Blame

Theme 4. AI-Generated Satire and Historical Reconstructions

Theme 5. AI Art and Workflow Transparency Debates


AI Discord Recap

A summary of Summaries of Summaries by o1-2024-12-17

Theme 1. Pricing Showdowns and Censorship Woes

Theme 2. Model Upgrades and Debates

Theme 3. Fine-Tuning Adventures and VRAM Tussles

Theme 4. New Tools, Agents, and RAG

Theme 5. Tokenizer Tricks, Synthetic Data, and Hardware Upgrades


PART 1: High level Discord summaries

Cursor Community Discord


Unsloth AI (Daniel Han) Discord


OpenAI Discord


LM Studio Discord


aider (Paul Gauthier) Discord


Perplexity AI Discord


Interconnects (Nathan Lambert) Discord


LMArena Discord


Notebook LM Discord


Nous Research AI Discord


HuggingFace Discord


MCP (Glama) Discord


OpenRouter (Alex Atallah) Discord


GPU MODE Discord


Nomic.ai (GPT4All) Discord


Yannic Kilcher Discord


LlamaIndex Discord


Cohere Discord


Modular (Mojo 🔥) Discord


DSPy Discord


tinygrad (George Hotz) Discord


Torchtune Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Codeium (Windsurf) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Cursor Community ▷ #general (789 messages🔥🔥🔥):

Cursor pricing, Claude 3.7, Vibe coding, Pear AI vs Cursor, React vs Svelte

Links mentioned:


Unsloth AI (Daniel Han) ▷ #general (241 messages🔥🔥):

Gemma 3 issues, Llama failing in Gemma environment, Vision fine-tuning on Gemma 3, QLoRA for Gemma 3, Synthetic data generation

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (3 messages):

Unsloth Submissions, Tiny-grad Spreadsheet for tasks, GitHub issues with high involvement


Unsloth AI (Daniel Han) ▷ #help (95 messages🔥🔥):

DPO Trainer Upgrade, Zephyr DPO Notebook Confusion, Gemma 3 (27b) inference issue, Unsloth save and push to hub during training, Unsloth finetuning voice models

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (7 messages):

LLM chatbot, Personality bots

Link mentioned: Vite + React: no description found


Unsloth AI (Daniel Han) ▷ #research (8 messages🔥):

Foundation Model Training, Tree-of-Thought, Monte Carlo Tree Search


OpenAI ▷ #ai-discussions (296 messages🔥🔥):

OpenAI Pricing, o1 Model Architecture, Grok Deep Research, Perplexity desktop app

Link mentioned: AI Model & API Providers Analysis | Artificial Analysis: Comparison and analysis of AI models and API hosting providers. Independent benchmarks across key performance metrics including quality, price, output speed & latency.


OpenAI ▷ #gpt-4-discussions (7 messages):

GPT Pro, Subscription Issues, OpenAI Support


OpenAI ▷ #prompt-engineering (22 messages🔥):

Model Personalization, Structured output effect on reasoning, GPT memory usage, GitHub Copilot Pull Request Descriptions


OpenAI ▷ #api-discussions (22 messages🔥):

Prompt Engineering Adaptability, Model Guessing and Bias, Model Memory and Personalization, Structured Output and Reasoning, GitHub Copilot PR Optimization

Create a pull request body description that:
- Always begins with: "This pull request introduces..."
- Includes the following sections: **Additions**, **Fixes**, **Refactors**, and **Deletions** where possible.
- Avoids any references to commit messages, links, or minor changes (such as TypeScript interface tweaks).
- Provides a short, bullet-point summary for each section.
- Maintains the same uniform, consistent structure.

LM Studio ▷ #general (103 messages🔥🔥):

LM Studio Server API for RAG, ZeroGPU Pro Upgrade Issues, Browser Extensions for LM Studio, Audio Model Training with PyTorch, Speculative Decoding Crashes


LM Studio ▷ #hardware-discussion (136 messages🔥🔥):

RX 9070, Vulkan performance degradation, ROCm support, Gemma3 memory allocation

Link mentioned: Rtx 2080ti GIF - Rtx 2080ti - Discover & Share GIFs: Click to view the GIF


aider (Paul Gauthier) ▷ #general (207 messages🔥🔥):

Claude Code vs Aider Web Search, Aider's --no-verify Flag, o1-pro API experiences, Aider install for all users Ubuntu

Links mentioned:


aider (Paul Gauthier) ▷ #questions-and-tips (14 messages🔥):

Aider failing tests, Aider documentation, Aider help command, aider.el package

Links mentioned:


Perplexity AI ▷ #general (204 messages🔥🔥):

Deep Research Limits, GPT 4.5 Model, Switching Models, Perplexity apps, Coding AI

Links mentioned:


Perplexity AI ▷ #sharing (6 messages):

RAGs, LLM Email Reply System, NotebookLM, Deep Reasoning


Perplexity AI ▷ #pplx-api (7 messages):

API Key Spend Tracking, search_domain_filter Documentation, R1-1776 Open Source Weights, MCP issues

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (76 messages🔥🔥):

Claude Web Search, Midjourney -> Cursor, TokenSet Image Generation, Qwen3 Release, Hunyuan-T1

Links mentioned:


Interconnects (Nathan Lambert) ▷ #random (23 messages🔥):

Esoteric Total Ordering, Unitree Robotics Kip-Up, Claude uses Brave Search, Capybara Logo Change, Token Counting Inflation

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (2 messages):

Anthropic Job Application, AI Alignment, Vibe Check


Interconnects (Nathan Lambert) ▷ #cv (6 messages):

Sonnet 3.7 Benchmarks, InternVL Open Source Training Code, OpenAI operator use case


Interconnects (Nathan Lambert) ▷ #reads (16 messages🔥):

RLAIF-V for MLLM Trustworthiness, Skill-Dependent Scaling Laws, Scaling RL Compute for Reasoning, SuperBPE Tokenizer, SWEET-RL for Multi-Turn LLM Agents

Links mentioned:


Interconnects (Nathan Lambert) ▷ #policy (6 messages):

Scaling Laws for Language Models, Data Requirements for GPT-4.5


LMArena ▷ #general (108 messages🔥🔥):

p2l-router-7b-0318 Model, Claude Overrated?, Google AI Studio API, Deepseek R1, Qwen 3 Coming Soon

Links mentioned:


Notebook LM ▷ #use-cases (21 messages🔥):

Podcast Feature in NotebookLM, NotebookLM vs Gemini, Efficient PDF Processing Workflow, AI Avatar Lip Syncing Services, Mindmap Feature rollout


Notebook LM ▷ #general (74 messages🔥🔥):

Flashcard Generation, NotebookLM vs Chatbase, Premium Voice Overview Limits, Mind Map Feature Rollout, Whitelist NotebookLM Crawler


Nous Research AI ▷ #general (79 messages🔥🔥):

Nvidia Blackwell RTX Pro series, Data filtering strategies, DeepHermes 24B OOM issues, WorldSim appreciation

Links mentioned:


Nous Research AI ▷ #ask-about-llms (8 messages🔥):

Hermes 3 Llama 3.2 3B, Model Parameters, Response Generation Issues

Link mentioned: NousResearch/Hermes-3-Llama-3.2-3B-GGUF · Hugging Face: no description found


Nous Research AI ▷ #research-papers (1 message):

teknium: https://x.com/nick11roberts/status/1902875088438833291?s=46


Nous Research AI ▷ #reasoning-tasks (3 messages):

Nous Hermes 2, C# Development, Anthropic LLMs


HuggingFace ▷ #general (50 messages🔥):

Hugging Face API Outage, Roblox Voice Safety Classifier, Local Models for Speed & Privacy vs Cloud Models, Merge Multiple GPU VRAM, MagicQuill Low Quality Images

Links mentioned:


HuggingFace ▷ #today-im-learning (1 message):

richieghost: Today I'm learning Pytorch Frame.


HuggingFace ▷ #i-made-this (3 messages):

Ollama Gradio UI with Kokoro TTS, Little-Geeky-s-Learning-UI, Oblix AI orchestration platform, Edge-Cloud transitions

Links mentioned:


HuggingFace ▷ #computer-vision (1 message):

.mwayne: https://blog.roboflow.com/fine-tune-sam-2-1/amp/


HuggingFace ▷ #smol-course (2 messages):

Manual Looping vs Vectorization, GSM8K Dataset, Tokenizer ChatML Format, Certifications


HuggingFace ▷ #agents-course (24 messages🔥):

HF Course Certificate, Unit 2.1 Error, AI agent for UI automation, Langfuse Error, Smolagent model to run locally

Links mentioned:


MCP (Glama) ▷ #general (69 messages🔥🔥):

mcp-mysql-server issues, fastmcp Framework, Vibe Coding, DaVinci Resolve MCP update, Glama API outage

Links mentioned:


MCP (Glama) ▷ #showcase (6 messages):

Microsoft Semantic Workbench, Turso MCP tool video, Asana MCP + Google Calendar MCP, MCPHub.nvim + Avante + Figma MCP

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (64 messages🔥🔥):

OpenRouter TTS, Ernie Models, Sambanova, Inferencenet, OpenAI audio models

Links mentioned:


GPU MODE ▷ #general (9 messages🔥):

vast.ai ncu profiling, Jake, Spam detection with neural nets


GPU MODE ▷ #triton (6 messages):

cuTile talk, atomic addition with bfloat16, triton 3.1.0 and triton-windows 3.2.0, Triton's ease of use, sparse attention pattern

Link mentioned: native-sparse-attention-pytorch/native_sparse_attention_pytorch/triton_native_sparse_attention.py at main · lucidrains/native-sparse-attention-pytorch: Implementation of the sparse attention pattern proposed by the Deepseek team in their "Native Sparse Attention" paper - lucidrains/native-sparse-attention-pytorch


GPU MODE ▷ #cuda (3 messages):

FlashMLA SmemLayoutP, Pointer Tagging

Link mentioned: FlashMLA/csrc/flash_fwd_mla_kernel.h at b31bfe72a83ea205467b3271a5845440a03ed7cb · deepseek-ai/FlashMLA: FlashMLA: Efficient MLA decoding kernels. Contribute to deepseek-ai/FlashMLA development by creating an account on GitHub.


GPU MODE ▷ #torch (5 messages):

ZeRO offload, full-finetuning, 8B model, BF16, A100 40GB


GPU MODE ▷ #algorithms (2 messages):

GPU Mode Scammer, Discord Channel Alerts


GPU MODE ▷ #lecture-qa (3 messages):

Hopper Architecture, Microbenchmarking, Matrix Multiplication


GPU MODE ▷ #self-promotion (3 messages):

GTC Presentation, CUDA Kernels, Small Transformer Models, Hopper Architecture, CUTLASS 4.0


GPU MODE ▷ #reasoning-gym (4 messages):

Deprecated Coach class, Curriculum Experiments


GPU MODE ▷ #submissions (11 messages🔥):

Leaderboard Submissions, GPU Tests


GPU MODE ▷ #hardware (2 messages):

Consumer GPUs, Cloud GPUs, Local vs Cloud


Nomic.ai (GPT4All) ▷ #general (35 messages🔥):

Oblix, AI Orchestration, Local LLM for SFW Stories, LLM Leaderboards, PC Build for Medical Data

Links mentioned:


Yannic Kilcher ▷ #general (10 messages🔥):

W-GAN saturation, Transformers soft slots, MCP UX/UI

Link mentioned: OpenAI.fm: An interactive demo for developers to try the new text-to-speech model in the OpenAI API


Yannic Kilcher ▷ #paper-discussion (3 messages):

G-Retriever, Graph Question Answering, Graph RAG

Link mentioned: G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering: Given a graph with textual attributes, we enable users to `chat with their graph': that is, to ask questions about the graph using a conversational interface. In response to a user's questions...


Yannic Kilcher ▷ #ml-news (16 messages🔥):

Claude Pokemon, AI Moore's Law, Hunyuan-T1 model

Links mentioned:


LlamaIndex ▷ #blog (1 message):

Local RAG app, GitIngest parsing, Streamlit UI, Ollama Llama 3.2


LlamaIndex ▷ #general (13 messages🔥):

LlamaIndex TypeScript Agent Import Issue, Agent Workflow Parallel Execution Limits, Human-in-the-Loop Tool Limitations

Link mentioned: [Question]: Parallel Human in Loop with Agent Workflow Issues · Issue #18220 · run-llama/llama_index: Question Validation I have searched both the documentation and discord for an answer. Question Searching and debugging a long time to find a solution. Thanks for any help! When an agent workflow is...


Cohere ▷ #「💬」general (4 messages):

Trial Key Limits, Command-A Training Data


Cohere ▷ #「🔌」api-discussions (4 messages):

Cohere API Errors, Rate Limiting, Checking Rate Limits

Link mentioned: Errors (status codes and description) — Cohere: Understand Cohere's HTTP response codes and how to handle errors in various programming languages.


Cohere ▷ #「🤖」bot-cmd (3 messages):

Bot Permissions


Cohere ▷ #「🤝」introductions (2 messages):

Introductions, Low-code tech, Community Engagement


Modular (Mojo 🔥) ▷ #mojo (12 messages🔥):

Duration Module Proposal, Mojo and PyTorch Integration, Nanosecond Precision as Base Unit


DSPy ▷ #general (1 message):

MIPRO v2, LLM-as-a-judge, Automatic Metrics, DSPy Optimization, Evaluation Metrics

Links mentioned:


tinygrad (George Hotz) ▷ #general (1 message):

Unet3d model, 2D Convolutions


Torchtune ▷ #papers (1 message):

krammnic: I like this: https://arxiv.org/pdf/2502.07923






{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}