Frozen AI News archive

not much happened today

**Grok-3**, a new family of LLMs from **xAI** trained on roughly **200,000 Nvidia H100 GPUs** for advanced reasoning, outperforms models from **Google, Anthropic, and OpenAI** on math, science, and coding benchmarks. **DeepSeek-R1** achieves top accuracy on **SuperGPQA**, a challenging new benchmark from **ByteDance Research**. **SigLIP 2** from **Google DeepMind** improves semantic understanding and OCR with flexible resolutions and multilingual capabilities, and is available on Hugging Face. **OpenAI's o3-mini-high** ranks #1 on coding and math prompts. **Perplexity's R1 1776**, a post-trained version of DeepSeek-R1, is available on Ollama. The **Llamba** family distills **Llama-3.x** into efficient recurrent models with higher throughput. **AlphaMaze** combines DeepSeek-R1 with GRPO for visual reasoning on ARC-AGI puzzles. **Audiobox Aesthetics** from **Meta AI** offers unified quality assessment for audio. The community notes that Grok 3's large compute increase yields only modest performance gains.

Canonical issue URL

AI News for 2/20/2025-2/21/2025. We checked 7 subreddits, 433 Twitters and 29 Discords (212 channels, and 6493 messages) for you. Estimated reading time saved (at 200wpm): 663 minutes. You can now tag @smol_ai for AINews discussions!

You can catch up on Day 2 of the AI Engineer Summit now.

https://www.youtube.com/watch?v=D7BzTxVVMuw


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

Models and Benchmarks, highlighting model releases, performance metrics, and comparisons

Open Source and Community, focusing on open releases, community engagement, and developer tools

Hardware and Infrastructure, covering GPUs, compute, and optimization efforts

Research and Techniques, covering new methodologies, algorithms, and theoretical discussions

Applications and Products, highlighting AI product announcements and use cases

Memes and Humor, light-hearted or funny tweets related to AI


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. DeepSeek's Bold Move to Open-Source 5 Repos

Theme 2. Langchain's Enduring Complexity and Workflow Challenges

Theme 3. Experimenting Spatial Reasoning in LLMs with GRPO

Theme 4. Head-to-Head: Deepseek R1 vs. Grok 3 Performance

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.0 Flash Thinking

Theme 1. Grok 3 and ChatGPT Face Off: Coding Prowess and Censorship Debates

Theme 2. Cursor IDE's 0.46 Update: Stability Questioned, Claude Outputs Shift

Theme 3. Unsloth AI: VRAM Crushing GRPO and Accuracy Audits

Theme 4. Hugging Face: Spark Engine Ignites, Gradio Sketches No-Code

Theme 5. OpenRouter and Perplexity Face API and Performance Heat


PART 1: High-level Discord summaries

OpenAI Discord


Cursor IDE Discord


Unsloth AI (Daniel Han) Discord


Codeium (Windsurf) Discord


HuggingFace Discord


Perplexity AI Discord


OpenRouter (Alex Atallah) Discord


Stability.ai (Stable Diffusion) Discord


aider (Paul Gauthier) Discord


Nous Research AI Discord


GPU MODE Discord


Interconnects (Nathan Lambert) Discord


Yannick Kilcher Discord


Eleuther Discord


MCP (Glama) Discord


Modular (Mojo šŸ”„) Discord


Notebook LM Discord


Latent Space Discord


Torchtune Discord


LlamaIndex Discord


Nomic.ai (GPT4All) Discord


tinygrad (George Hotz) Discord


Cohere Discord


DSPy Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

OpenAI ā–· #announcements (1 message):

Operator Rollout, Regional Availability


OpenAI ā–· #ai-discussions (921 messagesšŸ”„šŸ”„šŸ”„):

Grok 3 vs ChatGPT Plus, Translation features in AI, AI capabilities in programming, User experiences with various AI models, Deepseek and its alternatives

Links mentioned:


OpenAI ā–· #gpt-4-discussions (61 messagesšŸ”„šŸ”„):

OpenAI Teams and Operator, Coding capabilities comparison, Community feedback on feature requests, User experiences with Teams, Moderation and community interaction


OpenAI ā–· #prompt-engineering (7 messages):

Prompt Evaluation, GPT Builder Reasoning Prompts, English Grammar Improvement, CS Communication Challenges


OpenAI ā–· #api-discussions (7 messages):

GPT Builder Prompt Effectiveness, English Grammar Learning Prompts, Feedback on Outputs, Improving Prompt Design


Cursor IDE ā–· #general (908 messagesšŸ”„šŸ”„šŸ”„):

Cursor 0.46 Release, Claude Model Performance, API Integration Issues, User Experience with MCP, Feedback on AI Tooling

Links mentioned:


Unsloth AI (Daniel Han) ā–· #general (444 messagesšŸ”„šŸ”„šŸ”„):

Unsloth AI updates, Multi-GPU Training, Qwen Model Fine-Tuning, GPU Comparisons, Reward Functions in AI Models

Links mentioned:


Unsloth AI (Daniel Han) ā–· #off-topic (6 messages):

Triton Inline Assembly, Dequantization Results, Performance Discrepancies

Link mentioned: triton.language.inline_asm_elementwise — Triton documentation: no description found


Unsloth AI (Daniel Han) ā–· #help (42 messagesšŸ”„):

Qwen 2.5 VL Performance, Using LoRA with vLLM, GRPO Training Confusion, Multi-GPU Training Support, Fine-tuning Script Models

Links mentioned:


Unsloth AI (Daniel Han) ā–· #showcase (4 messages):

RAG chunking issues, Spark Engine release, LLM Spatial Reasoning testing

Links mentioned:


Unsloth AI (Daniel Han) ā–· #research (214 messagesšŸ”„šŸ”„):

AI in Medical Diagnostics, Clinical Trials and Ethics, Psychological Diagnosis, Potential AI Enhancements, Use of LLMs in Specialized Fields

Links mentioned:


Codeium (Windsurf) ā–· #discussion (32 messagesšŸ”„):

Codeium Features, JupyterLab Extension Issues, IntelliJ Autocompletion Problems, Windsurf IDE Expectations, Codeium Support and Feedback Channels

Links mentioned:


Codeium (Windsurf) ā–· #windsurf (294 messagesšŸ”„šŸ”„):

Codeium support issues, Windsurf features and bugs, Using Cascade with MCP, User experience with Windsurf, Feature requests for Windsurf

Links mentioned:


HuggingFace ā–· #general (94 messagesšŸ”„šŸ”„):

HuggingFace Chatbot Preferences, Training Models & Performance Issues, Audio Generation Models, COLD Dataset Insights, SmolAgents Exploration

Links mentioned:


HuggingFace ā–· #today-im-learning (2 messages):

Tensor Parallelism, Neuralink Advancements


HuggingFace ā–· #cool-finds (2 messages):

Universal Transformers Dataset, MassiveDS-140B Release, Data Access and Community Engagement, Data Protection Measures

Links mentioned:


HuggingFace ā–· #i-made-this (5 messages):

Spark Engine Release, Cyclic KL Beta Manager, HF Model Testing

Links mentioned:


HuggingFace ā–· #reading-group (4 messages):

Parameter size vs training data, Channel posting etiquette


HuggingFace ā–· #computer-vision (1 message):

OCR lightweight models, InternVL model errors


HuggingFace ā–· #NLP (10 messagesšŸ”„):

HuggingFace NLP Course, Finetuning Models, Modular Arithmetic in Coding Theory, Recommended NLP Books


HuggingFace ā–· #gradio-announcements (1 message):

Gradio Sketch, No-code Gradio apps, Gradio app deployment, Terminal commands for Gradio


HuggingFace ā–· #smol-course (9 messagesšŸ”„):

Running Agents with Smolagents, Understanding Space Duplication, HF Token for Agent Functions, DevOps Relevance to the Course, Certification Space PR for Module 1

Links mentioned:


HuggingFace ā–· #agents-course (125 messagesšŸ”„šŸ”„):

Participant Introductions, Token Access and Model Usage, Course Feedback, Technical Issues with Installation, Study Buddy Requests

Links mentioned:


Perplexity AI ā–· #general (221 messagesšŸ”„šŸ”„):

Perplexity Pro performance issues, Deep Research citation problems, R1 model functionality, AI model comparison, Learning Python and AI/ML

Links mentioned:


Perplexity AI ā–· #sharing (15 messagesšŸ”„):

Vim Exiting Instructions, 8085 Simulator Implementation, Taiwan's Independence Debate, Phyt Intelligence, New iPhone 17 Design

Link mentioned: YouTube: no description found


Perplexity AI ā–· #pplx-api (8 messagesšŸ”„):

Deep Research API, Sonar vs Llama Models, Model Configuration Changes, Image Transmission via API

Link mentioned: no title found: no description found


OpenRouter (Alex Atallah) ā–· #app-showcase (2 messages):

Weaver Demo, Chrome Extension by Amir

Links mentioned:


OpenRouter (Alex Atallah) ā–· #general (235 messagesšŸ”„šŸ”„):

Model Access and Features, OpenRouter Documentation, DeepSeek Model Performance, API Usage and Integration, Reverse Engineering Concerns

Link mentioned: OpenRouter: A unified interface for LLMs. Find the best models & prices for your prompts


Stability.ai (Stable Diffusion) ā–· #general-chat (231 messagesšŸ”„šŸ”„):

Hiring Stable Diffusion Expert, Flux and SD Models Discussion, Stability Matrix Configuration, Using Civitai for Models, Image Generation Challenges

Links mentioned:


aider (Paul Gauthier) ā–· #general (179 messagesšŸ”„šŸ”„):

Grok 3 Performance, DeepSeek Capabilities, AI Pricing Models, Transformers vs. LSTMs, Tech CEO Opinions

Links mentioned:


aider (Paul Gauthier) ā–· #questions-and-tips (38 messagesšŸ”„):

Switching between editor and architect modes, Managing repositories with Aider, Using Aider with ignored files, Differences between architect and code modes, Updating files in real-time within Aider

Links mentioned:


aider (Paul Gauthier) ā–· #links (3 messages):

AI Assisted Coding, LLMs Productivity, Benchmarking Models


Nous Research AI ā–· #general (184 messagesšŸ”„šŸ”„):

Grok Bars Interpretation, VRAM Requirements for Fine-Tuning, MiniCPM-o 2.6 Release, AI-Assisted Coding Education, Self-Improving AI Agents

Links mentioned:


Nous Research AI ā–· #ask-about-llms (4 messages):

Cursor vs Groq performance, Training Small Reasoning Model, Model size impacts, Recent AI research paper

Link mentioned: qwen2.5-3B-openmath-grpo: Explore and run machine learning code with Kaggle Notebooks | Using data from No attached data sources


Nous Research AI ā–· #research-papers (6 messages):

Equilibrium Propagation, Vector Field Dynamics, Recurrent Backpropagation, Arcee-Maestro-7B, AlphaMaze Visual Reasoning

Links mentioned:


Nous Research AI ā–· #interesting-links (3 messages):

Reinforcement Learning for LLMs, SmolVLM2 updates, Small video models

Links mentioned:



GPU MODE ā–· #general (22 messagesšŸ”„):

ROCm support for 6700XT, Discussion on AI paper response, Chinese language channel interest, Open Sourcing from DeepSeek AI, GPU cluster utilization issues

Links mentioned:


GPU MODE ā–· #triton (5 messages):

TMA Descriptor in Triton, GridQuant Example, BLOCK_SIZE and num_warps Interaction, Thread Block Size Clarification

Links mentioned:


GPU MODE ā–· #cuda (6 messages):

GEMM Kernels in CUTLASS, Memory Configuration in CUTLASS, TF32 NT Kernel Development, 1D Block Scaling in cuBLAS

Link mentioned: cutlass/include/cutlass/gemm/collective/sm90_mma_tma_gmma_rs_warpspecialized.hpp at main Ā· NVIDIA/cutlass: CUDA Templates for Linear Algebra Subroutines. Contribute to NVIDIA/cutlass development by creating an account on GitHub.


GPU MODE ā–· #torch (3 messages):

Parameter dtype casting in PyTorch, Dynamic quantization in attention mechanisms, INT8 vs FP8 weight behavior, Overriding .to method in PyTorch


GPU MODE ā–· #announcements (1 message):

GPU MODE Meetup in San Jose, SemiAnalysis Blackwell Hackathon, Beyond CUDA Summit, CUDA Developer Talks

Links mentioned:


GPU MODE ā–· #algorithms (1 message):

GRPO VRAM Reduction, Extended Context Lengths, Gradient Checkpointing, Linear Cross Entropy Implementation

Link mentioned: Tweet from Unsloth AI (@UnslothAI): Today, we’re launching new algorithms that enable 10x longer context lengths & 90% less VRAM for training Reasoning Models (GRPO). Using Unsloth, you can now train your own reasoning model with just 5G...
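The "group-relative" part of GRPO can be sketched without any framework: each sampled completion's reward is normalized against its own group's mean and standard deviation, which is what lets the method drop the separate value network. A minimal illustration (the function name is ours, not Unsloth's or DeepSeek's):

```python
# Hedged sketch of GRPO's group-relative advantage: rewards for a group of
# completions sampled from one prompt are standardized against the group's
# own statistics, so no learned value baseline is needed.
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Standardize each reward against the group mean and (population) std."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: four completions for one prompt, scored by some reward function.
# The resulting advantages are centered around zero by construction.
advs = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Completions scoring above the group mean get positive advantages and are reinforced; the rest are pushed down, all relative to siblings from the same prompt.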


GPU MODE ā–· #cool-links (13 messagesšŸ”„):

Nanotron by Hugging Face, Hopper GPU Architecture, HadaCore and Hadamard Transforms, MLGym Framework, CuAsmRL for GPU Scheduling

Links mentioned:


GPU MODE ā–· #jobs (4 messages):

A5Labs ML Engineer hiring, Cohere Low Level Performance Engineers, Nebius DevRel Advocate position, Beam Platform and Infrastructure Engineers

Links mentioned:


GPU MODE ā–· #beginner (7 messages):

Fine-tuning LLM with DDP, Single GPU vs. DDP with 2 GPUs


GPU MODE ā–· #pmpp-book (1 message):

Code Evaluation, Grok3 Performance, C++ Memory Semantics

Link mentioned: CUDA Memory Ordering and Synchronization | Shared Grok Conversation: if (threadIdx.x == 0) { while(AtomicAdd(&flags[bid], 0) == 0) {} // <?> why do I not need thread fen


GPU MODE ā–· #off-topic (2 messages):

Unsloth.AI, Building PC for Llama 70B

Link mentioned: Start Up Wednesday with Unsloth.AI: Meet Daniel and Michael Han, the Australian brothers transforming AI development with Unsloth. Their open-source project makes model fine-tuning 2x faster wh...


GPU MODE ā–· #rocm (1 message):

lynn4400: what do you guys use to see profiler results?


GPU MODE ā–· #liger-kernel (1 message):

Native Sparse Attention, Collaboration in liger


GPU MODE ā–· #self-promotion (9 messagesšŸ”„):

Open Source GPU Glossary, NUMA and CPU-GPU memory interactions, Hopper+ materials, Hardware architecture terms, GPU interface differences

Links mentioned:


GPU MODE ā–· #šŸæ (4 messages):

The Fish in Codebases, MLGym Framework, TritonBench for Language Models, Sakana AI CUDA Engineer Story, Hardmaru's Optimizer Humor

Links mentioned:


GPU MODE ā–· #edge (7 messages):

3D Printing Parameters, Real-Time Translation with Audio Models, Whisper Model Functions


GPU MODE ā–· #reasoning-gym (95 messagesšŸ”„šŸ”„):

Code I/O dataset, Math reasoning tasks, Decimal number comparison, Model performance, API hosting for free

Links mentioned:


GPU MODE ā–· #gpuęØ”å¼ (3 messages):

Triton content, Zhihu discussions


Interconnects (Nathan Lambert) ā–· #news (94 messagesšŸ”„šŸ”„):

OpenAI's Projected Revenue and Infrastructure Spending, Emerging AI Models and Reasoning Innovations, Open Source Developments in AI, Market Dynamics in AI Infrastructure, AI Agents Revenue Disparity

Links mentioned:


Interconnects (Nathan Lambert) ā–· #ml-questions (12 messagesšŸ”„):

Benchmarking models, Parsing PDFs, O1 Pro limitations, Scraping benchmarks for reference, ODR model limitations


Interconnects (Nathan Lambert) ā–· #ml-drama (3 messages):

Sakana Leaderboard Update, Microsoft Quantum Computing Claims, Speedup on Conv3d Tasks

Links mentioned:


Interconnects (Nathan Lambert) ā–· #random (51 messagesšŸ”„):

Licenses Confirmation for Qwen, Granite Vision Model Release, Anthropic AI's Employee Retention, GRPO Training for LLMs, Online Test for Hunyuan Video Model

Links mentioned:


Interconnects (Nathan Lambert) ā–· #memes (3 messages):

AIME 2025 Performance, Grok Models, Microsoft's New State of Matter

Links mentioned:


Interconnects (Nathan Lambert) ā–· #cv (3 messages):

SigLIP 2, Multilingual Vision-Language Encoders

Link mentioned: Tweet from AK (@_akhaliq): Google just dropped SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features


Interconnects (Nathan Lambert) ā–· #reads (1 message):

GPU Glossary, Charles Frye's Talk, UCSC Event, LLM Development, Streaming Multiprocessor Architecture

Links mentioned:


Yannick Kilcher ā–· #general (47 messagesšŸ”„):

Logits vs Probabilities, Normalization in Training, Diffusion Models, LoRA Limitations, Unreliable Training Data

Links mentioned:


Yannick Kilcher ā–· #paper-discussion (62 messagesšŸ”„šŸ”„):

DeepSeek Research, Paper Presentation Series, Sparsity in Models, Conditional Attention vs. Sparse Attention, Direct Policy Optimization (DPO)

Link mentioned: Manuscript | Arc Institute: Arc Institute is an independent nonprofit research organization headquartered in Palo Alto, California.


Yannick Kilcher ā–· #ml-news (14 messagesšŸ”„):

Helix Model Introduction, Open Source Week by DeepSeek, Unsloth.AI's AI Development, Chinese VideoGen Market Dominance, Reinforcement Learning

Links mentioned:


Eleuther ā–· #general (14 messagesšŸ”„):

AI CUDA Engineer, ChatGPT System Prompt Leak, CoT Summarizer Performance, Locale and Model Behavior, ClosedAI Model Limitations

Links mentioned:


Eleuther ā–· #research (66 messagesšŸ”„šŸ”„):

Sakana project issues, COLM reputation, Funding and resources for EleutherAI

Links mentioned:


Eleuther ā–· #scaling-laws (2 messages):

Scaling Rules Paper, Training Compute-Optimal Large Language Models, Model Convergence


Eleuther ā–· #lm-thunderdome (4 messages):

Model Path Errors, Private Help Requests, OpenAI and Local-Chat Completions


Eleuther ā–· #gpt-neox-dev (22 messagesšŸ”„):

NCCL_BUFFSIZE adjustments, BF16 mixed-precision training, NeoX definition of a step, Checkpoint divergence handling, Gradient accumulation in FP32
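On the gradient-accumulation point: for a loss that is a mean over examples, averaging per-micro-batch gradients reproduces the full-batch gradient exactly in exact arithmetic; accumulating in FP32 is about limiting rounding error in the running sum, not changing the math. A framework-free sketch with hypothetical names (not NeoX code):

```python
# Toy model y ā‰ˆ w * x with mean-squared-error loss; names are illustrative.

def grad_mse(w, xs, ys):
    """d/dw of mean((w*x - y)^2) over the batch."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def accumulated_grad(w, micro_batches):
    """Average the gradients of equally sized micro-batches."""
    grads = [grad_mse(w, xs, ys) for xs, ys in micro_batches]
    return sum(grads) / len(grads)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]
full_batch = grad_mse(0.5, xs, ys)
accumulated = accumulated_grad(0.5, [(xs[:2], ys[:2]), (xs[2:], ys[2:])])
# full_batch and accumulated agree: both average over all four examples.
```

The equivalence holds only when micro-batches are equally sized and the loss averages over examples; with sum-reduction or ragged micro-batches the accumulated gradient needs rescaling.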


MCP (Glama) ā–· #general (68 messagesšŸ”„šŸ”„):

MCP Server Setup, Using Custom Context in MCP, Automating Testing Lifecycle with MCP, MCP Integration with Cursor and LibreChat, MCP and Discord Interactions

Links mentioned:


MCP (Glama) ā–· #showcase (7 messages):

Sage release, MCP.run vendor lock-in, MCP-server and client capabilities, AI Discord bot integration, Music playback in Discord

Link mentioned: Wolf Of Wall Street Lets Goo GIF - Wolf Of Wall Street Lets Goo - Discover & Share GIFs: Click to view the GIF


Modular (Mojo šŸ”„) ā–· #general (1 message):

eggsquad: <@1275513561199935621> The Modular branded Patagonia sweater goes hard


Modular (Mojo šŸ”„) ā–· #mojo (42 messagesšŸ”„):

Mojo Windows Support, Mojo vs. Python, GPU Performance in Mojo, Concurrency Handling in Mojo, Debugging Mojo Programs

Links mentioned:


Notebook LM ā–· #use-cases (8 messagesšŸ”„):

AI's Limitations in Creative Writing, Concerns About Trusting AI, Using NotebookLM for Writing Assistance, Guides for Effective NotebookLM Usage, Character and Word Limits in NotebookLM Plus

Link mentioned: no title found: no description found


Notebook LM ā–· #general (25 messagesšŸ”„):

Organizing Notebooks, Usage of Audio Deep Dive, NotebookLM iOS App, Quality of NotebookLM Answers, Sharing Limitations of NotebookLM

Links mentioned:


Latent Space ā–· #ai-general-chat (26 messagesšŸ”„):

Arize's Series C Funding, OpenAI's User Growth, Deep Seek Launchweek, Facebook's Reasoning Dataset, NEO Gamma Robotics

Links mentioned:


Latent Space ā–· #ai-announcements (1 message):

swyxio: https://x.com/aiDotEngineer/status/1892934641067360444


Torchtune ā–· #general (14 messagesšŸ”„):

pytest errors, Installing dependencies, Test artifacts, Community support


Torchtune ā–· #dev (13 messagesšŸ”„):

MLGym Launch, GRPO Optimizations, Width/Depth Pruning Discussion, GRPO PR Engagement, Assistant Opportunities in GRPO

Links mentioned:


LlamaIndex ā–· #blog (2 messages):

LlamaParse upgrades, AI infrastructure talks, Document parsing modes, Training and fine-tuning applications


LlamaIndex ā–· #general (18 messagesšŸ”„):

Multi-Agent Handoff Issues, Creating AI from Documents, Visual Workflow Interfaces

Links mentioned:


Nomic.ai (GPT4All) ā–· #general (20 messagesšŸ”„):

NOMIC implementation, GPT4All local setup, Document querying issues, Model settings and tuning, Chat template extraction

Link mentioned: QuantFactory/NeuralDaredevil-8B-abliterated-GGUF Ā· Hugging Face: no description found


tinygrad (George Hotz) ā–· #general (16 messagesšŸ”„):

Performance Optimization in Testing, GROUP OptOps on CPUs and GPUs, Agentic CUDA Kernel Search, LLVM vs. CLANG Performance Differences, Concatenation Bounty Challenges

Link mentioned: [Bounty] Made TestSpeed.test_sum yellow on Macs with LLVM by josephsweeney Ā· Pull Request #9190 Ā· tinygrad/tinygrad: To make this happen, I enabled GROUP OptOps on devices without local variables (CLANG and LLVM) by adding an extra reduce instead of emitting locals. The other necessary changes came d...


tinygrad (George Hotz) ā–· #learn-tinygrad (3 messages):

tinygrad linearizer, codebase searching

Link mentioned: tinygrad/tinygrad/codegen/linearize.py at master Ā· tinygrad/tinygrad: You like pytorch? You like micrograd? You love tinygrad! ā¤ļø - tinygrad/tinygrad


Cohere ā–· #discussions (4 messages):

Cohere Embedding Models, Channel Colors, Benchmark Leaderboards

Link mentioned: EvalAI: Evaluating state of the art in AI: EvalAI is an open-source web platform for organizing and participating in challenges to push the state of the art on AI tasks.


Cohere ā–· #cmd-r-bot (2 messages):

Half Rest Techniques


DSPy ā–· #general (1 message):

DSPy Customization, Chat History Integration, Performance Improvements

Link mentioned: Feature request: Allow specifying chat history for LMs Ā· Issue #1435 Ā· stanfordnlp/dspy: Hi DSPy developers, First of all, thanks a lot for this great work! Recently I've been trying to integrate DSPy into my work, but I stumbled upon the chat history specification. My task is to desi...





{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}