Frozen AI News archive

AI Engineer Summit Day 1

The **AIE Summit** in NYC highlighted key talks, including **Grace Isford's Trends Keynote**, the **Neo4j/Pfizer presentation**, and **OpenAI's first definition of Agents**. Speakers announced **$930 million in funding**. On AI Twitter, discussion focused on the **Grok-3** and **o3-mini** models, with debates over performance and benchmarking, including **Grok-3's record compute scale of 4e26 to 5e26 FLOP**. The **o3-mini** model uncovered a critical **CUDA kernel bug** in Sakana AI's code. **DeepSeek-R1** was promoted as an open-source alternative, with discussion of its training batch sizes. Additionally, **Alibaba** announced the release of the **Qwen 2.5-VL** model.

Canonical issue URL

AI News for 2/19/2025-2/20/2025. We checked 7 subreddits, 433 Twitters and 29 Discords (211 channels and 6423 messages) for you. Estimated reading time saved (at 200wpm): 647 minutes. You can now tag @smol_ai for AINews discussions!

Day 1 of AIE Summit has concluded here in NYC.

If we had to pick just 3 talks to focus on, check out Grace Isford's Trends Keynote, the Neo4j/Pfizer presentation, and OpenAI defining Agents for the first time. $930m of funding was announced by speakers/sponsors. Multiple Anthropic datapoints went semi-viral.


You can watch back the full VOD here:

https://www.youtube.com/watch?v=L89GzWEILkM

Day 2 will focus on Agent Engineering, while Day 3 will have IRL workshops and the new Online track.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

Models, Benchmarks, and Performance

Open Source and Community

Research and Development

Robotics and Embodiment

Tools and Applications


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Qwen2.5-VL-Instruct excels in visual and video tasks

Theme 2. Reverb-7b Outperforms in Open LLM Leaderboards

Theme 3. SmolVLM2: Compact models optimizing video tasks

Theme 4. Open-source AI agents tackling new frontiers

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding

Theme 1. Multi-modal AI Systems: Bridging Text and Vision


AI Discord Recap

A summary of Summaries of Summaries by o1-preview-2024-09-12

Theme 1. Grok 3 Steals the Spotlight from OpenAI

Theme 2. Unsloth's GRPO Algorithm Slashes VRAM Requirements

Theme 3. AI CUDA Engineer's Wild Speedup Claims Raise Eyebrows

Theme 4. Microsoft's Quantum Leap with Majorana 1 Meets Skepticism

Theme 5. AI Companies Bag Big Bucks, Betting on Inference Boom


PART 1: High-level Discord summaries

OpenAI Discord


Codeium (Windsurf) Discord


Unsloth AI (Daniel Han) Discord


LM Studio Discord


aider (Paul Gauthier) Discord


Cursor IDE Discord


HuggingFace Discord


Perplexity AI Discord


Interconnects (Nathan Lambert) Discord


OpenRouter (Alex Atallah) Discord


Nous Research AI Discord


Yannick Kilcher Discord


GPU MODE Discord


Stability.ai (Stable Diffusion) Discord


Eleuther Discord


Notebook LM Discord


Torchtune Discord


Latent Space Discord


MCP (Glama) Discord


Modular (Mojo 🔥) Discord


LlamaIndex Discord


Cohere Discord


AI21 Labs (Jamba) Discord


tinygrad (George Hotz) Discord


Nomic.ai (GPT4All) Discord


LLM Agents (Berkeley MOOC) Discord


DSPy Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

OpenAI ▷ #ai-discussions (979 messages🔥🔥🔥):

Grok 3 performance, SuperGrok subscription, Comparison with OpenAI models, Grok's capabilities, Community feedback

Links mentioned:


OpenAI ▷ #gpt-4-discussions (1 message):

Feature Requests, Chat Tracking Methods


OpenAI ▷ #prompt-engineering (2 messages):

Software troubleshooting, Insights for improvement


OpenAI ▷ #api-discussions (2 messages):

Prompt issues, Software performance


Codeium (Windsurf) ▷ #announcements (1 message):

DeepSeek-V3 Unlimited, Windsurf Pro and Ultimate Plans, Prompt and Flow Action Credits

Link mentioned: Tweet from Windsurf (@windsurf_ai): DeepSeek-V3 is now unlimited in Windsurf Pro and Ultimate plans. 0 prompt credits. 0 flow action credits.


Codeium (Windsurf) ▷ #content (1 message):

MCP content, Use cases for MCP, MCP in Cascade

Link mentioned: Tweet from Windsurf (@windsurf_ai): If you're still having questions about MCP and its potential use cases, here's a quick demo on how MCP can work within Cascade!


Codeium (Windsurf) ▷ #discussion (86 messages🔥🔥):

Codeium plugin in JetBrains, Supercomplete feature, Windsurf installation requirements, Comparison of Codeium and CodeBuddy, Concerns about Codeium's support

Links mentioned:


Codeium (Windsurf) ▷ #windsurf (546 messages🔥🔥🔥):

Windsurf usability issues, DeepSeek vs Cascade Base, Memory system in Cascade, MCP server configuration, Support response inquiries

Links mentioned:


Unsloth AI (Daniel Han) ▷ #general (485 messages🔥🔥🔥):

Unsloth AI Models, GRPO Training Updates, Training Loss Issues, Distilled Model Performance, AI Community Insights

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (17 messages🔥):

Unsloth Art, Quantum Computing, Triton Language in Challenges, Cohesion Timing Hardware, Inline Assembly in Triton

Links mentioned:


Unsloth AI (Daniel Han) ▷ #help (32 messages🔥):

Installing Unsloth, RTX 5090 Mobile Specs, GPU Performance and Fine-tuning, VRAM Usage in Datasets, Qwen2.5 Model Inference Issues

Link mentioned: Fine-tuning Guide | Unsloth Documentation: Learn all the basics of fine-tuning.


Unsloth AI (Daniel Han) ▷ #showcase (4 messages):

RAG vs Fine Tuning, Video Examples, Kolo Usage, Industry Insights

Link mentioned: RAG vs. Fine Tuning (Live demo): Which is better RAG or Fine tuning? Does the industry have it wrong? Can fine tuning deliver better results than a traditional RAG system? Watch the video fo...


Unsloth AI (Daniel Han) ▷ #research (56 messages🔥🔥):

Rigor in Science, Citizen Science, AI in Medicine, Content Moderation Research, Phytochemical Formulations

Link mentioned: AI-Powered Phytochemical Formulation: A Data-Driven Approach to Supporting Health: no description found


LM Studio ▷ #general (381 messages🔥🔥):

Hunyuan Image Generation Model, A100 GPU Performance, Speculative Decoding Analysis, LM Studio Features, Embedding Models for Long Texts

Links mentioned:


LM Studio ▷ #hardware-discussion (190 messages🔥🔥):

Apple Silicon Performance, ARM vs x86 Architecture, Intel's Competition in the Market, Latest AMD Ryzen AI Max+ Specs, Memory Configuration and Performance

Links mentioned:


aider (Paul Gauthier) ▷ #general (358 messages🔥🔥):

Grok 3 Performance, Aider Integration Challenges, Elon Musk's Influence on AI, DeepSeek-R1 Comparison, AI Model Cost Efficiency

Links mentioned:


aider (Paul Gauthier) ▷ #questions-and-tips (20 messages🔥):

Model Configuration in Aider, Editor vs Architect Mode, Font Color Changes in Aider, Using Local Models, NPM Package Management

Links mentioned:


aider (Paul Gauthier) ▷ #links (2 messages):

Slow Build Process, RAG vs AI Chat Performance, Costs of Indexing


Cursor IDE ▷ #general (354 messages🔥🔥):

Cursor IDE updates, Grok 3 performance, Sonnet 3.5 issues, MCP server functionality, AI model discussions

Links mentioned:


HuggingFace ▷ #general (79 messages🔥🔥):

Hugging Face Hardcover Release, Qwen2.5 Training Improvement, Video Generators on HF Spaces, Coding Models Discussion, Spark Engine Discord Community

Links mentioned:


HuggingFace ▷ #today-im-learning (2 messages):

Quantum Computing, Majorana 1, Satya Nadella's innovations

Link mentioned: Majorana 1 - Why Quantum Computing Matters Now: Introduction: A Potential New Era of Computing Imagine a computer so powerful it could solve problems in minutes that would take today’s fastest supercomputers billions of years ...


HuggingFace ▷ #cool-finds (3 messages):

Zurich 14B Model, Hugging Face Spaces

Link mentioned: Zurich 14B - a rubenroy Collection: no description found


HuggingFace ▷ #i-made-this (10 messages🔥):

CommentRescueAI, Aster audio search app, ASR dataset for Ukrainian, docSmith documentation generator, NotAnAI.ai

Links mentioned:


HuggingFace ▷ #reading-group (1 message):

Substack on LLMs, Code & Cognition

Link mentioned: Unlocking Lightning Fast LLMs: The Power of KV Caching: Have you ever wondered how AI chatbots respond almost instantly, despite running massive language models under the hood? The secret lies in a powerful optimization technique called KV caching.
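The linked post's core idea can be shown in a few lines: cache each token's key/value vectors so a decoding step only computes attention for the newest query instead of re-processing the whole prefix. A minimal single-head NumPy toy (using the hidden state directly as key and value, where a real model would apply learned projections):

```python
import numpy as np

def attend(q, K, V):
    # Scaled dot-product attention for a single query vector.
    scores = K @ q / np.sqrt(q.shape[0])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

class KVCache:
    """Append-only cache: each decoding step adds one key/value pair
    instead of recomputing projections for the whole prefix."""
    def __init__(self, d):
        self.K = np.empty((0, d))
        self.V = np.empty((0, d))

    def append(self, k, v):
        self.K = np.vstack([self.K, k])
        self.V = np.vstack([self.V, v])

rng = np.random.default_rng(0)
d = 4
cache = KVCache(d)
outputs = []
for step in range(3):
    x = rng.standard_normal(d)   # stand-in for the new token's hidden state
    cache.append(x, x)           # real models cache learned K/V projections
    outputs.append(attend(x, cache.K, cache.V))
```

Per decoding step this does O(t) work against the cached prefix rather than O(t^2) for recomputing all pairs, which is where the "almost instant" responses come from.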


HuggingFace ▷ #core-announcements (1 message):

Lumina2 Fine-Tuning, LoRA Implementation

Link mentioned: diffusers/examples/dreambooth/README_lumina2.md at main · huggingface/diffusers: 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX. - huggingface/diffusers


HuggingFace ▷ #NLP (5 messages):

Quantifying Summarization and Charts, NLP Learning Resources, Fine-tuning Chat Models, Modular Arithmetic in Coding Theory


HuggingFace ▷ #smol-course (4 messages):

HF Learn Course Implementation, New Units for Course


HuggingFace ▷ #agents-course (238 messages🔥🔥):

Unit 2.1 Publication Status, Accessing Hugging Face Models, Troubleshooting Dummy Agent Library, Introducing Team Members, Questions about Course Format

Links mentioned:


Perplexity AI ▷ #general (243 messages🔥🔥):

Perplexity AI usage issues, Grok 3 performance comparison, Deep Research functionality, O3 and O3 Mini models, API integration and capabilities

Links mentioned:


Perplexity AI ▷ #sharing (23 messages🔥):

AI Hedge Fund outperforming market, Mexico vs Google over Gulf, Bipedal muscular robots, Glowing protein creation, Neural networks analysis

Link mentioned: YouTube: no description found


Perplexity AI ▷ #pplx-api (4 messages):

Deep research API, Sonar API performance issues, Model comparison


Interconnects (Nathan Lambert) ▷ #news (156 messages🔥🔥):

PaliGemma 2 Mix Model, AI CUDA Engineer, ALLaM Arabic Model, Helix Robotics Model, Mercor AI Recruiting

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-drama (4 messages):

Grok 3 reasoning, Difference between think and big brain, xAI vs OpenAI capabilities, Confusion over scores

Link mentioned: Tweet from wh (@nrehiew_): If the light blue part is best of N scores, this means that Grok 3 reasoning is inherently an ~o1 level model. This means the capabilities gap between OpenAI and xAI is ~9 months. Also what is the dif...


Interconnects (Nathan Lambert) ▷ #random (69 messages🔥🔥):

Nadella on Dwarkesh, AI competitions, GRPO advancements, Anthropic employee retention, Podcast appearances

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (5 messages):

Useless Machine with AI Agent, AI Research in China and Google, Claude's Situation, AIME 2025 Performance Comparison, Grok's Development

Links mentioned:


Interconnects (Nathan Lambert) ▷ #cv (1 message):

the_real_jrb: https://arxiv.org/abs/2502.13923


Interconnects (Nathan Lambert) ▷ #reads (9 messages🔥):

Open Source AI Critique, Satya Nadella on AI, Microsoft Product Quality, Copilot Development, Microsoft Teams Integration

Links mentioned:


Interconnects (Nathan Lambert) ▷ #posts (1 message):

SnailBot News: <@&1216534966205284433>


OpenRouter (Alex Atallah) ▷ #announcements (2 messages):

Reasoning Tokens Behavior, User Feedback on Token Responses, Proposed Changes to Reasoning Tokens, Poll on Reasoning Token Settings


OpenRouter (Alex Atallah) ▷ #app-showcase (2 messages):

Weaver Chrome Extension, Open Source API Tool

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (209 messages🔥🔥):

OpenRouter API Integration, Gemini Model Issues, DeepSeek Models Performance, API Key Generation, Vision and Reasoning Models

Links mentioned:


Nous Research AI ▷ #general (196 messages🔥🔥):

Grok3 Performance Concerns, Applications of Evolutionary Strategies in Training, Coded Datasets for AI Models, Agents Collaboration and Refinement, Equilibrium Propagation in Neural Networks

Links mentioned:


Nous Research AI ▷ #interesting-links (1 message):

Reinforcement Learning for LLMs, Scaling Supervision

Link mentioned: Tweet from Shashwat Goel (@ShashwatGoel7): I pieced together this first-principles no RL prerequisites explainer on how RL for LLMs works, and why we need it 🧵 The main point? RL is exciting because it allows us to scale supervision. We can now...
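The "scaling supervision" point is easiest to see in a toy policy-gradient loop: a scalar reward (here a hard-coded stand-in for a verifier or reward model) is the only supervision, yet it is enough to steer the policy. This is a minimal sketch, not the linked explainer's method, and it uses the exact policy gradient over a 3-way choice instead of sampled rollouts so the run is deterministic:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.zeros(3)                 # policy over 3 candidate "responses"
reward = np.array([0.0, 1.0, 0.2])   # stand-in for a verifier / reward model

for _ in range(200):
    p = softmax(logits)
    # Exact gradient of expected reward E[r] = p @ reward w.r.t. logits:
    # dE/dz_k = p_k * (r_k - E[r])
    grad = p * (reward - p @ reward)
    logits += 1.0 * grad             # gradient ascent on expected reward

p = softmax(logits)                  # mass concentrates on the highest-reward action
```

In a real RL-for-LLMs setup the gradient is estimated from sampled generations and the reward comes from a learned or programmatic verifier, but the supervision signal is the same single scalar per response.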


Yannick Kilcher ▷ #general (75 messages🔥🔥):

Transformer Backpropagation, Logit vs Probability in Decision Making, Evolutionary Strategies for LLMs, LoRA vs Full Fine-Tuning, Reinforcement Learning for LLMs

Links mentioned:


Yannick Kilcher ▷ #paper-discussion (73 messages🔥🔥):

DeepSeek's Sparse Attention Paper, AGI and Intelligence Models, Conditional Attention Concepts, Differential Transformers


Yannick Kilcher ▷ #ml-news (9 messages🔥):

Perplexity AI and Chinese Censorship, Microsoft Unveils Majorana 1, Topological Qubits Explained, Windows 11 Privacy Updates, Google's PaliGemma 2 Launch

Links mentioned:


GPU MODE ▷ #general (12 messages🔥):

GPU spec spreadsheet, AI CUDA Engineer, Snapdragon GPU computations, GPU architecture resources, Computer architecture books

Links mentioned:


GPU MODE ▷ #triton (1 message):

TMA Descriptor in Triton, Persistent Kernel Implementations, Matrix Multiplication Techniques, FP8 and FP16 Support, Benchmarking Triton with cuBLAS

Link mentioned: Persistent Matmul — Triton documentation: no description found


GPU MODE ▷ #cuda (14 messages🔥):

Raw-Dogged Tensor Proposal, RTX 5080+ Triton Issues, Warp Specialization Kernels, TF32 NT Kernel Inquiry, Custom gmem Offset Math in Device Code

Link mentioned: cutlass/include/cutlass/gemm/collective/sm90_mma_tma_gmma_rs_warpspecialized.hpp at main · NVIDIA/cutlass: CUDA Templates for Linear Algebra Subroutines. Contribute to NVIDIA/cutlass development by creating an account on GitHub.


GPU MODE ▷ #algorithms (1 message):

GRPO algorithm advancements, VRAM reduction techniques, Extended context lengths, Llama 3.1 benchmarking, Gradient checkpointing

Link mentioned: Tweet from Unsloth AI (@UnslothAI): Today, we’re launching new algorithms that enable 10x longer context lengths & 90% less VRAM for training Reasoning Models (GRPO). Using Unsloth, you can now train your own reasoning model with just 5G...
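One standard lever behind VRAM reductions like this is gradient checkpointing (listed among the channel's topics): store only every k-th activation on the forward pass and recompute the rest from the nearest checkpoint when the backward pass needs them, trading compute for memory. A framework-free sketch of the bookkeeping, where the toy `layer` function stands in for a transformer block (this illustrates the general technique, not Unsloth's specific algorithm):

```python
import numpy as np

def layer(x, i):
    # Toy "layer": any deterministic function works for the demo.
    return np.tanh(x + i)

def forward_checkpointed(x0, n_layers, every=4):
    """Run the forward pass but keep only every `every`-th activation;
    the rest can be recomputed from the nearest checkpoint on backward."""
    checkpoints = {0: x0}
    x = x0
    for i in range(n_layers):
        x = layer(x, i)
        if (i + 1) % every == 0:
            checkpoints[i + 1] = x
    return x, checkpoints

def recompute(checkpoints, target, every=4):
    """Recompute the activation after `target` layers from the nearest
    stored checkpoint at or below it."""
    start = (target // every) * every
    x = checkpoints[start]
    for i in range(start, target):
        x = layer(x, i)
    return x

x0 = np.array([0.1, -0.2])
out, ckpts = forward_checkpointed(x0, 16, every=4)
# Only 5 activations are stored (layers 0, 4, 8, 12, 16) instead of 17.
```

With checkpoint spacing k, peak activation memory drops from O(n) to roughly O(n/k + k) at the cost of one extra forward pass per segment during backward.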


GPU MODE ▷ #cool-links (12 messages🔥):

AI CUDA Engineer, Nanotron Blog Post, HadaCore Quantization, CUDA Kernel Optimization, Quantization Techniques

Links mentioned:


GPU MODE ▷ #jobs (2 messages):

Apple ML Research, A5Labs ML Engineer Position

Links mentioned:


GPU MODE ▷ #torchao (9 messages🔥):

torchao issue, HuggingFace error, past_key_values bug, modeling_llama.py fix

Links mentioned:


GPU MODE ▷ #off-topic (1 message):

Together Computer Series Funding


GPU MODE ▷ #irl-meetup (1 message):

kpk1340: Anyone in NYC?


GPU MODE ▷ #rocm (9 messages🔥):

Mi50 Hardware Support, Matmul Operations, GPU Architectures

Link mentioned: 8ANET - AMD 100-506143 Radeon Instinct™ MI50 Accelerator PCIe 4.0 x16 32GB HBM2 4096-bit 3840 Stream Processors Passive Cooling : no description found


GPU MODE ▷ #liger-kernel (3 messages):

Convergence test fix, PR merging process, Native Sparse Attention


GPU MODE ▷ #self-promotion (1 message):

iron_bound: Goat https://m.youtube.com/watch?v=leCY8vCUS4g


GPU MODE ▷ #🍿 (10 messages🔥):

AI CUDA Engineer, CUDA kernel optimization, Rewards and challenges in code generation, Research papers on CUDA, Evolutionary AI approaches

Links mentioned:


GPU MODE ▷ #edge (1 message):

Hybrid Speech Processing Application, NVIDIA Jetson Nano, Speech Separation Model, Cloud LLM Integration


GPU MODE ▷ #reasoning-gym (76 messages🔥🔥):

Reasoning Gym Server, Spatial Reasoning Datasets, Decimal Arithmetic Enhancements, Needle in Haystack Dataset, UnslothAI's New Algorithms

Links mentioned:


Stability.ai (Stable Diffusion) ▷ #general-chat (130 messages🔥🔥):

Comparison of SD and Flux, ControlNet Applications, Custom Model Creation, Using Scribbles for Image Generation, GPU Recommendations for AI Tools

Links mentioned:


Eleuther ▷ #general (6 messages):

GPU scheduler optimization, AI CUDA Engineer, ARENA 4.0 program

Link mentioned: The AI CUDA Engineer 👷: no description found


Eleuther ▷ #research (81 messages🔥🔥):

AI CUDA Engineer, CUDA and PyTorch performance, LLM Optimization, Clockwork RNN, Model training insights

Links mentioned:


Eleuther ▷ #interpretability-general (4 messages):

Logit Lens, Tuned Lens, Transformers Analysis, Computer Security Analogy, Average-case Goals


Eleuther ▷ #lm-thunderdome (8 messages🔥):

lm-eval-harness, runtime benchmarks, model path errors, lm studio, task path errors


Eleuther ▷ #gpt-neox-dev (12 messages🔥):

Evo2 Genome Models, Llama 3.2 Comparison, NCCL_BUFFSIZE Adjustments

Links mentioned:


Notebook LM ▷ #use-cases (12 messages🔥):

Podcast TTS Issues, Inviting Non-Google Users to Notebooks, Tesla Autonomous Driving Patent Insights, Using NotebookLM for Homeschooling, AI's Understanding of Literary Works


Notebook LM ▷ #general (97 messages🔥🔥):

NotebookLM Permissions, Audio Features, Notebook Sharing Issues, Source Limitations, User Experience on NotebookLM

Links mentioned:


Torchtune ▷ #announcements (1 message):

Torchtune Roadmap, PyTorch Roadmaps

Links mentioned:


Torchtune ▷ #general (43 messages🔥):

VRAM requirements with packing, Roadmap updates, Emerging attention techniques, Pruning strategies for LLMs, Exotic transformer architectures

Links mentioned:


Torchtune ▷ #dev (15 messages🔥):

Judge Framework for Online DPO, AdamWScheduleFree as Default Optimizer, Pruning & Checkpointing Utilities, Integration of Torchtune with Gymnasium, Intercode for LLMs

Links mentioned:


Torchtune ▷ #papers (4 messages):

Multi-step PPO, Tool Learning, Reward Shaping, StepTool Framework, UltraScale Playbook

Links mentioned:


Latent Space ▷ #ai-general-chat (49 messages🔥):

Baseten Series C funding, Mastra JS agent framework, Arize AI Series C funding, Lambda $480M Series D, OpenAI's growing user base

Links mentioned:


MCP (Glama) ▷ #general (11 messages🔥):

SSE implementation, Debugging Glama hosted models, Puppeteer installation issues, Docker requirements, Remote MCP feature timeline


MCP (Glama) ▷ #showcase (26 messages🔥):

Dockerized MCP Servers, Sage support for LLM Providers, Glama Integration, MCP Python Interpreter, Roots in MCP Clients

Links mentioned:


Modular (Mojo 🔥) ▷ #general (3 messages):

MAX 25.1 Livestream, Community Meeting Talks, Modular Branded Merchandise

Link mentioned: Modular Community Q&A: no description found


Modular (Mojo 🔥) ▷ #mojo (33 messages🔥):

Native Mojo Windows Support, Slab List Structure Discussion, Comparing Mojo and Python, AI Compute Performance, Low-Level Programming in Mojo

Links mentioned:


LlamaIndex ▷ #blog (2 messages):

LlamaCloud EU, LlamaParse upgrades


LlamaIndex ▷ #general (21 messages🔥):

Agent Workflows in the Loop, Handling Multiple Tool Calls, Redis Parallel Processing Best Practices, LlamaCloud System Outage, Blockchain Developments

Links mentioned:


LlamaIndex ▷ #ai-discussion (1 message):

Next phase of AI, Data operation trends


Cohere ▷ #discussions (2 messages):

Channel Creation Request, Color Change Announcement


Cohere ▷ #cmd-r-bot (3 messages):

Profit-sharing opportunities, Impact of a world without coffee


Cohere ▷ #projects (13 messages🔥):

Identity Sharing in Collaboration, Concerns about Personal Information, Communication Clarity in Forums


AI21 Labs (Jamba) ▷ #general-chat (16 messages🔥):

Jamba API usage, PHP integration with Jamba, Response formatting issues, Removing special characters, Using AJAX for API calls

Links mentioned:


tinygrad (George Hotz) ▷ #general (6 messages):

Model Performance on Different Hardware, Int8 Quantization Issues, Testing Speed in Torch vs Tinygrad, Optimizations with BEAM, New PyTorch Channel


tinygrad (George Hotz) ▷ #learn-tinygrad (4 messages):

Operations in tinygrad, Documentation for BLOCK operations, Codebase search strategies

Link mentioned: tinygrad/tinygrad/codegen/linearize.py at master · tinygrad/tinygrad: You like pytorch? You like micrograd? You love tinygrad! ❤️ - tinygrad/tinygrad


Nomic.ai (GPT4All) ▷ #general (10 messages🔥):

System Message Terminology, Model Instructions, Image Pasting Capability, Nomic Implementation Questions


LLM Agents (Berkeley MOOC) ▷ #mooc-lecture-discussion (4 messages):

2024 LLM Agents Course, Quiz Archive Access, DSPy Interest, Lecture Availability

Links mentioned:


LLM Agents (Berkeley MOOC) ▷ #mooc-readings-discussion (3 messages):

Quiz Access, MOOC Resources

Links mentioned:


DSPy ▷ #show-and-tell (1 message):

HaizeLabs Judge Compute, Qwen/Qwen2.5-VL-7B-Instruct, LLM-AggreFact scores

Link mentioned: LLM-AggreFact_DSPy: GitHub Gist: instantly share code, notes, and snippets.


DSPy ▷ #general (5 messages):

Judge-Time Scaling, Personal Voice Identity Manager, DSPy Conversation History, Message Template Exporting

Link mentioned: Tweet from Leonard Tang (@leonardtang_): First came pre-training scaling; then came inference-time scaling. Now comes judge-time scaling. Despite progress in AI through scaled inference-time compute, AI remains unreliable in open-ended, non-ve...
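In spirit, "judge-time scaling" extends best-of-N selection: spend more compute generating candidates and let a judge score and pick among them. A toy sketch where both the generator and the judge are hypothetical stand-ins (a seeded Gaussian sampler and a distance-to-target score), not anything from the linked work:

```python
import random

def generate(rng):
    # Stand-in for sampling one candidate answer from a model.
    return rng.gauss(0.0, 1.0)

def judge(candidate):
    # Stand-in for a judge model: higher = closer to an ideal answer of 1.0.
    return -abs(candidate - 1.0)

def best_of_n(n, seed=0):
    """Sample n candidates and return the one the judge scores highest."""
    rng = random.Random(seed)
    candidates = [generate(rng) for _ in range(n)]
    return max(candidates, key=judge)
```

With a fixed seed the n=1 candidate list is a prefix of the n=64 list, so more judge-time compute can only improve (never hurt) the selected score; the open question the tweet gestures at is how reliable the judge itself is on open-ended tasks.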



{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}