Frozen AI News archive

not much happened today

**Claude 3.7 Sonnet** demonstrates exceptional coding and reasoning capabilities, outperforming models like **DeepSeek R1**, **o3-mini**, and **GPT-4o** on benchmarks such as **SciCode** and **LiveCodeBench**. It is available on platforms including **Perplexity Pro**, **Anthropic**, **Amazon Bedrock**, and **Google Cloud**, with pricing at **$3/$15 per million input/output tokens**. Key features include a **64k token thinking mode**, a **200k context window**, and the **CLI-based coding assistant Claude Code**. Meanwhile, **DeepSeek** released **DeepEP**, an open-source communication library optimized for MoE model training and inference, with support for **NVLink**, **RDMA**, and **FP8**. These updates highlight advancements in coding AI and efficient model training infrastructure.
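For a rough sense of what the quoted pricing means per request, here is a minimal sketch. It assumes the "$3/$15" figure is $3 per million input tokens and $15 per million output tokens; the helper name `estimate_cost_usd` and the token counts are hypothetical, chosen only for illustration.

```python
# Assumption: "$3/$15 per million tokens" is read as
# $3 per million input tokens and $15 per million output tokens.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: estimated cost in USD for one request."""
    total = (input_tokens * INPUT_USD_PER_MTOK
             + output_tokens * OUTPUT_USD_PER_MTOK)
    return total / 1_000_000


# Illustrative request: 100k tokens in, 10k tokens out.
cost = estimate_cost_usd(100_000, 10_000)
print(f"${cost:.2f}")
```

Actual bills depend on the provider and on how thinking tokens are counted, so treat this as back-of-the-envelope arithmetic only.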


AI News for 2/24/2025-2/25/2025. We checked 7 subreddits, 433 Twitters and 29 Discords (220 channels, and 5949 messages) for you. Estimated reading time saved (at 200wpm): 503 minutes. You can now tag @smol_ai for AINews discussions!

You should follow DeepSeek's #OpenSourceWeek, but the releases so far have not met our bar for headline story status.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

Claude 3.7 Sonnet Release and Performance

DeepSeek and Qwen Model Updates and Open Source Releases

Video and Multimodal Model Developments

Tools, Libraries and Datasets

Research and Analysis

AI Industry and Market Trends

Memes and Humor


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. DeepSeek's DeepEP: Enhanced MoE GPU Communication

Theme 2. Sonnet 3.7 Dominates Benchmark Testing

Theme 3. Alibaba's Wan 2.1 Video Model Open-Source Release Scheduled

Theme 4. Gemma 3 27b Release: A New Contender in AI Models

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding

Theme 1. WAN 2.1 Released and Open Source with New Features

Theme 2. Claude 3.7 Model: Enhanced Capabilities and Accessibility

Theme 3. Claude Sonnet 3.7 Reigns Supreme: New top model in LLM benchmark

Theme 4. Advanced Voice Features and Deep Research in GPT-4o Updates


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.0 Flash Thinking

Theme 1. Claude 3.7 Sonnet Storms the AI Scene

Theme 2. DeepSeek's Deep Dive into Model Efficiency

Theme 3. Open Source Tooling and Ecosystem Growth

Theme 4. Benchmarking Battles: Models Face Real-World Tests

Theme 5. Hardware Horizons: From Brains to Silicon


PART 1: High level Discord summaries

Cursor IDE Discord


aider (Paul Gauthier) Discord


Codeium (Windsurf) Discord


OpenAI Discord


Unsloth AI (Daniel Han) Discord


OpenRouter (Alex Atallah) Discord


Interconnects (Nathan Lambert) Discord


Eleuther Discord


Nous Research AI Discord


MCP (Glama) Discord


LM Studio Discord


Stability.ai (Stable Diffusion) Discord


Modular (Mojo 🔥) Discord


Notebook LM Discord


LlamaIndex Discord


Torchtune Discord


Cohere Discord


DSPy Discord


The tinygrad (George Hotz) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Cursor IDE ▷ #general (1056 messages🔥🔥🔥):

Claude 3.7 Sonnet release, Cursor IDE integration, MCPs with Claude, Comparisons of Claude 3.7 with other models (GPT-4, O3), Troubleshooting Cursor and Claude 3.7



aider (Paul Gauthier) ▷ #general (935 messages🔥🔥🔥):

Claude 3.7, Aider Benchmarks, Claude Code, Thinking Models, OpenAI vs. Anthropic



aider (Paul Gauthier) ▷ #questions-and-tips (63 messages🔥🔥):

Architect mode configuration, O3-mini access, OpenRouter benefits, Aider Compact Command, Claude 3.7 in Aider



aider (Paul Gauthier) ▷ #links (2 messages):

Hacker News Wrapped, Kagi LLM Benchmarking Project, Claude Sonnet 3.7



Codeium (Windsurf) ▷ #discussion (15 messages🔥):

Codeium chat in Vim, Codeium Discussion channel purpose, Codeium 3.7 release


Codeium (Windsurf) ▷ #windsurf (675 messages🔥🔥🔥):

Cascade UI error, Claude 3.7 Sonnet, Model comparison, DeepSeek hallucination, Windsurf Dev Comms



OpenAI ▷ #ai-discussions (611 messages🔥🔥🔥):

Grok 3, Perplexity Comet agentic browser, Claude 3.7 Sonnet, Claude Code, GPT-4.5 release



OpenAI ▷ #gpt-4-discussions (9 messages🔥):

O3 issues, Screenshot posting on Discord, Bug reporting


Unsloth AI (Daniel Han) ▷ #general (345 messages🔥🔥):

paid moderators, CUDA errors, Qwen2.5 VL 72B, Claude 3.7, DeepSeek MLA



Unsloth AI (Daniel Han) ▷ #off-topic (1 message):

deoxykev: New qwq https://qwenlm.github.io/blog/qwq-max-preview/


Unsloth AI (Daniel Han) ▷ #help (121 messages🔥🔥):

Unsloth on Mac, GRPO Qwen notebook issue, CUDA Out of Memory, ShareGPT Dataset format, Forcing Unload From VRAM



OpenRouter (Alex Atallah) ▷ #announcements (1 message):

Claude 3.7 Sonnet, Extended Thinking, Pricing and Availability

Link mentioned: Claude 3.7 Sonnet - API, Providers, Stats: Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. Run Claude 3.7 Sonnet with API


OpenRouter (Alex Atallah) ▷ #general (346 messages🔥🔥):

Claude 3.7 Sonnet, GCP hosting Claude 3.7 Sonnet, OpenRouter rate limits, Claude 3.5 Haiku with vision, TPUs vs GPUs for inference



Interconnects (Nathan Lambert) ▷ #news (304 messages🔥🔥):

Meta AI Expansion, Claude 3.7 Sonnet Release, Claude Code Tool, Qwen Chat Release, DeepEP



Interconnects (Nathan Lambert) ▷ #random (15 messages🔥):

Berkeley Advanced Agents MOOC, Tulu 3, RLHF Explanation, AI Startups customer base, mic firmware issues



Interconnects (Nathan Lambert) ▷ #memes (1 message):

Memes


Interconnects (Nathan Lambert) ▷ #nlp (1 message):

0x_paws: https://x.com/srush_nlp/status/1894039989526155341?s=46&t=Y6KMaD0vAihdhw7S8bL5WQ


Interconnects (Nathan Lambert) ▷ #posts (3 messages):

GIF Posts, SnailBot Tagging

Link mentioned: New New Post GIF - New New post Post - Discover & Share GIFs: Click to view the GIF


Eleuther ▷ #general (37 messages🔥):

Brain Parallelism vs GPU, LLM Scaling, Proxy Structuring Engine

Link mentioned: The Proxy Structuring Engine: High Quality Structured Outputs at Inference Time


Eleuther ▷ #research (32 messages🔥):

Wavelet Image Coding, Walsh Functions, Multi-head Latent Attention (MLA), Native Sparse Attention (NSA), Looped/Recurrent Architectures



Eleuther ▷ #interpretability-general (9 messages🔥):

Attention Maps vs. Neuron-Based Methods, Intervening on Attention Maps, Syntax Emerging from Attention Maps


Eleuther ▷ #gpt-neox-dev (10 messages🔥):

Mixed Precision Training, BF16 Training, ZeRO Offload, Optimizer States Precision, DeepSeek Adam Moments

Link mentioned: Megatron-LM/megatron/core/optimizer/optimizer_config.py at main · NVIDIA/Megatron-LM: Ongoing research training transformer models at scale - NVIDIA/Megatron-LM


Nous Research AI ▷ #general (68 messages🔥🔥):

Tool use in LLMs, Claude 3.7 Sonnet, QwQ-Max-Preview, AI alignment



Nous Research AI ▷ #ask-about-llms (6 messages):

Sonnet-3.7, Misguided Attention Eval, Overfitting

Link mentioned: GitHub - cpldcpu/MisguidedAttention: A collection of prompts to challenge the reasoning abilities of large language models in presence of misguiding information


Nous Research AI ▷ #interesting-links (4 messages):

Qwen AI, Video Generation

Link mentioned: Qwen Chat


MCP (Glama) ▷ #general (62 messages🔥🔥):

Anthropic MCP Registry API, Claude 3.7, Haiku Tool Support, Claude Code (CC), MCP Server Recommendations



MCP (Glama) ▷ #showcase (11 messages🔥):

MetaMCP Licensing, AGPL Licensing, Enact Protocol MCP Server, Claude 3.7 Sonnet on Sage



LM Studio ▷ #general (41 messages🔥):

LM Studio WordPress Plugins Integration, Qwen 2.5 VL GGUF, QuantBench on GitHub, Speculative Decoding in LM Studio, DeepSeek R1 671b RAM requirements



LM Studio ▷ #hardware-discussion (20 messages🔥):

A770 GPU performance, M2 Max vs M4 Max Power Consumption, AIO Pump USB Header Interference


Stability.ai (Stable Diffusion) ▷ #announcements (1 message):

Feature Request Board, Discord feedback, Prioritization of Features


Stability.ai (Stable Diffusion) ▷ #general-chat (52 messages🔥):

SD3 Ultra details request, Stability updates, Dog breed image datasets, Image generation times, Image resolutions


Modular (Mojo 🔥) ▷ #mojo (11 messages🔥):

Mojo FFI, static lib, GLFW, GLEW, Sudoku example



Modular (Mojo 🔥) ▷ #max (20 messages🔥):

Hardware Accelerated Conway's Game of Life, MAX and Pygame Integration, GPU Utilization in MAX, SIMD Implementation by Daniel Lemire, Conway's Game of Life Computer

Link mentioned: Nicolas Loizeau - GOL computer: A new (and better) version of the GOL computer is available here : https://github.com/nicolasloizeau/scalable-gol-computer


Notebook LM ▷ #use-cases (2 messages):

Ease of Use, Short Prompts


Notebook LM ▷ #general (14 messages🔥):

Gemini, NotebookLM, PDF Conversions, Language prompts, Savin/Ricoh Copier


LlamaIndex ▷ #blog (3 messages):

LlamaIndex AI Assistant, ComposIO HQ, AnthropicAI Claude Sonnet 3.7


LlamaIndex ▷ #general (5 messages):

Fusion Rerank Retriever with Elasticsearch, MultiModalVectorStoreIndex and GCSReader issue


Torchtune ▷ #dev (6 messages):

Left Truncation vs Right Truncation, StatefulDataLoader PR

Link mentioned: Add support for StatefulDataLoader by joecummings · Pull Request #2410 · pytorch/torchtune: Context: What is the purpose of this PR? Is it to add a new feature, fix a bug, update tests and/or documentation, or other (please add here)? This PR adds support for the StatefulDataLoader class fr...


Torchtune ▷ #papers (2 messages):

DeepScaleR, Reinforcement Learning, DeepEP library, MoE



Cohere ▷ #cmd-r-bot (5 messages):

DeSci Validators, Profitability Thresholds, Asset Value Expert Account


DSPy ▷ #general (2 messages):

DSPy Assertion Migration, BestOfN Module, Refine Module, Reward Functions





{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}