Frozen AI News archive

not much happened today

**AI21 Labs launched Jamba 1.6**, touted as the **best open model for private enterprise deployment**, outperforming **Cohere, Mistral, and Llama** on benchmarks like **Arena Hard**. **Mistral AI** released a state-of-the-art **multimodal OCR model** with multilingual and structured output capabilities, available for on-prem deployment. **Alibaba Qwen** introduced **QwQ-32B**, an open-weight reasoning model with **32B parameters** and cost-effective usage, showing competitive benchmark scores. **OpenAI** released **o1** and **o3-mini** models with advanced API features including streaming and function calling. **AMD** unveiled **Instella**, open-source 3B parameter language models trained on **AMD Instinct MI300X GPUs**, competing with **Llama-3.2-3B** and others. **Alibaba** also released **Babel**, open multilingual LLMs performing comparably to **GPT-4o**. **Anthropic** launched **Claude 3.7 Sonnet**, enhancing reasoning and prompt engineering capabilities.


AI News for 3/6/2025-3/7/2025. We checked 7 subreddits, 433 Twitters and 29 Discords (227 channels, and 7886 messages) for you. Estimated reading time saved (at 200wpm): 777 minutes. You can now tag @smol_ai for AINews discussions!

Mistral OCR and Jamba 1.6 came close.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

Model Releases and Updates

Tools and Applications

Research and Concepts

Industry and Business

Opinions and Discussions

Memes/Humor


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. M3 Ultra as a Competitive AI Workstation

Theme 2. Hunyuan Image-to-Video Release: GPU Heavy, Performance Debates

Theme 3. QwQ-32B: Efficient Reasoning vs. R1's Verbose Accuracy

Theme 4. Jamba 1.6: New Architecture Outperforms Rivals

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding

Theme 1. InternLM2.5: Benchmarking 100% Recall at 1M Context

Theme 2. HunyuanVideo-I2V Launch and User Comparisons with Wan

Theme 3. LTX Video 0.9.5 Model: Exploring New Video Generation Capabilities

Theme 4. ChatGPT Model Enhancements: Memory and Conversational Improvements


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.0 Flash Thinking

Theme 1. QwQ-32B Model: Alibaba's Reasoning Rival Makes Waves

Theme 2. Windsurf Wave 4: Codeium's Update Triggers User Tempest

Theme 3. Mac Studio Mania: Apple's Silicon Sparks AI Dreams (and Debate)

Theme 4. Agentic AI: OpenAI's Pricey Plans and Open Standards Emerge


PART 1: High level Discord summaries

Cursor IDE Discord


OpenAI Discord


Codeium (Windsurf) Discord


Unsloth AI (Daniel Han) Discord


aider (Paul Gauthier) Discord


LM Studio Discord


Perplexity AI Discord


Interconnects (Nathan Lambert) Discord


GPU MODE Discord


Modular (Mojo 🔥) Discord


Nomic.ai (GPT4All) Discord


HuggingFace Discord


Nous Research AI Discord


Stability.ai (Stable Diffusion) Discord


OpenRouter (Alex Atallah) Discord


Notebook LM Discord


Latent Space Discord


DSPy Discord


LlamaIndex Discord


Yannic Kilcher Discord


Eleuther Discord


Cohere Discord


tinygrad (George Hotz) Discord


Torchtune Discord


LLM Agents (Berkeley MOOC) Discord


Gorilla LLM (Berkeley Function Calling) Discord


AI21 Labs (Jamba) Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Cursor IDE ▷ #general (1303 messages🔥🔥🔥):

Sonnet 3.7, Qwen, Windsurf IDE, MCP Client Closed, OpenRouter API

Links mentioned:


OpenAI ▷ #annnouncements (4 messages):

AGI development, OpenAI o1 and o3-mini, ChatGPT for macOS, AI safety and alignment


OpenAI ▷ #ai-discussions (822 messages🔥🔥🔥):

Grok3 vs Claude, DeepSeek, Atom of Thought, Microsoft Phi models, Apple unified memory

Links mentioned:


OpenAI ▷ #gpt-4-discussions (24 messages🔥):

GPT-4.5 Availability and Limitations, GPT-4.5 vs GPT-4o Performance, GPT-4.5 Prompting Strategies, GPT-4.5 Personalization Prompt, GPT-4.5 Mobile Issues


OpenAI ▷ #prompt-engineering (13 messages🔥):

Prompt Engineering Techniques Ontology, Sora AI Video Character Consistency, GPT-4o OCR Bounding Box Issues


OpenAI ▷ #api-discussions (13 messages🔥):

Prompt Engineering Survey, Sora AI videos character consistency, GPT-4o OCR results


Codeium (Windsurf) ▷ #announcements (2 messages):

Windsurf Wave 4, Cascade updates, Windsurf Previews, Auto-Linter in Cascade, MCP Server updates

Links mentioned:


Codeium (Windsurf) ▷ #discussion (45 messages🔥):

VS Code Commit Message Generation Issue, Flutterflow Usage, Codeium Uninstall, Codeium Language Server Download Issues, Telemetry Data in Codeium Chat Feature

Link mentioned: FAQ | Windsurf Editor and Codeium extensions: Find answers to common questions.


Codeium (Windsurf) ▷ #windsurf (609 messages🔥🔥🔥):

Windsurf Stability Issues, Credit Usage Concerns, Rollback Feature Request, 3.7 Performance Woes, Global Rules and .windsurfrules

Links mentioned:


Unsloth AI (Daniel Han) ▷ #general (424 messages🔥🔥🔥):

Phi-4-mini models support, Overfitting models on benchmarks, DeepSeek R1, Inventing random benchmarks, Flex Attention support

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (131 messages🔥🔥):

Qwen7b Memory Consumption, GRPO Success, TinyZero Replication, Llama 3.1, Hyperparameter Tuning

Links mentioned:


Unsloth AI (Daniel Han) ▷ #help (173 messages🔥🔥):

Deepseek distillation, Unsloth Windows support, Multi-GPU support, Qwen Coder, GRPO Training Problems

Links mentioned:


Unsloth AI (Daniel Han) ▷ #research (40 messages🔥):

Qwen-32B Release, RL Scaling for Medium-Sized Models, Cognitive Behaviors and LM Self-Improvement, Lossless Compression and Intelligence, AI21 Jamba for RAG

Links mentioned:


aider (Paul Gauthier) ▷ #announcements (1 message):

Aider Product Hunt Launch

Link mentioned: Aider - AI Pair Programming in Your Terminal | Product Hunt: Aider is the AI pair programmer that edits code in your local git repo via the terminal. Works with your editor, any LLM (Claude 3.5 Sonnet, DeepSeek R1, GPT-4o, local models), and many languages.


aider (Paul Gauthier) ▷ #general (377 messages🔥🔥):

Grok 3, QwQ-32B, Mac Studio, OpenRouter throughput, Aider on Product Hunt

Links mentioned:


aider (Paul Gauthier) ▷ #questions-and-tips (146 messages🔥🔥):

Aider connects to OpenWebUI, Litellm patches, DeepSeek token output, Aider commit messages, Trailing Whitespace

Links mentioned:


LM Studio ▷ #general (135 messages🔥🔥):

VRAM overflow, Phi-4 support, KV cache, New Mac Studio, Sesame TTS

Links mentioned:


LM Studio ▷ #hardware-discussion (309 messages🔥🔥):

Mac Studio M3 Ultra & M4 Max, AMD RX 9070 XT vs Nvidia RTX 5070 Ti, DeepSeek V2.5 236b, SGI machines, NVIDIA's RTX 5090 recall

Links mentioned:


Perplexity AI ▷ #announcements (1 message):

AI model settings, Claude 3.7 Sonnet


Perplexity AI ▷ #general (322 messages🔥🔥):

Auto model selection, Image source issue, Bulk text modification, Qwen Max model, Attached files in threads

Links mentioned:


Perplexity AI ▷ #sharing (11 messages🔥):

AI Health Assistant Debut, Nauru sells citizenship, Anthropic Valuation, Meme coins, Early Universe

Link mentioned: YouTube: no description found


Perplexity AI ▷ #pplx-api (4 messages):

API Focus, Sonar Pro Model, Search Cost Pricing, Real Time Web Data


Interconnects (Nathan Lambert) ▷ #news (138 messages🔥🔥):

Richard Sutton talk on Dynamic Deep Learning, OpenAI Agent Pricing, Custom Claude Code with Flash, LLMs for deobfuscation, Boston Dynamics vs Unitree

Links mentioned:


Interconnects (Nathan Lambert) ▷ #random (41 messages🔥):

LLMs play Diplomacy, Released model fathoms, Happy Birthday, Post training as a service, 14B img2vid model

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (5 messages):

Chinese Lewd R1 Aha Moment, DeepSeek Videos on bilibili Comments, Reinforcement Learning History by Schmidhuber

Links mentioned:


Interconnects (Nathan Lambert) ▷ #rl (6 messages):

Schmidhuber, Deep RL, Richard Sutton Turing Award

Links mentioned:


Interconnects (Nathan Lambert) ▷ #reads (9 messages🔥):

Reinforcement Learning, Scientific AI, LLMs, Pre-training

Links mentioned:


Interconnects (Nathan Lambert) ▷ #lectures-and-projects (10 messages🔥):

RLHF Book, Lecture series videos


Interconnects (Nathan Lambert) ▷ #posts (9 messages🔥):

Stargate Project, Content Gating, Data Protection


Interconnects (Nathan Lambert) ▷ #policy (32 messages🔥):

Anthropic's Recommendations, Nationalizing Labs, DeepSeek Exports, China AMD GPUs, H20 Controls


GPU MODE ▷ #general (67 messages🔥🔥):

Touhou games and AI, RL gyms Starcraft gym and Minetest gym, Unified memory discussion, Thunderbolt 5 for distributed inference/training, Deepseek-R1


GPU MODE ▷ #triton (9 messages🔥):

tl.gather in Triton, PagedAttention in Triton, Bias addition optimization, Input size and performance issues, NVIDIA's Cooperative Vector

Link mentioned: Cannot call tl.gather · Issue #5826 · triton-lang/triton: Describe the bug When I run the following code I get an exception: AttributeError: module 'triton.language' has no attribute 'gather' import triton.language as tl tl.gather I've in...


GPU MODE ▷ #cuda (19 messages🔥):

CUDA compiler optimization, CUDA OpenGL interop segfault, Overlapping kernel execution with NCCL, Memory transaction size, GTC talk on maximizing memory bandwidth

Link mentioned: NVIDIA #GTC2025 Conference Session Catalog: Experience GTC 2025 In-Person and Online March 17-21, San Jose.


GPU MODE ▷ #torch (32 messages🔥):

Torch C++ Interface, Extending OffloadPolicy, use_reentrant in Activation Checkpointing, TorchBind API, Model-Based RL subgames

Links mentioned:


GPU MODE ▷ #cool-links (6 messages):

ThunderMLA, DeepSeek MLA, Modular's Democratizing AI Compute, CUDA Alternatives

Links mentioned:


GPU MODE ▷ #beginner (2 messages):

Triton tl.sort() problem, Flask API Authentication


GPU MODE ▷ #off-topic (14 messages🔥):

SSH pain points, Better GPU providers, Nitrokey, SoloKey, Yubikey


GPU MODE ▷ #irl-meetup (3 messages):

Tenstorrent, LlamaIndex, Koyeb, AI Infrastructure Meetup, Next-Gen hardware

Link mentioned: Next-Gen AI Infra with Tenstorrent & Koyeb @LlamaIndex · Luma: Join us for a special evening as we kick off a groundbreaking collaboration between Tenstorrent and Koyeb with our friends from LlamaIndex. This meetup is a…


GPU MODE ▷ #triton-puzzles (2 messages):

Reshaping vs Permuting, Triton Kernel Permutation, FPINT Dimension Right Shift


GPU MODE ▷ #rocm (7 messages):

Radeon GPU Profiler, ROCm on Linux, rocprofilerv2 ATT plugin, rocclr, PAL Backend


GPU MODE ▷ #tilelang (17 messages🔥):

Shared Memory Allocation, Python Linting Warnings, CUDA Compatibility Issues (12.1 vs 12.4/12.6), Github Issue #149

Link mentioned: Mismatched elements when performing matmul on CUDA 12.4/12.6 · Issue #149 · tile-ai/tilelang: Describe the Bug I ran the simple matmul code below, and I got error AssertionError: Tensor-likes are not close! The code works fine on CUDA 12.1, but not on CUDA 12.4/12.6. The number of mismatche...


GPU MODE ▷ #metal (1 message):

M3 Ultra, Unified Memory, Creative Uses of Unified Memory


GPU MODE ▷ #reasoning-gym (20 messages🔥):

ARC-AGI Competition, QwQ-32B Release, Reasoning-Gym Datasets, LADDER Framework

Links mentioned:


GPU MODE ▷ #gpu模式 (6 messages):

AI directions for programmers, CUDA WeChat groups, TileLang DSL, Triton sequence padding

Link mentioned: GitHub - tile-ai/tilelang: Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels


GPU MODE ▷ #submissions (20 messages🔥):

Modal Runners, GPU Leaderboards, Submission Errors


GPU MODE ▷ #status (1 message):

Timeout durations


Modular (Mojo 🔥) ▷ #general (217 messages🔥🔥):

Mojo's dynamism, Mutating classes in Mojo, Python vs Mojo, Async Django drawbacks

Links mentioned:


Modular (Mojo 🔥) ▷ #mojo (5 messages):

Mojo/Python project benchmarking, Mojo/Python project folder structure, Python.add_to_path alternatives, Symlink alternatives in Mojo tests

Link mentioned: Mojo/Python project folder structure: I originally posted this on Discord (link), but @DarinSimmons felt it would make a good topic for this forum. I’m looking for guidance on folder organization for a significant Mojo/Python project. I’...


Modular (Mojo 🔥) ▷ #max (2 messages):

Modular website, Broken anchor links


Nomic.ai (GPT4All) ▷ #general (217 messages🔥🔥):

MiniCheck-Flan-T5-Large fact-checking model, Qwen 32B model and GGUF quantizations, Local AI and GPT4All limitations, Persisting user data for AI agents

Links mentioned:


HuggingFace ▷ #general (66 messages🔥🔥):

Local model execution, Hugging Face Pro Plan, Object Detection, Multi-GPU setup, Fraud detection

Links mentioned:


HuggingFace ▷ #today-im-learning (7 messages):

Kornia Rust library internships, LLM Guardrails benchmarking, Spikee framework

Link mentioned: Google Summer of Code: Google Summer of Code is a global program focused on bringing more developers into open source software development.


HuggingFace ▷ #cool-finds (4 messages):

Flash Attention, SAT dataset, Q-Filters for KV Cache compression

Links mentioned:


HuggingFace ▷ #i-made-this (7 messages):

VisionKit, Deepseek-r1, TS-Agents, FastRTC, diRAGnosis

Links mentioned:


HuggingFace ▷ #computer-vision (2 messages):

DINOv2 fine-tuning, Weakly labeled images, Pose estimation, Hugging Face Computer Vision Hangout

Links mentioned:


HuggingFace ▷ #NLP (2 messages):

Decoder Masking Mechanisms, Inference in Decoder-Only Models, Attention Mechanisms


HuggingFace ▷ #smol-course (5 messages):

Reasoning Course, hf ecosystem, fine-tuning, telemetry, langfuse


HuggingFace ▷ #agents-course (109 messages🔥🔥):

Agentic AI vs AI Agents, SmolAgents with local LLM, HuggingFace inference API rate limits, HuggingFace course certificates, OpenRouter Free Model

Links mentioned:


Nous Research AI ▷ #general (175 messages🔥🔥):

Gaslight Benchmark, GPT-4.5 image generation, Video AI Prompting, Hermes Special Tokens, Alibaba QwQ 32b Model vs DeepSeek R1

Links mentioned:


Nous Research AI ▷ #ask-about-llms (2 messages):



Nous Research AI ▷ #interesting-links (4 messages):

QwQ-32B, DeepSeek R1, Reinforcement Learning Scaling, Tool Calling Syntax, Hermes Format

Link mentioned: QwQ-32B: Embracing the Power of Reinforcement Learning: Scaling Reinforcement Learning (RL) has the potential to enhance model performance beyond conventional pretraining and post-training methods. Recent studi...


Stability.ai (Stable Diffusion) ▷ #general-chat (135 messages🔥🔥):

Hand Fixing in SDXL, Free Video Creation from a Photo, Local vs SORA Video Generation Costs, SD3.5 Large TurboX Release, Running Stable Diffusion on GPU vs CPU

Links mentioned:


OpenRouter (Alex Atallah) ▷ #announcements (2 messages):

QwQ 32B Model, Reasoning Update, OAuth User ID, GitHub Authentication, OpenAI Provider Downtime

Link mentioned: Discord: no description found


OpenRouter (Alex Atallah) ▷ #app-showcase (1 message):

Android Chat App, Customizable LLMs, OpenRouter Integration, Speech To Text, Text To Image

Link mentioned: Releases · Ayuilos/Taiga: Taiga is an open-source mobile AI chat app that supports customizing LLM providers.


OpenRouter (Alex Atallah) ▷ #general (112 messages🔥🔥):

OpenRouter API issues, Deepseek instruct format, Mistral OCR launch, Usage based charging app, Default prompt feature

Links mentioned:


Notebook LM ▷ #announcements (1 message):

User Research, Gift Cards, NotebookLM

Link mentioned: Register your interest: NotebookLM feedback: Hello, we are looking for feedback on NotebookLM via a 15 min or 60 minute remote interview. This feedback will help the Google team improve NotebookLM for future enhancements. To apply to participate, ...


Notebook LM ▷ #use-cases (17 messages🔥):

Gemini struggles, NotebookLM PDF support, NotebookLM API, NotebookLM online games, NotebookLM documentation


Notebook LM ▷ #general (67 messages🔥🔥):

Android App, Response Lengths, Formulas on NotebookLM, File Upload Issues, Exporting Notes as PDF

Links mentioned:


Latent Space ▷ #ai-general-chat (77 messages🔥🔥):

Claude Cost, New MacBook Air, Qwen 32B, React for Agents, Nicholas Carlini joins Anthropic

Links mentioned:


DSPy ▷ #show-and-tell (41 messages🔥):

Synalinks framework, Async optimization, Constrained structured output, Functional API, Graph-based RAG

Links mentioned:


DSPy ▷ #general (22 messages🔥):

DSPy optimization for intent classification, Comparing Texts for Contradictions using DSPy, DSPy's Adapter system for structured outputs, Straggler threads fix in DSPy, Variable output fields in dspy.Signature


LlamaIndex ▷ #blog (2 messages):

Agentic Document Workflows, Interoperable Agent Standards

Link mentioned: Outshift | Building the Internet of Agents: Introducing AGNTCY.org: Learn about the latest tech innovations and engage in thought leadership news from Cisco.


LlamaIndex ▷ #general (58 messages🔥🔥):

LlamaIndex ImageBlock Issues with OpenAI, Query Fusion Retriever Citation Issues, Distributed AgentWorkflow Architecture, Profiling/Timing of Agent Execution, Memory Consumption with Flask and Gunicorn

Links mentioned:


Yannic Kilcher ▷ #general (21 messages🔥):

Bilevel Optimization, Sparsemax, Model Checkpoints with DDP, Compositmax

Link mentioned: Tweet from Jürgen Schmidhuber (@SchmidhuberAI): Congratulations to @RichardSSutton and Andy Barto on their Turing award!


Yannic Kilcher ▷ #paper-discussion (9 messages🔥):

Proactive T2I Agents, User Prompt Underspecification, Belief Graph Editing, Bash Shell Puns

Links mentioned:


Yannic Kilcher ▷ #ml-news (14 messages🔥):

AMD FSR 4 vs DLSS, Alibaba Qwen QwQ-32B Model, Cortical Labs Biological Computer, Neuron Cocaine/LSD experiments

Links mentioned:


Eleuther ▷ #general (2 messages):

Introductions, AI Biohacking, Machine Unlearning


Eleuther ▷ #research (4 messages):

ARC Training, Lossless Compression, Relative Entropy Coding (REC), Encoder-Free Sample Dependent VAE

Links mentioned:


Eleuther ▷ #scaling-laws (16 messages🔥):

Pythia Loss Curves, Kaplan-style loss vs compute convex hull plot, FLOPs PPO uses per token


Eleuther ▷ #interpretability-general (9 messages🔥):

Intermediate layer outputs to vocab space, Tuned Lens vs Logit Lens

Link mentioned: Eliciting Latent Predictions from Transformers with the Tuned Lens: We analyze transformers from the perspective of iterative inference, seeking to understand how model predictions are refined layer by layer. To do so, we train an affine probe for each block in a froz...


Eleuther ▷ #lm-thunderdome (11 messages🔥):

lm-eval AIME support, ARC-Challenge tasks, discrepancy in the scores, vllm's implementation


Cohere ▷ #「💬」general (23 messages🔥):

Enterprise Deployment, B2B Lead Times, Community Feedback


Cohere ▷ #【📣】announcements (1 message):

Aya Vision, Multilingual AI, Multimodal Models, Open-Weights Model, AyaVisionBenchmark

Links mentioned:


Cohere ▷ #「🔌」api-discussions (1 message):

Cohere Reranker v3.5 Latency


Cohere ▷ #「💡」projects (1 message):

Mindmap Generation, Pretrained Models, Mathematical Models


Cohere ▷ #「🤝」introductions (2 messages):

Introductions, Sales Contact


tinygrad (George Hotz) ▷ #general (9 messages🔥):

ShapeTracker merging proof, 96GB 4090 on Taobao, Rust CubeCL

Links mentioned:


tinygrad (George Hotz) ▷ #learn-tinygrad (3 messages):

RANGE Op, iGPU detection on Linux


Torchtune ▷ #general (7 messages):

HF Checkpoints, special_tokens.json, TorchTune Checkpointer, Github Stars

Links mentioned:


Torchtune ▷ #dev (3 messages):

GRPO recipe, Memory issues, Excessive torch.cuda.empty_cache()


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (4 messages):

Berkeley vs MOOC lectures, Certificate Declaration Forms


Gorilla LLM (Berkeley Function Calling) ▷ #leaderboard (2 messages):

AST Metric Definition, V1 Dataset Construction


Gorilla LLM (Berkeley Function Calling) ▷ #discussion (1 message):

Gemini 2, GPT o3-high, Deepseek R1, Prompt tool calling, Python tool


AI21 Labs (Jamba) ▷ #general-chat (1 message):

Jamba 1.6 launch, Open model for enterprise deployment, Jamba 1.6 performance benchmarks, Hybrid SSM-Transformer architecture, 256K context window

Links mentioned:


{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AINews, please share it with a friend! Thanks in advance!

{% endif %}