Frozen AI News archive

not much happened today

**OpenAI** announced that the **o3** and **o4-mini** models will be released soon, with **GPT-5** expected in a few months after a delay for quality improvements and capacity planning. **DeepSeek** introduced **Self-Principled Critique Tuning (SPCT)** to improve inference-time scalability for generalist reward models. **Anthropic's Sonnet 3.7** remains a top coding model. **Google's Gemma 3** is available on KerasHub, and **Qwen 2.5 VL** powers a new Apache 2.0 licensed OCR model. **Gemini 2.5 Pro** entered public preview with higher rate limits and announced pricing, becoming a preferred model for many tasks except image generation. Meta's architectural advantage and the **FrontierMath benchmark** challenge AI's long-form reasoning and worldview development. Research shows that LLMs concentrate attention on the first token as an "attention sink," which preserves representation diversity; the effect is demonstrated in **Gemma 7B** and **Llama 3.1** models. **MegaScale-Infer** offers efficient serving of large-scale Mixture-of-Experts models with up to **1.90x higher per-GPU throughput**.

AI News for 4/3/2025-4/4/2025. We checked 7 subreddits, 433 Twitters and 30 Discords (230 channels, and 7491 messages) for you. Estimated reading time saved (at 200wpm): 629 minutes. You can now tag @smol_ai for AINews discussions!

It's been a quiet week, so why not fill out the AI Engineer World's Fair Call For Speakers?


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

Model Releases and Announcements

Gemini 2.5 Pro

AI Model Capabilities and Benchmarks

AI Applications and Tools

Langchain and Graph Updates

Other

Humor and Memes


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. "Advancements in Generalist Reward Models Unveiled"

Theme 2. "Building High-Performance GPU Servers on a Budget"

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding

Theme 1. "Advancements in Long Context AI Models"

Theme 2. "Unlocking AI Innovations: Art, Animation, and Pricing"

Theme 3. "Unlocking AI: Models, Hardware, and Hilarious Pranks"


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.5 Pro Exp

Theme 1: Model Mania - Releases, Rankings, and Reasoning

Theme 2: Fine-Tuning Frustrations & Hardware Hurdles

Theme 3: Tooling Triumphs & Workflow Wonders

Theme 4: Research Ruminations & Conceptual Conundrums

Theme 5: Platform Problems & Policy Puzzles


PART 1: High level Discord summaries

LMArena Discord


Manus.im Discord


Unsloth AI (Daniel Han) Discord


Interconnects (Nathan Lambert) Discord


OpenAI Discord


Latent Space Discord


Cursor Community Discord


OpenRouter (Alex Atallah) Discord


LM Studio Discord


Modular (Mojo 🔥) Discord


Yannic Kilcher Discord


HuggingFace Discord


Nous Research AI Discord


GPU MODE Discord


MCP (Glama) Discord


Notebook LM Discord


Eleuther Discord


Nomic.ai (GPT4All) Discord


Torchtune Discord


tinygrad (George Hotz) Discord


LlamaIndex Discord


Cohere Discord


DSPy Discord


Codeium (Windsurf) Discord


Gorilla LLM (Berkeley Function Calling) Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

LMArena ▷ #general (1329 messages🔥🔥🔥):

Faster Inference vs Smarter Models, Context Length Limits, Distilling Models, Super Fast Models, LLMs and Sentience


Manus.im Discord ▷ #general (852 messages🔥🔥🔥):

Manus credits, Open Manus GUI, Gemini vs. Claude, Prompt engineering tips, Alternative AI tools


Unsloth AI (Daniel Han) ▷ #general (245 messages🔥🔥):

VRAM Price Justification, 4-bit QAT, ZeroDivisionError with phi-4, Training Loss value for Llama3.2, Phi-4 Model Troubles


Unsloth AI (Daniel Han) ▷ #off-topic (8 messages🔥):

Vibe coding, Jailbreaking 4o, ChatGPT uncensored


Unsloth AI (Daniel Han) ▷ #help (211 messages🔥🔥):

Gemma3 Profiling OOM, GRPO Co-training Multiple Models, Fine-tuning LLaMA3.1 w/ Token IDs, Unsloth Pro Release, Hugging Face Packing Bug


Unsloth AI (Daniel Han) ▷ #showcase (14 messages🔥):

Naming Conventions, Dynamic Quantization, Unsloth Models


Unsloth AI (Daniel Han) ▷ #research (16 messages🔥):

GRPO approach, reward functions, multi-reward system, reward hacking, open source LLM for Spanish


Interconnects (Nathan Lambert) ▷ #news (368 messages🔥🔥):

Open Source SSM, Microsoft data center plans, Stealth Model on OpenRouter, Perplexity funding round, GPT-5 release schedule


Interconnects (Nathan Lambert) ▷ #random (12 messages🔥):

Video camera setup for remote talks, Deepseek chains of thought, Sam Altman releases o3 and o4-mini, LlamaCon


Interconnects (Nathan Lambert) ▷ #memes (15 messages🔥):

Claude's coding ability, Polars library, Context condensation issues, Scaling plots meme


Interconnects (Nathan Lambert) ▷ #rl (11 messages🔥):

Dr. GRPO intuition, RL introduction with GPT-4.5, Policy Gradient, GRPO training rollouts


Interconnects (Nathan Lambert) ▷ #reads (18 messages🔥):

Dwarkesh Patel scaling laws, Inference-time scalability of generalist RM, GPT-4o diffusion head, Building an efficient GPU server with RTX 4090s/5090s, OpenCodeReasoning dataset


OpenAI ▷ #ai-discussions (160 messages🔥🔥):

GPT-4o Rate Limits, MS Account profile pics, Copilot Event reaction, Copilot in VSCode explores consciousness, Veo 2 spotted in Gemini Advanced

Link mentioned: Discord: no description found


OpenAI ▷ #gpt-4-discussions (5 messages):

OpenAI Support, Account Issues, Red Team Supervision


OpenAI ▷ #prompt-engineering (90 messages🔥🔥):

OpenAI content policies, Adult content, Model Spec vs Usage Policies, Moderation endpoint, OpenAI's stance on adult toys

Link mentioned: OpenAI Model Spec: The Model Spec specifies desired behavior for the models underlying OpenAI's products (including our APIs).


OpenAI ▷ #api-discussions (90 messages🔥🔥):

OpenAI Content Policies, Model Spec vs Content Policies, Generating Adult Content, Moderation Endpoint, Internal Discord White Message Boxes


Latent Space ▷ #ai-general-chat (74 messages🔥🔥):

Anthropic Dev Conference, Biz Dev Tools, OpenRouterAI stealth model, Devin 2.0 price slash, A16Z 8x RTX 4090 GPU workstation


Latent Space ▷ #ai-in-action-club (255 messages🔥🔥):

LLM Codegen Workflow, Cursor vs Windsurf, Gemini Pro Hallucinations, File Forge and RepoMix, Cursor Context Management


Cursor Community ▷ #general (252 messages🔥🔥):

Cursor monthly subscription, Files on disk not updating, Cursor.so email legitimacy, Gemini pricing, GPT-5 release


OpenRouter (Alex Atallah) ▷ #announcements (1 message):

OpenRouter Fallback parameter, OpenRouter models array


OpenRouter (Alex Atallah) ▷ #app-showcase (2 messages):

OpenRouter API, Cloudflare AI Gateway, Missile Command game AI, Gameplay AI summary analysis, gemini-2.5-pro atari

Link mentioned: Missile Command: no description found


OpenRouter (Alex Atallah) ▷ #general (239 messages🔥🔥):

Quasar vs Gemini 2.5, OpenRouter Stealth Logging, DeepSeek Pricing, Quasar Alpha Errors, Gemini 2.5 Pro Availability


LM Studio ▷ #general (48 messages🔥):

Gemma 3 4b CUDA error, Importing models from HuggingFace to LM Studio, Run LM Studio model locally on n8n instance, Ollama Models incompatibility with LM Studio, LM Studio roadmap

Link mentioned: Import Models | LM Studio Docs: Use model files you've downloaded outside of LM Studio


LM Studio ▷ #hardware-discussion (61 messages🔥🔥):

LM Studio VRAM prediction, M-series Mac vs NVIDIA 4090 for LLM inference, Mixed GPU systems with LM Studio, Reka Flash 21B vs Gemma3 27, Fine-tuning on Nvidia vs Inference on Mac


Modular (Mojo 🔥) ▷ #general (48 messages🔥):

Mojo vs C SIMD intrinsics, EmberJson Library, Sonic-cpp Library, Modular stdlib, magic package manager


Modular (Mojo 🔥) ▷ #mojo (57 messages🔥🔥):

Python wrappers for Mojo, Mojo arbitrary-precision integers, NDBuffer instance creation, Copyable types in Mojo, MLIR regions in Mojo


Yannic Kilcher ▷ #general (52 messages🔥):

Google's competitive advantages, Dynamic vs Static Architectures, Token Embeddings and Manifold Hypothesis, RL-driven Diffusion Model


Yannic Kilcher ▷ #paper-discussion (9 messages🔥):

Math PhD AI questions, o1-pro AI model, Variational Diffusion Models, Stochastic Differential Equations, Stable Diffusion paper

Link mentioned: Variational Diffusion Models: Diffusion-based generative models have demonstrated a capacity for perceptually impressive synthesis, but can they also be great likelihood-based models? We answer this in the affirmative, and introdu...


Yannic Kilcher ▷ #ml-news (28 messages🔥):

GPT-4o release, Stability AI's Stable Virtual Camera, Claude vs. OpenAI benchmarks, Apache Parquet RCE vulnerability, OpenAI's GPT-5 plans


HuggingFace ▷ #general (61 messages🔥🔥):

RAG implementation code size, Hugging Face Spaces port restrictions, London, Paris, Berlin AI HackXelerator, Zero GPU Quota, InferenceClient with a local model


HuggingFace ▷ #today-im-learning (2 messages):

LangGraph units


HuggingFace ▷ #i-made-this (1 message):

ZeroGPU, Sentence Transformers, Azure SQL DB vector features, DBA Scripts

Link mentioned: Sqlserver Lib Assistant - a Hugging Face Space by rrg92: no description found


HuggingFace ▷ #smol-course (3 messages):

ApiModel class extension for free providers (g4f), GeoCoding API, ISO 3166-1 alpha-2 code for the country, LLM and alpha-2 codes


HuggingFace ▷ #agents-course (9 messages🔥):

Gradio Version, Multi-Agent System vs Single-Agent System, Inference Monthly Credits, Local Model Solution, BraveSearch API


Nous Research AI ▷ #general (54 messages🔥):

AI Prompt Filmmaking, Runway Gen 4, Alibaba Wan 2.2, Devin 2.0 IDE, Llama 4


Nous Research AI ▷ #ask-about-llms (8 messages🔥):

LLMs for extraction, Genstruct 7B, OllamaGenstruct, Deepseek API, OLMo and Mistral for PDFs


Nous Research AI ▷ #research-papers (2 messages):

Deepseek new paper, Reinforcement Learning for LLMs, Inference-time scalability of generalist RM, Self-Principled Critique Tuning (SPCT)

Link mentioned: Inference-Time Scaling for Generalist Reward Modeling: Reinforcement learning (RL) has been widely adopted in post-training for large language models (LLMs) at scale. Recently, the incentivization of reasoning capabilities in LLMs from RL indicates that $...


Nous Research AI ▷ #interesting-links (3 messages):

Camel Matrix AI, Claude Squad


Nous Research AI ▷ #research-papers (2 messages):

Deepseek, Reinforcement Learning, Reward Modeling

Link mentioned: Inference-Time Scaling for Generalist Reward Modeling: Reinforcement learning (RL) has been widely adopted in post-training for large language models (LLMs) at scale. Recently, the incentivization of reasoning capabilities in LLMs from RL indicates that $...


GPU MODE ▷ #general (15 messages🔥):

GPU vs CPU, GRPO Model Compilation Speed, Computer Architecture Book Recommendation


GPU MODE ▷ #triton (2 messages):

Triton index backward op implementation, tl.make_block_ptr() usage, atomic_add performance in Triton


GPU MODE ▷ #cuda (13 messages🔥):

cuBLAS occupancy, CUDA debugging over SSH, cuTILS release date, nvshmem + MPI race conditions


GPU MODE ▷ #torch (4 messages):

Warmup Iteration, Pytorch Model, GPU memory, Inference on two separate batches, Streams


GPU MODE ▷ #cool-links (6 messages):

Cerebras, Hardware vendor tier list, Blackwell, Deeper hardware dives


GPU MODE ▷ #jobs (1 message):

AI Engineer, Agentic LLM Startup, RAG, Python, Tensorflow


GPU MODE ▷ #beginner (7 messages):

C vs C++ in CUDA, Centralized GPU programming languages, OpenCL's lack of mainstream adoption, ROCm and HIP support across vendors, GPU Architecture variations


GPU MODE ▷ #irl-meetup (5 messages):

SoCal/San Diego events, ICLR 2025 in Singapore, Silicon Valley meetups this summer, SF Meetups


GPU MODE ▷ #rocm (3 messages):

hipcc Casting, rocblas_gemm_ex with hipMallocManaged


GPU MODE ▷ #self-promotion (2 messages):

CUDA Kernel Design, URDF Visualizer with AI


GPU MODE ▷ #reasoning-gym (9 messages🔥):

ReasoningGymDataset Definitions, LLM-based RL Frameworks, Training Models with RG Data

Link mentioned: reasoning-gym/training/utils/datasets.py at main · open-thought/reasoning-gym: procedural reasoning datasets. Contribute to open-thought/reasoning-gym development by creating an account on GitHub.


GPU MODE ▷ #submissions (1 message):

Leaderboard Submission Success, Modal Runners on B200


MCP (Glama) ▷ #general (53 messages🔥):

MCP Clients vs Servers, MCP and React Code Generation, MCP learning resources, OAuth in MCP, Streamable HTTP for MCP Servers


MCP (Glama) ▷ #showcase (7 messages):

Datadog MCP, MCP Browser Kit, MCP Tool Poisoning, MCP Server Search, MCP-K8s Server


Notebook LM ▷ #announcements (1 message):

User Feedback, Study Participants


Notebook LM ▷ #use-cases (7 messages):

IntentSim.org, D&D sessions in NotebookLM, Seinfeld duo on GenAI


Notebook LM ▷ #general (38 messages🔥):

Deeper Cognitive Capacity of NotebookLM, PDF Understanding Enhancement, Discover new sources within NotebookLM, Deep Search features rollout, ImageMaps or mind maps with images


Eleuther ▷ #general (9 messages🔥):

Startup for Scaling AI Ideas, Decline in Interesting Research, Non-Agentic AI Research, RAG Evaluation with lm-evaluation-harness


Eleuther ▷ #research (6 messages):

OpenThoughts-1M, OpenThinker2-32B/7B, Ludwig Schmidt, Bespokelabs, LAION


Eleuther ▷ #scaling-laws (2 messages):

Inference Scaling Laws, Test-Time Scaling, Language Model Power Laws, Mathematical Problem Solving with LLMs, Multimodal Jailbreaking


Eleuther ▷ #interpretability-general (9 messages🔥):

Steering Vector Composition, Dynamic Activation Composition, Learned Steering Vectors, Function Vectors


Eleuther ▷ #lm-thunderdome (14 messages🔥):

lm-eval-harness EOS token, Huggingface tokenization, encode_pair changes

Link mentioned: lm-evaluation-harness/lm_eval/api/model.py at 11ac352d5f670fa14bbce00e423cff6ff63ff048 · EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of language models. - EleutherAI/lm-evaluation-harness


Nomic.ai (GPT4All) ▷ #general (23 messages🔥):

Chat reorganization, Lightweight model for price extraction, GPT4All's Quietness, Gemini 2.5 Pro for coding and math, Migrating data between SSDs


Torchtune ▷ #dev (18 messages🔥):

Packed Datasets, Chunking Responsibility, NeMo's Resilient Training

Link mentioned: fix: Timeout crash because of chunked_output len by bogdansalyp · Pull Request #2560 · pytorch/torchtune: Context: What is the purpose of this PR? Is it to add a new feature, fix a bug, update tests and/or documentation, other (please add here). Please link to any issues this PR addresses - closes #25...


Torchtune ▷ #papers (2 messages):

AI-2027 report, superhuman AI impact

Link mentioned: AI 2027: A research-backed AI scenario forecast.


tinygrad (George Hotz) ▷ #general (13 messages🔥):

leetgpu tinygrad support, Huawei Ascend cards, WEBGPU BEAM limitations, maxComputeInvocationsPerWorkgroup issue


tinygrad (George Hotz) ▷ #learn-tinygrad (4 messages):

Distinguishable Instances, tinygrad Karpathy GPT Reimplementation, Metal Buffer Limit


LlamaIndex ▷ #blog (1 message):

Multimodal Chat History, Multi-Agent Systems


LlamaIndex ▷ #general (7 messages):

PatentsView API, Workflow to Tool transformation


LlamaIndex ▷ #ai-discussion (4 messages):

LlamaParse, LVM, image processing


Cohere ▷ #「💬」general (4 messages):

AYA vision errors, AWS Bedrock


Cohere ▷ #「🤝」introductions (4 messages):

Full-Stack Developer Introduction, Product Analyst Exploring AI Writing, Web3/AI Engineer Introduction


DSPy ▷ #general (1 message):

Asyncio Support for DSPy


Codeium (Windsurf) ▷ #announcements (1 message):

DeepSeek-V3 Upgrade

Link mentioned: Tweet from Windsurf (@windsurf_ai): DeepSeek-V3 has now been upgraded to DeepSeek-V3-0324. It's still free!


Gorilla LLM (Berkeley Function Calling) ▷ #discussion (1 message):

robotsail: Np! Let me know if you have any questions or need me to change/retest anything




{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AINews, please share with a friend! Thanks in advance!

{% endif %}