Frozen AI News archive

Gemini 2.5 Pro + 4o Native Image Gen

**Gemini 2.5 Pro** from **Google DeepMind** is the new top AI model, surpassing **Grok 3** by 40 LMarena points, with **Noam Shazeer**'s involvement suggesting that Flash Thinking techniques have been folded in. It excels at reasoning, coding, STEM, multimodal tasks, and instruction following, and is available today as a free, rate-limited experimental model via Google AI Studio and the Gemini App. Meanwhile, **OpenAI** released **GPT 4o Native Images**, an autoregressive image generation model, with architectural details shared by **Allan Jabri** and credit given to **Gabe Goh**.

Canonical issue URL

AI News for 3/24/2025-3/25/2025. We checked 7 subreddits, 433 Twitters and 29 Discords (228 channels, and 6171 messages) for you. Estimated reading time saved (at 200wpm): 566 minutes. You can now tag @smol_ai for AINews discussions!

Both frontier lab releases from today were title-page worthy, so they will have to share space.

Gemini 2.5 Pro

Gemini 2.5 Pro is the new undisputed top model in the world, a whopping 40 LMarena points ahead of Grok 3 from just last month (our coverage here). Noam Shazeer's involvement hints that the learnings from Flash Thinking have been merged into Pro (odd that 2.5 Pro came out before 2.5 Flash?).

image.png

Simon Willison, Paul Gauthier (aider), Andrew Carr and others all have worthwhile quick hits to the theme of "this model is SOTA".

Pricing is not yet announced, but you can use it today as a free, rate-limited "experimental model".

GPT 4o Native Images

Hot on the heels of yesterday's Reve Image and Gemini's Native Image Gen, OpenAI finally released their 4o native image gen with a livestream, blogpost, and system card confirming that it is an autoregressive model. The most detail we'll probably get for now about how it works is this image from Allan Jabri, who worked on the original 4o image gen that was never released (then taken over by Gabe Goh, as sama credits him).

image.png

A wide image taken with a phone of a glass whiteboard, in a room overlooking the Bay Bridge. The field of view shows a woman writing, sporting a t-shirt with a large OpenAI logo. The handwriting looks natural and a bit messy, and we see the photographer's reflection. The text reads: (left) "Transfer between Modalities: Suppose we directly model p(text, pixels, sound) [equation] with one big autoregressive transformer. Pros: * image generation augmented with vast world knowledge * next-level text rendering * native in-context learning * unified post-training stack Cons: * varying bit-rate across modalities * compute not adaptive" (right) "Fixes: * model compressed representations * compose autoregressive prior with a powerful decoder" On the bottom right of the board, she draws a diagram: "tokens -> [transformer] -> [diffusion] -> pixels"
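The diagram on the board can be sketched as a two-stage pipeline: an autoregressive prior samples compressed image tokens, and a separate decoder maps those tokens to pixels. Below is a minimal toy sketch of that shape in Python. Every component here is a stand-in stub (the "prior" samples randomly and the "decoder" is a lookup, not a diffusion model); the function names, vocabulary size, and image dimensions are all invented for illustration and are not OpenAI's actual design.

```python
import random


def autoregressive_prior(prompt_tokens, n_image_tokens, vocab_size=16, seed=0):
    """Stub prior: emits compressed image tokens one at a time.

    A real transformer would compute p(next token | all tokens so far);
    here we just sample uniformly to show the autoregressive loop shape.
    """
    rng = random.Random(seed)
    tokens = list(prompt_tokens)  # condition on the "text" tokens
    for _ in range(n_image_tokens):
        tokens.append(rng.randrange(vocab_size))
    return tokens[len(prompt_tokens):]  # return only the image tokens


def decoder(image_tokens, height=4, width=4):
    """Stub decoder: maps each compressed token to a pixel intensity.

    A real system would run a learned (e.g. diffusion) decoder
    conditioned on the tokens; this just scales token ids to 0-255.
    """
    pixels = [t * 255 // 15 for t in image_tokens]
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


# tokens -> [transformer] -> [diffusion] -> pixels
prompt = [1, 2, 3]  # pretend text tokens
image_tokens = autoregressive_prior(prompt, n_image_tokens=16)
img = decoder(image_tokens)
```

The point of the "fixes" column is visible even in the stub: the prior only ever handles a short sequence of compressed tokens (16 here, not 16 raw pixels per token), leaving the expensive pixel-level work to the decoder stage.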


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

Model Releases and Announcements

Benchmarks and Performance Evaluations

AI Applications and Tools

Research and Development

AI Ethics and Societal Impact

Humor and Miscellaneous


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. DeepSeek V3 0324 Tops Non-Reasoning Model Charts

Theme 2. Dynamic Quants for DeepSeek V3 Boost Deployments

Theme 3. Gemini 2.5 Pro Dominates Benchmarks with New Features

Theme 4. Affordable AI Hardware: Phi-4 Q4 Server Builds

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding

Theme 1. DeepSeek V3 Outperforming GPT-4.5 in New Benchmarks

Theme 2. OpenAI 4o Revolutionizing Image Generation

Theme 3. OpenAI's Enhanced AI Voice Chat Experience


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.0 Flash Thinking

Theme 1. Gemini 2.5 Pro: Benchmarks Blasted, Arena Annihilated

Theme 2. DeepSeek V3: Coding Champ and Reasoning Renegade

Theme 3. Context is King: Tools and Techniques for Managing LLM Memory

Theme 4. Image Generation Gets a 4o-verhaul and New Challenger Emerges

Theme 5. Quantization and Optimization: Squeezing More from LLMs


PART 1: High level Discord summaries

LMArena Discord


Perplexity AI Discord


Cursor Community Discord


OpenAI Discord


aider (Paul Gauthier) Discord


Unsloth AI (Daniel Han) Discord


Interconnects (Nathan Lambert) Discord


OpenRouter (Alex Atallah) Discord


Nous Research AI Discord


Latent Space Discord


GPU MODE Discord


HuggingFace Discord


MCP (Glama) Discord


Notebook LM Discord


LM Studio Discord


Cohere Discord


Torchtune Discord


Nomic.ai (GPT4All) Discord


LlamaIndex Discord


Eleuther Discord


Modular (Mojo 🔥) Discord


DSPy Discord


tinygrad (George Hotz) Discord


Codeium (Windsurf) Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

LMArena ▷ #general (916 messages🔥🔥🔥):

Nebula vs other models, Gemini 2.5 models are out, Grok 3 issues, Llama 4 release, DeepSeek R2

Links mentioned:


Perplexity AI ▷ #announcements (1 messages):

Perplexity answer modes, vertical search, web and mobile


Perplexity AI ▷ #general (991 messages🔥🔥🔥):

Perplexity downtime, Electron vs native apps, DeepSeek V3, Image Generation, AI Models for Accuracy

Links mentioned:


Perplexity AI ▷ #sharing (7 messages):

Perplexity AI searches, AI Analysis, Coinmarketcap API, Aircraft Material


Perplexity AI ▷ #pplx-api (11 messages🔥):

truncated responses, Sonar Model, Perplexity Pro API credits, Sonar Pro in iOS app, API request costs


Cursor Community ▷ #general (872 messages🔥🔥🔥):

Augment vs Cursor codebase analysis, Claude 3.7 MAX vs Claude 3.7, Vibe Coder & MCPs, New Deepseek V3 and Gemini 2.5, ASI Singularity

Links mentioned:


OpenAI ▷ #annnouncements (2 messages):

4o image generation, ChatGPT, Sora


OpenAI ▷ #ai-discussions (300 messages🔥🔥):

GPT-4o mini vs Gemini 2.0 Flash for property tag extraction, Operator OAI expansion plans, GPT context window limitations and hallucinations, Best AI models for various tasks, DeepSeek V3 03-24's playful behavior

Links mentioned:


OpenAI ▷ #gpt-4-discussions (2 messages):

Chat GPT Speed Degradation


OpenAI ▷ #prompt-engineering (159 messages🔥🔥):

Proprietary prompt technique for memory retention, Benchmarking AI performance, Open sourcing prompts, Using python to prompt chatgpt, Building custom GPTs


OpenAI ▷ #api-discussions (159 messages🔥🔥):

Proprietary AI system, Dynamic cognitive architecture, Runtime OS through prompt, Large context maintenance, GPL release discussion


OpenAI ▷ #api-projects (1 messages):

FormulaGPT, AI Racing Simulator, Open Source AI, AI Strategy Decisions

Link mentioned: GitHub - dawid-maj/FormulaGPT: FormulaGPT – AI-powered Formula 1 race simulator with real-time team management and strategy decisions.: FormulaGPT – AI-powered Formula 1 race simulator with real-time team management and strategy decisions. - dawid-maj/FormulaGPT


aider (Paul Gauthier) ▷ #announcements (1 messages):

Gemini 2.5 Pro support, DeepSeek V3 0324 support, aider /context command, aider /edit alias, Claude 3.7 Sonnet 'overeager' mode


aider (Paul Gauthier) ▷ #general (532 messages🔥🔥🔥):

DeepSeek V3 performance, Gemini 2.5 Pro release, GPT-4o image generation, aider /context command

Links mentioned:


aider (Paul Gauthier) ▷ #questions-and-tips (39 messages🔥):

Deepseek API Usage with Aider, LLM Hallucinations, Aider's Architecture Mode, NotebookLM for Context Priming, OpenRouter Configuration Issues

Links mentioned:


Unsloth AI (Daniel Han) ▷ #general (284 messages🔥🔥):

HF Transformers, Deepseek model patching, FP8 Loading Scheme, Quantization impact on model performance, GGUF Uploads

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (11 messages🔥):

Funny Graphs in CS Research, Benchmark Dissatisfaction, AI Project Recruitment


Unsloth AI (Daniel Han) ▷ #help (66 messages🔥🔥):

Deepseek facts learning, Unsloth Gemma 3 27b error, phi4 fine tuning with unsloth, Medical bot fine tuning, LightRAG issues

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (1 messages):

GRPO on AWS, Tensorfuse, LoRA modules

Links mentioned:


Unsloth AI (Daniel Han) ▷ #research (13 messages🔥):

GRPO limitations, FFN Fusion for LLMs, DAPO experiments, AI Project Job Spam, Transformer Talk

Link mentioned: Paper page - FFN Fusion: Rethinking Sequential Computation in Large Language Models: no description found


Interconnects (Nathan Lambert) ▷ #news (232 messages🔥🔥):

Qwen VL series training details, Qwerky-72B transformerless model, Gemini 2.5 Pro performance, 4o image generation, AI Studio vs Gemini Advanced

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-questions (26 messages🔥):

Labs hillclimb benchmarks, Home inference LLMs, vLLM, OpenRouter, Model Evals

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-drama (1 messages):

Political Favor, American Politics, Business Strategy


Interconnects (Nathan Lambert) ▷ #random (28 messages🔥):

AI Threads lists on social media, Verification as key to AI, MCP malware reverse engineering

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (52 messages🔥):

Gooning Urban Dictionary meaning, DeepSeek-LLM and DeepSeek-MoE, Mistral Small Versioning Confusion, GPT4o Image Generation vs Gemini, GRPO implementation in trl

Links mentioned:


Interconnects (Nathan Lambert) ▷ #rl (14 messages🔥):

DPO vs Sampling, RL training and Entropy, SimpleRL-Zoo, DAPO Objective

Link mentioned: Tweet from Junxian He (@junxian_he): Two months ago, we open-sourced the first R1-like zero RL training project on math with the Qwen2.5-math model. Since then, many great works performed successful zero RL training, mostly based on Qwen...


Interconnects (Nathan Lambert) ▷ #reads (2 messages):

Claude Code, Anthropic, Anysphere's Cursor, Codium's Windsurf, npm package

Links mentioned:


Interconnects (Nathan Lambert) ▷ #lectures-and-projects (2 messages):

Claude PR, Header Copy Links

Link mentioned: (experimental) Add heading anchor links for easy section linking by natolambert · Pull Request #82 · natolambert/rlhf-book: Add copyable links to all headings that appear on hoverLinks copy the current URL with fragment identifier to clipboardAdd CSS for styling the anchor linksUpdate Makefile to copy new JS file to ...


OpenRouter (Alex Atallah) ▷ #announcements (3 messages):

Anthropic incident, Claude 3.7 Sonnet endpoints, Zero-Token Insurance, Google Gemini 2.5 Pro Experimental

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (196 messages🔥🔥):

Deepseek performance issues, Gemini 2.5 Pro release and benchmarks, Provisioning API keys for user management, OpenRouter activity log retention, GPT-4o image generation API support

Links mentioned:


Nous Research AI ▷ #general (148 messages🔥🔥):

Bits per Weight (BPW), Model Capacity Scaling, DeepSeek V3, Gemini 2.5 Pro, AI IDE Evaluation

Links mentioned:


Nous Research AI ▷ #ask-about-llms (26 messages🔥):

Add and Sigmoid vs Add and Norm, Scaling Experts at Inference Time, Transformers without Normalization, LLM-Emulated Raytracing, Indirect Image Generation with LLMs

Link mentioned: llmbenchmark/raytracer at master · cpldcpu/llmbenchmark: Various LLM Benchmarks. Contribute to cpldcpu/llmbenchmark development by creating an account on GitHub.


Latent Space ▷ #ai-general-chat (136 messages🔥🔥):

Reve Image, Qwen 2.5, ARC-AGI, 11x fraud, Zep knowledge graphs

Links mentioned:


Latent Space ▷ #ai-announcements (1 messages):

swyxio: new talk/post from me: https://x.com/swyx/status/1904256213661192405


GPU MODE ▷ #general (7 messages):

Audio processing with ilgpu + cufft + kernels, Asynchronous data transfer to GPU with OnnxRuntime on CUDA, Double buffering with CUDA streams, FSDP fine tuning with trl library


GPU MODE ▷ #triton (9 messages🔥):

Triton Interpret Bug, Intel Triton Extension, Triton Compile Script, Prune Configs Support in Triton

Link mentioned: ML-Triton, A Multi-Level Compilation and Language Extension to Triton GPU Programming: In the era of LLMs, dense operations such as GEMM and MHA are critical components. These operations are well-suited for parallel execution using a tilebased approach. While traditional GPU programming...


GPU MODE ▷ #cuda (24 messages🔥):

CUDA swizzling, cuTensorMap issues, Flash Attention memory layout, Cutensor coordinate mapping

Link mentioned: Some question about creating CUtensorMap and use it: I have some questions about the following code. constexpr int row = 8; constexpr int col = 64; size_t byteSize = row * col * sizeof(int); int* h_data = (int*)malloc(byteSize); int*...


GPU MODE ▷ #torch (2 messages):

PyTorch allocator, torch.nn.utils.prune


GPU MODE ▷ #jobs (1 messages):

AMD GPU support in Triton, Job postings for Triton developers

Links mentioned:


GPU MODE ▷ #jax (1 messages):

bigfoot1144: Any progress so far?


GPU MODE ▷ #rocm (2 messages):

gpumode leaderboard, MI250 node, AMD Instinct MI250 evaluation


GPU MODE ▷ #tilelang (5 messages):

TileLang Compatibility with Torch AOTexport, TileLang compilation for AMD, Custom Triton Kernels


GPU MODE ▷ #sparsity-pruning (3 messages):

Iterative Pruning, Double Pruning Weights, Pruning Ratio Calculations


GPU MODE ▷ #liger-kernel (2 messages):

lce_forward_deprecated vs lce_forward


GPU MODE ▷ #metal (3 messages):

Open Source ML platform building


GPU MODE ▷ #self-promotion (5 messages):

fp16 MatMul for Gemma3, Gemma3 Residuals, CUDA Execution Time Benchmarks, Inferless on Product Hunt

Link mentioned: gemma3 fp16 fix by mobicham · Pull Request #36832 · huggingface/transformers: What does this PR do?Fixes float16 inference with Gemma 3 models by simply clipping the activations. The residual addition step should also be clipped for more accurate outputs. Without this fix, ...


GPU MODE ▷ #reasoning-gym (10 messages🔥):

ARC-AGI-2 Benchmark, Reasoning-Gym Puzzles, verL and vLLM0.8.1, Codegen Updates, RL Research Directions

Links mentioned:


GPU MODE ▷ #gpu模式 (1 messages):

Flash Attention Layout, Tensor Layout, Performance Optimization


GPU MODE ▷ #general (6 messages):

Conv2d compilation errors, CUDA compilation issues, PyTorch C++ extension problems, load_inline issues


GPU MODE ▷ #submissions (3 messages):

H100 benchmarks, T4 vectorsum, A100 grayscale


HuggingFace ▷ #general (40 messages🔥):

DeepSeek as moderation bot, Numerical RAG with Databricks, Fine-tuning open-source LLMs, VAD tool language agnostic, Hugging Face AgentX competition

Links mentioned:


HuggingFace ▷ #today-im-learning (1 messages):

ynvers256: Today I'm learning Renforcement learning and make research about Eureka


HuggingFace ▷ #cool-finds (1 messages):

Aider + Zed, Codium's Windsurf, TabNine, Cursor


HuggingFace ▷ #i-made-this (14 messages🔥):

Audio Extraction Tool in Rust, Achievement Token System, Music Generation System

Links mentioned:


HuggingFace ▷ #computer-vision (2 messages):

AutoCAD drawing generation, Metric scale at object location


HuggingFace ▷ #NLP (1 messages):

Unstructured Data to JSON Conversion, LLM Fine-tuning Datasets


HuggingFace ▷ #gradio-announcements (1 messages):

Gradio Deep Links, Gradio 5.23

Link mentioned: black-forest-labs/FLUX.1-schnell: no description found


HuggingFace ▷ #agents-course (4 messages):

Llama-3.2 and LlamaIndex.ai, Ollama setup, BAAI/bge-base-en-v1.5, Custom Tool help

Link mentioned: Starter Tutorial (Using Local LLMs) - LlamaIndex: no description found


MCP (Glama) ▷ #general (39 messages🔥):

New MCP Mod, Nexus context management system for AI coding assistants, Atom of Thoughts for Claude, Deepseek V3 with AOT, Running Multiple MCP Servers

Links mentioned:


MCP (Glama) ▷ #showcase (23 messages🔥):

Speech MCP, gotoHuman MCP Server, Apple MCP tools, VNC control via Claude

Links mentioned:


Notebook LM ▷ #use-cases (7 messages):

NotebookLM, Versatile Bot Project, Interactive mode, Chat Episode Prompt, Delivery pacing


Notebook LM ▷ #general (47 messages🔥):

Google Cloud Platform Billing, Workspace Data Export Tool, NotebookLM Plus subscription benefits, Mind Map Feature rollout, Multilingual Podcast

Link mentioned: Export your users' data - Google Workspace Admin Help: no description found


LM Studio ▷ #general (19 messages🔥):

Universal Translator, Mozilla's Transformer Lab, GPU Tokenization, Gemini 2.5 Pro

Link mentioned: GitHub - transformerlab/transformerlab-app: Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.: Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer. - transformerlab/transformerlab-app


LM Studio ▷ #hardware-discussion (23 messages🔥):

VRAM limits on GPUs, Docker container overhead with GPUs, 3090 Ti speed, M4 Max power consumption with 32B models, ROCm support for AMD GPUs


Cohere ▷ #「💬」general (17 messages🔥):

Data Retention, Security, Data Privacy, Data Usage Policy, Zero Data Retention (ZDR)

Links mentioned:


Cohere ▷ #「🔌」api-discussions (16 messages🔥):

Cohere API streaming, Cohere embedding generator, Cohere tokenization

Link mentioned: Chat with Streaming — Cohere: Generates a text response to a user message. To learn how to use the Chat API and RAG follow our Text Generation guides.Follow the Migration Guide for instructions on moving from API v1 to API v2.


Cohere ▷ #「🤖」bot-cmd (2 messages):

``


Cohere ▷ #「🤝」introductions (2 messages):

NLP project, Text summarization tool, Introduction of Sage


Torchtune ▷ #announcements (1 messages):

torchtune v0.6.0, Tensor Parallel, Phi 4, Multinode training


Torchtune ▷ #general (14 messages🔥):

DeepSeek-V3, Quantization Aware Training, MoEs in torchtune

Links mentioned:


Torchtune ▷ #dev (20 messages🔥):

CUDA overhead, Cursed Submodule, vLLM + GRPO, r1-zero


Nomic.ai (GPT4All) ▷ #general (31 messages🔥):

LocalDocs Backup, Privacy Considerations for Chat Data, Local LLM vs API for Message Processing, LocalDocs DB Import, Lost LocalDocs DB


LlamaIndex ▷ #blog (2 messages):

LlamaCloud MCP Server, Build an MCP server in Python


LlamaIndex ▷ #general (25 messages🔥):

Claude MCP support, Multi-agent performance with LlamaIndex, Agent types in LlamaIndex, Automatic LLM evaluations

Links mentioned:


Eleuther ▷ #general (4 messages):

AI IDE Evaluation, SAEs on CoTs, DeepSeek-V3-0324 showcase, Gemini 2.5 Pro

Links mentioned:


Eleuther ▷ #research (12 messages🔥):

SkyLadder short-to-long context window transition, Data-constrained pretraining for math, Composable Generalization

Links mentioned:


Eleuther ▷ #scaling-laws (1 messages):

Chinchilla Scaling Formula, Impact of Suboptimal Hyperparameters, Learning Rate Effects


Eleuther ▷ #multimodal-general (1 messages):

Self-organizing AI, AI Building Blocks


Eleuther ▷ #gpt-neox-dev (5 messages):

gpt-neox CI status, lm_eval upgrade

Links mentioned:


Modular (Mojo 🔥) ▷ #mojo (15 messages🔥):

Mojo for Website vs Rust, AES in Mojo Progress, SIMD for AES, Rust vs Go for Backend


Modular (Mojo 🔥) ▷ #max (2 messages):

CUDA-free, PTX for targeting NVIDIA GPUs


DSPy ▷ #general (9 messages🔥):

Text Summarization with DSPy, SIMBA Optimizer, Output Refinement vs Assertions, BestOfN Module, Refine Module

Link mentioned: Output Refinement - DSPy: The framework for programming—rather than prompting—language models.


tinygrad (George Hotz) ▷ #general (3 messages):

ROCm Support, OpenCL Front End, AMD GPUs with Tinygrad


Codeium (Windsurf) ▷ #announcements (1 messages):

Windsurf Creators Club, Vibe Coding Channel, Windsurf v1.5.8 Release

Link mentioned: no title found: no description found





{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}