Frozen AI News archive

not much happened today

**Gemini 2.5 Pro** shows strengths and weaknesses: it notably lacks LaTeX math rendering, unlike **ChatGPT**, and it scored **24.4%** on the **2025 US AMO**. **DeepSeek V3** ranks 8th and 12th on recent leaderboards. **Qwen 2.5** models have been integrated into the **PocketPal** app. Research from **Anthropic** reveals that **Chain-of-Thought (CoT)** reasoning is often unfaithful, especially on harder tasks, raising safety concerns. **OpenAI**'s **PaperBench** benchmark shows that AI agents struggle with long-horizon planning, with **Claude 3.5 Sonnet** achieving only **21.0%** accuracy. The **CodeAct** framework generalizes **ReAct** for dynamic code writing by agents. **LangChain** explains multi-agent handoffs in LangGraph. **Runway Gen-4** marks a new phase in media creation.

AI News for 4/2/2025-4/3/2025. We checked 7 subreddits, 433 Twitters and 30 Discords (230 channels, and 5764 messages) for you. Estimated reading time saved (at 200wpm): 552 minutes. You can now tag @smol_ai for AINews discussions!

Devin cut prices, and Quasar Alpha, with its 1M-token context window, might be either the new OpenAI open-weights model or Meta's Llama 4, but neither seemed substantial enough to make the title story.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

Large Language Models (LLMs) and Model Performance

AI Tools, Frameworks, and Agent Development

Model Context Protocol (MCP)

AI and Education

AI and Geopolitics/Economics

Humor/Memes


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. "Advancements in AI Model Optimization and Evaluation"

Theme 2. "Exploring Enhancements in Gemma 3 Model Versions"

Theme 3. "Optimizing AI Models with GPU Servers and Insights"

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding

Theme 1. "Navigating AI's Impact on Graphic Design Careers"

Theme 2. The Dual Edge of AI: Innovation and Anxiety


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.5 Pro Exp

Theme 1: Model Mania - New Releases, Rivalries, and Benchmarks

Theme 2: Tooling Up - Platform Updates, Integrations, and User Workflows

Theme 3: Under the Hood - Technical Hurdles and Hardware Headaches

Theme 4: Framework Focus - MCP, Mojo, Torchtune & More

Theme 5: Community & Industry Buzz - Funding, Feedback, and Policy Fights


PART 1: High level Discord summaries

Manus.im Discord


LMArena Discord


Cursor Community Discord


Unsloth AI (Daniel Han) Discord


OpenAI Discord


Perplexity AI Discord


Interconnects (Nathan Lambert) Discord


aider (Paul Gauthier) Discord


LM Studio Discord


OpenRouter (Alex Atallah) Discord


HuggingFace Discord


MCP (Glama) Discord


Notebook LM Discord


Latent Space Discord


Yannick Kilcher Discord


GPU MODE Discord


Modular (Mojo 🔥) Discord


Torchtune Discord


Eleuther Discord


tinygrad (George Hotz) Discord


LlamaIndex Discord


Nous Research AI Discord


Cohere Discord


Nomic.ai (GPT4All) Discord


DSPy Discord


Gorilla LLM (Berkeley Function Calling) Discord


Codeium (Windsurf) Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Manus.im Discord ▷ #showcase (1 message):

liewxinyen: awesome case from <@356472623456059392> 🤩


Manus.im Discord ▷ #general (807 messages🔥🔥🔥):

Brazilian Lawyer using AI, ReferrerNation's BPO job matchmaking platform, Learning code with AI assistance, Claude for report writing, AI competition from China


LMArena ▷ #general (1010 messages🔥🔥🔥):

Meta vision model Cotton, Qwen2.5-vl-32b-instruct OCR, Google Gemini Models, Nightwhisper model on webdev, Gemini 2.6 Pro experimental


LMArena ▷ #announcements (1 message):

Mobile Alpha UI, LM Arena Access, Alpha Feedback


Cursor Community ▷ #general (772 messages🔥🔥🔥):

Restoring to previous checkpoints, Roo code, Boomerang Mode, Gemini Pro EXP vs Pro MAX, Windsurf vs Cursor tab


Unsloth AI (Daniel Han) ▷ #general (217 messages🔥🔥):

ECC Errors in EC2 Instances, Gemma 3 Bug with Custom Dataset, Unsloth Apple Silicon Support, Fine-tuning LLaSA with Unsloth, RTX 5090 vs RTX 4090 for Unsloth


Unsloth AI (Daniel Han) ▷ #off-topic (5 messages):

Job transition, Product Manager, Vibe Coder


Unsloth AI (Daniel Han) ▷ #help (236 messages🔥🔥):

Unsloth batch size, SFTTrainer Usage, Gemma3 finetuning, Qwen2.5 Image Size, GRPO and CPU Bottleneck


Unsloth AI (Daniel Han) ▷ #showcase (2 messages):

GRPO Trainer Implementation, Unsloth Techniques, Collab Notebook, DeepSpeed alternative

Link mentioned: documentation/Unsloth-GRPO.ipynb at 9.0 · xyehya/documentation: Odoo documentation sources. Contribute to xyehya/documentation development by creating an account on GitHub.


Unsloth AI (Daniel Han) ▷ #research (6 messages):

GRPO/PPO, Continue Pretraining Llama, Bespoke Labs new models


OpenAI ▷ #ai-discussions (103 messages🔥🔥):

Gemini vs Grok, Manus deceptive?, AI coding, OpenAI value for money


OpenAI ▷ #gpt-4-discussions (4 messages):

Livekit framework, GPT-4o tasks, Red team members supervision


OpenAI ▷ #prompt-engineering (130 messages🔥🔥):

AI Image Generation, Model Behavior, Content Policies vs Model Spec, Adult Content Generation

Link mentioned: OpenAI Model Spec: The Model Spec specifies desired behavior for the models underlying OpenAI's products (including our APIs).


OpenAI ▷ #api-discussions (130 messages🔥🔥):

Image generation with glow effects, Model Spec vs Content Policies, Adult content generation, Image Editing Improvements

Link mentioned: OpenAI Model Spec: The Model Spec specifies desired behavior for the models underlying OpenAI's products (including our APIs).


Perplexity AI ▷ #general (337 messages🔥🔥):

Perplexity Pulse Program, Deep Research Updates, Gemini 2.5 vs Perplexity O1, Android App Home Screen, LLM Jailbreaks


Perplexity AI ▷ #sharing (6 messages):

Shareable threads, Perplexity AI Search


Perplexity AI ▷ #pplx-api (1 message):

Perplexity API versioning, Breaking Changes


Interconnects (Nathan Lambert) ▷ #news (177 messages🔥🔥):

OpenAI Nonprofit Commission Guidance, Github Copilot and OpenRouter Integration, Google rents Nvidia Blackwell chips from CoreWeave, Inference Scaling and the Log-x Chart, Runway Secures $300M in Series D Funding


Interconnects (Nathan Lambert) ▷ #random (10 messages🔥):

Joanne Jang, GPT-4o Transcribe, ChatGPT 4o ImageGen Watermark


Interconnects (Nathan Lambert) ▷ #memes (40 messages🔥):

Devin 2.0, Agent-based IDEs, Windsurf vs Cursor, Claude-code API, Polars updates

Link mentioned: Tweet from Cognition (@cognition_labs): Introducing Devin 2.0: a new agent-native IDE experience. Generally available today starting at $20. 🧵👇


Interconnects (Nathan Lambert) ▷ #reads (53 messages🔥):

Distilling Reasoning Capabilities, Superhuman AI Impact Prediction, Algorithmic progress vs data progress, Dwarkesh AGI Forecast Podcast, Nvidia Open Code Reasoning Collection


Interconnects (Nathan Lambert) ▷ #expensive-queries (2 messages):

OpenAI Deep Research, Plumbing Repair Costs

Link mentioned: Tweet from Jim Bohnslav (@jbohnslav): Got a quote on a simple plumbing repair: $2,250. Ask OpenAI Deep Research for market rate: $300-$500. Ask DR for good plumbers in my area. Call the first one. Fixes it for $200. OpenAI Pro literally s...


aider (Paul Gauthier) ▷ #general (170 messages🔥🔥):

Gemini 2.5 Pro Rate Limits, Architect Mode Optimizations, Voice Command Configuration, MCPs for LSPs and Treesitter


aider (Paul Gauthier) ▷ #questions-and-tips (14 messages🔥):

Aider Shell, Openrouter Errors, Git Repo Corrupted, Aider Print Prompt Costs, Gemini Comments


aider (Paul Gauthier) ▷ #links (12 messages🔥):

Refact Polyglot Claims, Aider Polyglot Benchmark, SWE-bench evaluation, OpenAI's PaperBench


LM Studio ▷ #general (120 messages🔥🔥):

LM Studio for Brave, System Prompt in Local Server, CUDA0 Buffer Allocation Failure, Q4 vs Q6 Model Quality, Dual GPU Setup with LM Studio


LM Studio ▷ #hardware-discussion (36 messages🔥):

Unsloth 2.0 6b performance, M3 Ultra vs M4 Max for LLMs, Macs for LLM Use, Qwen QWQ Quality, GPU vs Apple Silicon Benchmarks

Link mentioned: GitHub - XiongjieDai/GPU-Benchmarks-on-LLM-Inference: Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?


OpenRouter (Alex Atallah) ▷ #announcements (49 messages🔥):

Web Search Citations in API, Quasar Alpha Stealth Model, Inference Net endpoints Disabled, Coding Optimized Models


OpenRouter (Alex Atallah) ▷ #app-showcase (1 message):

AI character platform, charactergateway.com

Link mentioned: Character Gateway: AI Character API Platform for Developers


OpenRouter (Alex Atallah) ▷ #general (99 messages🔥🔥):

Gemini 2.5 Pro, Image responses, OpenAI Responses API, Targon Speed, Anthropic Blocking


HuggingFace ▷ #general (102 messages🔥🔥):

Paid frontier models in production, vLLM/TGI Setup with RTX 5000, GPU server costs, Counterfeit detection with VLMs, Chat templates in training


HuggingFace ▷ #today-im-learning (4 messages):

Hugging Face Token Setup, Jupyter Notebook Configuration, LlamaIndex Basics


HuggingFace ▷ #i-made-this (11 messages🔥):

Object Detection Model, End-to-End Project, Operating System Events to AI, Game about AI with AI, TypeScript Voice Assistant


HuggingFace ▷ #computer-vision (2 messages):

video-to-3D human mesh reconstruction repos, OWLv2's image-guided-detection mode issue

Link mentioned: Issue with OWLv2's image-guided-detection mode. · Issue #487 · NielsRogge/Transformers-Tutorials: I have tried endless times to recreate the results from the tutorial notebook of https://github.com/NielsRogge/Transformers-Tutorials/blob/master/OWLv2/Zero_and_one_shot_object_detection_with_OWLv2...


HuggingFace ▷ #smol-course (1 message):

MLX model, Smolagent, AgentGenerationError


HuggingFace ▷ #agents-course (17 messages🔥):

Course Certification, Smart RAG agent, Gradio Version, Project goals


MCP (Glama) ▷ #general (103 messages🔥🔥):

Debugging MCPs, MCP File System Server, MCP Documentation, MCP Client Implementations, FastMCP vs Low Level


MCP (Glama) ▷ #showcase (10 messages🔥):

Enact Protocol, Shopify MCP, Mobile MCP Server, Semantic Tool Calling, External Registry Idea


Notebook LM ▷ #announcements (2 messages):

NotebookLM UX Research, Discover Sources Feature, Google AI summaries


Notebook LM ▷ #use-cases (4 messages):

Source file transferability, Podcast deep dives, Slideshow presentations


Notebook LM ▷ #general (98 messages🔥🔥):

NotebookLM 2.5 Pro, Gemini Integration with NotebookLM, Safari Access Issues, Source Transferability, Discover Sources Feature


Latent Space ▷ #ai-general-chat (70 messages🔥🔥):

Ace Computer Autopilot Launch, YourBench Open Source Benchmarking Tool, Model Context Protocol Memory Implementation, RabbitOS Intern, Llama 4 Image Generation


Latent Space ▷ #ai-announcements (4 messages):

June Ramp Up, Model Context Protocol (MCP), AI Engineer World's Fair 2025, MCP vs OpenAPI


Yannick Kilcher ▷ #general (52 messages🔥):

UX/UI Competition, AI UI Layout Generation, GPT-4o Behavior, GPT-5 Unified Model, DeepSeek Hype

Link mentioned: Delivery Web App Design


Yannick Kilcher ▷ #paper-discussion (4 messages):

LLMs struggle with math, LLMs overestimating themselves

Link mentioned: Reddit - The heart of the internet


Yannick Kilcher ▷ #ml-news (2 messages):

Gemini App, Dream 7B


GPU MODE ▷ #general (3 messages):

OpenAI /v1/chat/completions API, conversation history, /v1/responses API, stateful vs stateless APIs


GPU MODE ▷ #cuda (8 messages🔥):

cudaMemcpyAsync Overlap, cuBLAS matmul low occupancy, Registers in CUDA


GPU MODE ▷ #torch (1 message):

LLM Profiling, PyTorch Profilers, Perfetto Crashing, Trace Processor


GPU MODE ▷ #cool-links (2 messages):

AMD talk on TunableOp, NVIDIA pre-tuning in CuBLAS, NVSHMEM-based kernels for MoE models


GPU MODE ▷ #beginner (3 messages):

Activation Checkpointing, CUDA Compilation, C vs C++ in CUDA


GPU MODE ▷ #torchao (3 messages):

FP8 Training, Optimizer Configuration, Model Size Impact, torch.compile Usage, GEMM size requirements


GPU MODE ▷ #rocm (3 messages):

Code Correctness Issues, Assembly Differences


GPU MODE ▷ #thunderkittens (2 messages):

Blackwell Architecture, ThunderKittens Kernels, CTA pairs on Blackwell

Link mentioned: ThunderKittens Now on Blackwells!


GPU MODE ▷ #reasoning-gym (7 messages):

Datasets, Curricula, RGBench, Knight Swap, Puzzle2


GPU MODE ▷ #submissions (4 messages):

Grayscale Leaderboard Submissions, Modal Runners Success


Modular (Mojo 🔥) ▷ #mojo (28 messages🔥):

Quantity struct, Dimensions ** power, IntLiteral voodoo XD, normalisation, Python wrappers for Mojo


Torchtune ▷ #general (4 messages):

Checkpoint Conversions, HF Checkpoint Format, tune_to_hf function


Torchtune ▷ #dev (19 messages🔥):

vLLM memory sharing with Unsloth, GRPO Upstream contributions, Torchtune hanging with certain sequence lengths, Packed Datasets

Link mentioned: Chunked output causes timeout crash on certain seq len · Issue #2554 · pytorch/torchtune: TL;DR: If one of the dataloader batches is 49 tokens long, torchtune crashes on timeout. Longer explanation: chunked_output in transformer.py splits output into a list of 8 tensors. If output of length 49...


Torchtune ▷ #papers (4 messages):

Dream 7B, Diffusion Language Models, Huawei Noah’s Ark Lab

Link mentioned: Dream 7B | HKU NLP Group


Eleuther ▷ #general (14 messages🔥):

Diagram Creation Tools, DeTikZify, Gradient Accumulation, GitHub MCP event


Eleuther ▷ #research (1 message):

OpenThoughts-1M, OpenThinker2-32B, OpenThinker2-7B, R1-Distilled-32B, Qwen 2.5 32B


Eleuther ▷ #interpretability-general (9 messages🔥):

Combining linear probes, Steering vector composition, Contrastive sample selection


tinygrad (George Hotz) ▷ #general (15 messages🔥):

Google Mentorship, Tinygrad YoloV8 on Android, LeetGPU support for tinygrad

Link mentioned: LeetGPU


tinygrad (George Hotz) ▷ #learn-tinygrad (7 messages):

bilinear interpolation, saving latest model


LlamaIndex ▷ #blog (1 message):

CodeAct Agents, ReAct Generalization


LlamaIndex ▷ #general (20 messages🔥):

Rankify framework, Enhance Gemini API Integrations, Cursor API knowledge, otel trace_id, Re-index a file in postgres

Link mentioned: GitHub - DataScienceUIBK/Rankify: 🔥 Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation 🔥. Our toolkit integrates 40 pre-retrieved benchmark datasets and supports 7+ retrieval techniques, 24+ state-of-the-art Reranking models, and multiple RAG methods.


Nous Research AI ▷ #general (11 messages🔥):

ChatGPT 4o Magic The Gathering Cards, Runway Gen 4, Alibaba Wan 2.2


Nous Research AI ▷ #ask-about-llms (3 messages):

LLMs for extraction, Genstruct-7B, Ada-Instruct


Nous Research AI ▷ #interesting-links (1 message):

OpenAPI, SaaS, PaaS, IaaS, LLMs


Cohere ▷ #「💬」general (8 messages🔥):

Cohere Status Page, Python logging vs print statements, RAG strategy for documents

Link mentioned: Cohere Status Page Status: Latest service status for Cohere Status Page


Cohere ▷ #「💡」projects (1 message):

AI Safety Testing Platform, Bias and Harmful Outputs, AI Model Deployment Challenges

Link mentioned: Brainstorm - AI Safety Made Easy: The simple solution to AI safety testing.


Cohere ▷ #「🤝」introductions (2 messages):

KAIST student, Bias/fairness and interpretability in LLMs/VLMs, Research collaboration opportunities


Nomic.ai (GPT4All) ▷ #general (7 messages):

Nomic Embed Text V2, Vulnerability Disclosure, GPT4All-J model, Chocolatine-2-14B model, Chat Reorganization

Link mentioned: Contact Nomic Sales: Explore, analyze and build with your unstructured data


DSPy ▷ #general (5 messages):

LLM agent development, DSPy Framework, OpenAI Agents SDK, Prompt Engineering vs programming


Gorilla LLM (Berkeley Function Calling) ▷ #discussion (3 messages):

Tool evaluation, Phi-4-mini-instruct, BFCL

Link mentioned: [BFCL] add support for microsoft/Phi-4-mini-instruct by RobotSail · Pull Request #967 · ShishirPatil/gorilla: This PR introduces support for the newly-released Phi-4-mini-instruct model from Microsoft: Phi-4-mini-instruct. The results for this were initially evaluated against f81063; however, the model ha...


Codeium (Windsurf) ▷ #announcements (1 message):

DeepSeek-V3, DeepSeek-V3-0324, Windsurf AI

Link mentioned: Tweet from Windsurf (@windsurf_ai): DeepSeek-V3 has now been upgraded to DeepSeek-V3-0324. It's still free!




{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AINews, please share it with a friend! Thanks in advance!

{% endif %}