Frozen AI News archive

GPT 4.1: The New OpenAI Workhorse

**OpenAI** released **GPT-4.1**, along with **GPT-4.1 mini** and **GPT-4.1 nano**, highlighting improvements in **coding**, **instruction following**, and **long-context** handling of up to **1 million tokens**. The model scores **54 on SWE-bench Verified** and shows a **60% improvement over GPT-4o** on internal benchmarks. Pricing for **GPT-4.1 nano** is notably low at **$0.10/1M input** tokens and **$0.40/1M output** tokens. **GPT-4.5 Preview** is being deprecated in favor of **GPT-4.1**. **LlamaIndex** offered day-0 integration support. Some negative feedback was noted for **GPT-4.1 nano**. Additionally, **Perplexity's Sonar API** ties with **Gemini-2.5 Pro** for the top spot in the LM Search Arena leaderboard. New benchmarks like **MRCR** and **GraphWalks** were introduced alongside updated prompting guides and cookbooks.
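To put the quoted GPT-4.1 nano pricing in perspective, here is a minimal sketch of the per-request arithmetic. The function name and the example workload are hypothetical, chosen only for illustration; the two rates are the ones quoted above, and actual billing may differ (for example, through caching or batch discounts).

```python
# Rates quoted in the summary above, in USD per 1 million tokens.
NANO_INPUT_USD_PER_M = 0.10
NANO_OUTPUT_USD_PER_M = 0.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the quoted nano rates."""
    total = (
        input_tokens * NANO_INPUT_USD_PER_M
        + output_tokens * NANO_OUTPUT_USD_PER_M
    )
    return total / 1_000_000


# Hypothetical workload: a full 1M-token prompt with a 2,000-token reply.
cost = estimate_cost_usd(1_000_000, 2_000)
print(f"Estimated cost: ${cost:.4f}")
```

At these rates, filling the entire 1M-token context costs about ten cents of input, so for long-context requests the input side is likely to dominate the bill.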


AI News for 4/11/2025-4/14/2025. We checked 7 subreddits, 433 Twitters and 29 Discords (211 channels, and 16961 messages) for you. Estimated reading time saved (at 200wpm): 1382 minutes. You can now tag @smol_ai for AINews discussions!

GPT 4.1 links:

and a new interview published on Latent Space:

https://youtu.be/y__VY7I0dzU


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

GPT-4.1 Release and Performance

Model Benchmarks and Comparisons

Robotics and Embodied AI

AI Research and Papers

Other Model and AI Tool Releases

AI Infrastructure and Tooling

AI Strategy and Discussion

Humor and Miscellaneous


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. "Exciting Advancements in GLM-4 Reinforcement Learning Models"

Theme 2. "DeepSeek's Open-Source Contributions to AI Inference"

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding

Theme 1. "Revolutionizing Science: OpenAI's New Reasoning Models"

Theme 2. "Exciting AI Model Innovations and Competitive Updates"


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.0 Flash Thinking

Theme 1. GPT-4.1 Models: Release, Performance, and Availability

Theme 2. Gemini 2.5 Pro: Performance Swings and Pricing Shifts

Theme 3. Open Source Models and Tools Gain Momentum

Theme 4. Hardware Optimization and CUDA Deep Dives

Theme 5. Agent Development and Tooling Ecosystem Evolves


PART 1: High level Discord summaries

Perplexity AI Discord


LMArena Discord


aider (Paul Gauthier) Discord


OpenRouter (Alex Atallah) Discord


Manus.im Discord Discord


Unsloth AI (Daniel Han) Discord


OpenAI Discord


Cursor Community Discord


LM Studio Discord


Yannic Kilcher Discord


HuggingFace Discord


Eleuther Discord


Latent Space Discord


Notebook LM Discord


Nous Research AI Discord


MCP (Glama) Discord


GPU MODE Discord


Nomic.ai (GPT4All) Discord


Modular (Mojo 🔥) Discord


LlamaIndex Discord


tinygrad (George Hotz) Discord


Torchtune Discord


Cohere Discord


LLM Agents (Berkeley MOOC) Discord


DSPy Discord


MLOps @Chipro Discord


Codeium (Windsurf) Discord


Gorilla LLM (Berkeley Function Calling) Discord


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Perplexity AI ▷ #announcements (2 messages):

Android Draw to Search, Champions League on Perplexity, Voice Search, Box and Dropbox Connectors, Perplexity Finance Time Comparison


Perplexity AI ▷ #general (1237 messages🔥🔥🔥):

Fake Play Button, Wizard reminds me of StableDiffusion, Automate Dating, What Model to Pick


Perplexity AI ▷ #sharing (7 messages):

Prompt Engineering, Death and Taxes, Tourist blowing, Whatsapp Priorities


Perplexity AI ▷ #pplx-api (5 messages):

Perplexity Livestream Recording, ComfyUI Integration for Perplexity, Perplexity API Social Toggle


LMArena ▷ #general (1347 messages🔥🔥🔥):

Gemini 2.5 Pro Nerfed, Windsurf AI, RooCode coding IDE, GPT-4.1 Analysis, Nightwhisper vs Dragontail


aider (Paul Gauthier) ▷ #announcements (3 messages):

Aider v0.82.0 Release, GPT 4.1 support, Architect mode with Gemini, Fireworks AI model deepseek-v3-0324, OpenRouter Alpha endpoints retirement


aider (Paul Gauthier) ▷ #general (1113 messages🔥🔥🔥):

Off-topic channel debate, Air filter discussion, GPT-4.1 and Aider, Gemini 2.5 vs Claude 3.7, MCP implementation


aider (Paul Gauthier) ▷ #questions-and-tips (107 messages🔥🔥):

unintuitive restore chat history, Basic Authentication Header using OpenAI compatible API, GPTs Agent, Model Merging, Open Empathic


aider (Paul Gauthier) ▷ #links (6 messages):

Prompt Engineering, Aider Efficiency, GPT-4.1 Predictions, Prompting Guide


OpenRouter (Alex Atallah) ▷ #announcements (65 messages🔥🔥):

Gemini Pricing Update, OpenRouter Free Models, GPT-4.1 Models, Stealth Model Reveal


OpenRouter (Alex Atallah) ▷ #general (910 messages🔥🔥🔥):

GPT-4.1, Gemini 2.5, Optimus-Alpha, DeepSeek, Rate Limits


Manus.im Discord ▷ #showcase (3 messages):

PDF to Website Transfer, Learning Website Creation


Manus.im Discord ▷ #general (1020 messages🔥🔥🔥):

DeepSeekV3 vs Manus, Bionic CyberSecurity, Firebase or GCP, Gemini 2.5 Pro, Open Source AI


Unsloth AI (Daniel Han) ▷ #general (820 messages🔥🔥🔥):

Gemma 4B vs 1B for GRPO, Unsloth AMD support, Triton rewrite of transformer, Lightning AI vs Notebooks, GPT-4.1 minor improvements


Unsloth AI (Daniel Han) ▷ #off-topic (113 messages🔥🔥):

Gemma 3 27b Memory Layers, LM2 Memory Units, Hardware Requirements, Frontend Development, Code Extraction from AI Tools


Unsloth AI (Daniel Han) ▷ #help (185 messages🔥🔥):

PCIe Slot type effect on Training Performance, Orpheus TTS Finetuning, Runpod Sync with Unsloth, Gemma-3-1b-it fine-tuning with GRPO and Unsloth, Llama 4 Scout model inference with 4-bit quantization


Unsloth AI (Daniel Han) ▷ #showcase (5 messages):

Qwen 3B, GRPO, Multi-turn, Tool Calling, Code Execution


Unsloth AI (Daniel Han) ▷ #research (17 messages🔥):

LLM Compression, Higgs vs exl3, Data Centers Access to Models, Apple's Cut Cross Entropy


OpenAI ▷ #annnouncements (2 messages):

GPT-4.1 API, OpenAI Livestream


OpenAI ▷ #ai-discussions (642 messages🔥🔥🔥):

Veo 2, Sora, Gemini, OpenAI Guardrails, GPT-4o Empathetic


OpenAI ▷ #gpt-4-discussions (40 messages🔥):

OpenAI Memory FAQ, Synthetic Cognition Engine, Comprehensive Chat Summarization Prompt, GPT Image Generation Issues, Custom GPTs and External APIs


OpenAI ▷ #prompt-engineering (22 messages🔥):

Image Generation Smudging, Font Control in Image Generation, Sora Camera Control, JSON Schema Date Manipulation, NSFW Content Generation


OpenAI ▷ #api-discussions (22 messages🔥):

Image generation smudge removal, Sora camera control, Date of birth JSON schema, Font control in image generation, NSFW language from the model


Cursor Community ▷ #general (696 messages🔥🔥🔥):

OpenAI model release, DeepSeek Logic/Math, Claude 3, Thinking models, Cursor Context window


LM Studio ▷ #general (276 messages🔥🔥):

Speculative Decoding, lmstudio-js & LangChain, Gemma 3 Models uncensored


LM Studio ▷ #hardware-discussion (242 messages🔥🔥):

Threadripper vs Xeon, DDR5 RAM Impact, GPU Offloading, ROCm vs CUDA, KV Cache Quantization


Yannic Kilcher ▷ #general (478 messages🔥🔥🔥):

Probabilistic Finite-State Automata (FSA), Scaling Limitations/Obstacles, RL-based approaches, Training GPTs Agent, User Interface Changes on Platform


Yannic Kilcher ▷ #paper-discussion (2 messages):

Hugging Face Ultra-Scaling Playbook Review


Yannic Kilcher ▷ #agents (10 messages🔥):

Web Search Agent, Open Source Scraping, Vertex AI Agent Builder, Brave Search API, SwissKnife


Yannic Kilcher ▷ #ml-news (23 messages🔥):

OpenAI recruiting video, Solomonoff's theory, Gen AI Use Case Report, Character.AI user base, GPT-4.5 being a talking model


HuggingFace ▷ #announcements (1 message):

Llama 4 Maverick and Scout, SmolVLM, Diffusers 0.33.0, AI Agents Sustainability, Arabic Leaderboards


HuggingFace ▷ #general (360 messages🔥🔥):

Robotics Simulation Roadmap, LibreChat Duplication Issues, Ollama Syllabi Tool for Curriculum Generation, Parquet Files to Hugging Face, MLX Eagle Speculative Decoding


HuggingFace ▷ #today-im-learning (37 messages🔥):

Ollama Agent Roll Cage, ML Guidance, Implementation from Scratch, KrishNaik and freecodecamp, Deep Learning Specialization


HuggingFace ▷ #cool-finds (2 messages):

TLDR Service


HuggingFace ▷ #i-made-this (20 messages🔥):

Universal Intelligence protocols released, Speaker Isolation Toolkit, MLX EAGLE-2 Speculative Decoding, gpu-spaces script, SwissKnife request for comment


HuggingFace ▷ #reading-group (2 messages):

Society of Minds framework


HuggingFace ▷ #computer-vision (2 messages):

rf-detr-uslsohoy, CV Hangout


HuggingFace ▷ #NLP (2 messages):

Local LLM Models, OlympicCoder 7B and 32B, DeepSeek R1 Distill Qwen 7B, DeepCoder 14B Preview, Fine-tuning Facebook's NLLB translation model


HuggingFace ▷ #smol-course (7 messages):

Agent course deadline, Agent course use cases


HuggingFace ▷ #agents-course (41 messages🔥):

Course Certification, HF API Limit Issues, Ollama Setup and Usage, LLM Fine-Tuning for Agents, SmolAgent's Pope Obsession


Eleuther ▷ #general (54 messages🔥):

Model Similarity Analysis, Dataloader batching strategies, Multiple token prediction, Input-dependent LoRAs, Stereoisomer encoding for chemical features


Eleuther ▷ #research (236 messages🔥🔥):

arXiv endorsement request, AI-generated content in research, Token loss and length extrapolation, Visual autoregressive models for microscopy images, Policy enforcement


Eleuther ▷ #interpretability-general (14 messages🔥):

Graph attribution mechanistic interpretability, Distillation effects on model circuits, Models' knowledge of their circuits, Reasoning model self-awareness, CoT fidelity in reasoning models


Latent Space ▷ #ai-general-chat (99 messages🔥🔥):

Karpathy asks ChatGPT embarrassing questions, Thinking Machines $2B round, OpenAI SWE coming, GPT 4.1 Quasar launch, DeepSeek Inference Engine Open Source


Latent Space ▷ #ai-announcements (2 messages):

Quasar launch, SFCompute pod


Latent Space ▷ #llm-paper-club-west (5 messages):

X-Ware.v0, AI news source


Latent Space ▷ #ai-in-action-club (186 messages🔥🔥):

Agent Definitions, Langsmith Tool, Visibility into training process, Model Benchmarking Tools, GPT-4.1 launch


Notebook LM ▷ #use-cases (18 messages🔥):

Non-deterministic NLM, NLM in education, Gemini Education Workspace, Conversational interface, NotebookLM for University


Notebook LM ▷ #general (141 messages🔥🔥):

NotebookLM for students, Google Agents and NotebookLM, Notebook search function, Discover feature in NotebookLM, Deep research problems in Gemini


Nous Research AI ▷ #general (109 messages🔥🔥):

Llama 4 Maverick & Scout, DeepCoder Model, Nvidia UltraLong Models, GPT-4.1 Pricing & Performance, Gemini 2.5 Pro


Nous Research AI ▷ #ask-about-llms (15 messages🔥):

Loss Observations on H100 Llama 4 Scout, Small Model Training Challenges, Dataset Recommendations for Small Models, Surya and SmolVLM2


Nous Research AI ▷ #interesting-links (1 message):

ee.dd: https://ai-2027.com


Nous Research AI ▷ #reasoning-tasks (14 messages🔥):

Research Paper on Repo, Task Quality Assurance


MCP (Glama) ▷ #general (99 messages🔥🔥):

MCP for Reddit and Quora, Paid bounty for MCP server setup, ADK and A2A vs MCP, Exposing tools to the user, Passing tools to the LLM


MCP (Glama) ▷ #showcase (36 messages🔥):

Models without Function Calling, MCP Bug Spotting Tools, Paprika Recipe MCP Server, Oterm Release and MCP Sampling, AutoMCP for Agent Deployment


GPU MODE ▷ #general (18 messages🔥):

CUDA in Python/PyTorch, AMD GPU Mode Competition, GTC talk by marksaroufim, Stephen Jones videos, channel owner


GPU MODE ▷ #triton (5 messages):

Morton Order vs Swizzle2D, Space-Filling Curves, Hilbert Curves vs Morton Ordering, Debugging Triton Memory Leaks, Implementing Triton Kernel


GPU MODE ▷ #cuda (8 messages🔥):

Dynamic KV cache tensors in CUDA, cuBLAS Batched GEMM, memcpy_async cooperative API, Async copies and uncoalesced global access, Shared memory alignment


GPU MODE ▷ #torch (6 messages):

Memory profiling distributed training, ATen attention.cu, TorchScript JIT CUDA optimizations, ZeRO Stage 3 PyTorch Lightning tutorial


GPU MODE ▷ #algorithms (1 message):

RMSNorm vs L2 Norm, Llama Norm, Scout Embeddings


GPU MODE ▷ #beginner (18 messages🔥):

CUDA events, Maxwell tuning guide, shared memory, PTX and SASS, LOP3.LUT


GPU MODE ▷ #torchao (3 messages):

QLoRA Training, 4bit quantization, QAT for all layers in a model


GPU MODE ▷ #off-topic (1 message):

iron_bound: https://core-math.gitlabpages.inria.fr/


GPU MODE ▷ #rocm (8 messages🔥):

AMD GPUs, Cloud providers, Profiling, vast.ai, shadeform


GPU MODE ▷ #metal (7 messages):

Profiling Metal Kernels, Naive vs. Coalesced Matrix Multiplication, Memory Usage Differences, Unified Memory and Paging on M-Series Chips


GPU MODE ▷ #self-promotion (11 messages🔥):

OptiLLM inference proxy, Fast Prefix Sum, Thread Coarsening, SwissKnife webgpu graphrag


GPU MODE ▷ #🍿 (2 messages):

LLM for Kernel Code Generation, RL for Kernel Optimization


GPU MODE ▷ #submissions (7 messages):

vectoradd, grayscale, Modal runners


GPU MODE ▷ #ppc (1 message):

eriks.0595: <@349565795711451146> we've updated the grader, can you let me know if this is fixed?


GPU MODE ▷ #feature-requests-and-bugs (6 messages):

Python vs CUDA Submissions, Auto-Wrapping CUDA Files, Profiling Tools, QoL Changes


GPU MODE ▷ #amd-competition (8 messages🔥):

Challenge registration, Discord ID submission, Confirmation email after registration


Nomic.ai (GPT4All) ▷ #general (67 messages🔥🔥):

Nomic Embeddings, GPT4All Max Tokens, HuggingFace story models, Chat Templates, Context Length


Modular (Mojo 🔥) ▷ #general (16 messages🔥):

Mojo ownership vs Rust, Origins vs Lifetimes, VSCode extension issues, Mojmelo module, closures


Modular (Mojo 🔥) ▷ #mojo (32 messages🔥):

PythonObject Literal, MLIR in Mojo, Mojo Proposals, Negative Bounds


LlamaIndex ▷ #blog (5 messages):

Llama4 Deep Research, Equity Research Agent, GPT-4.1 API, Agent Benchmarks


LlamaIndex ▷ #general (31 messages🔥):

LlamaParse vs SimpleDirectoryReader, Files in Index vs External File Sources, Open Source LLMs for Agent Workflow, Django Application hangs when calling LlamaParser with Celery, Voice Agents Support


LlamaIndex ▷ #ai-discussion (5 messages):

.query has no history, LlamaParse Layout Agent Mode, Benchmarking AI evaluation models


tinygrad (George Hotz) ▷ #general (8 messages🔥):

NVIDIA Video Codec SDK, Direct Programming, Meeting #66 Topics, Index Validation PR


tinygrad (George Hotz) ▷ #learn-tinygrad (26 messages🔥):

clang flags, tinygrad notes, debugging NaNs, small bounty


Torchtune ▷ #general (16 messages🔥):

Custom TorchTune model in vLLM, HF model, Custom model architecture in vLLM, Torchtune generate script


Torchtune ▷ #dev (8 messages🔥):

bitsandbytes installation errors, macOS installation issues, unit tests on macOS, FSDP import error, platform specific requirements


Torchtune ▷ #papers (2 messages):

QLoRA, Quantization, Sub-4-Bit Quantization


Torchtune ▷ #rl (5 messages):

Reward Function Design, Loss Function Variety, Inference Provider Flexibility, Resource Allocation, TRL Success Logging


Cohere ▷ #「💬」general (9 messages🔥):

Coral Chat in Firefox, LLM Token Generation Issues


Cohere ▷ #「🔌」api-discussions (4 messages):

Cohere Chat API, Java Demo Code, command-a-03-2025 model


Cohere ▷ #「💡」projects (1 message):

Diofanti.org, Aya model, Government spending transparency


Cohere ▷ #「🤝」introductions (3 messages):

LUWA.app, AI for Science Community


LLM Agents (Berkeley MOOC) ▷ #hackathon-announcements (1 message):

Lambda, HuggingFace, Groq, Mistral AI, Google AI Studio


LLM Agents (Berkeley MOOC) ▷ #mooc-announcements (1 message):

Sean Welleck, LeanHammer, AI proof development, formal reasoning


LLM Agents (Berkeley MOOC) ▷ #mooc-lecture-discussion (2 messages):

Lecture Schedule, Email Notifications


DSPy ▷ #general (2 messages):

AI Agent Developer, DSPy Modules


MLOps @Chipro ▷ #events (1 message):

MCP, AWS, Model Context Protocol, Simba Khadder


MLOps @Chipro ▷ #general-ml (1 message):

basit5750: I already have it dm me for Source Code


Codeium (Windsurf) ▷ #announcements (2 messages):

GPT-4.1, Free Usage, Discounted Rate, New Default Model, Limited-Time Opportunity


Gorilla LLM (Berkeley Function Calling) ▷ #discussion (1 message):

Multi-turn composite column removal, Dataset composition discrepancy


{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AINews, please share it with a friend! Thanks in advance!

{% endif %}