Frozen AI News archive

not much happened today

**OpenAI** launched both **Reinforcement Finetuning** and **Deep Research on GitHub repos**, drawing comparisons to **Cognition's DeepWiki**. **Nvidia** open-sourced its **Open Code Reasoning models (32B, 14B, 7B)** under the Apache 2.0 license, showing 30% better token efficiency and compatibility with llama.cpp, vLLM, transformers, and TGI. Independent evaluations show **Mistral Medium 3** rivaling **Llama 4 Maverick**, **Gemini 2.0 Flash**, and **Claude 3.7 Sonnet** in coding and math reasoning; it is priced significantly lower but is no longer open-source. **Google's Gemini 2.5 Pro** is described as Google's most intelligent model, with improved coding from simple prompts, while **Gemini 2.5 Flash** incurs a 150x cost increase over Gemini 2.0 Flash, driven by higher token usage and higher per-token pricing. The **Absolute Zero Reasoner (AZR)** achieves SOTA performance in coding and math reasoning via reinforced self-play without external data. The vision-language model **X-REASONER** is post-trained on general-domain text for reasoning. **Apple ML research** released **FastVLM** with an on-device iPhone demo. The **HiDream LoRA trainer** supports QLoRA fine-tuning under memory constraints. **Nvidia's Parakeet ASR model** tops the Hugging Face ASR leaderboard and has an MLX implementation. The new datasets **SwallowCode** and **SwallowMath** boost LLM performance in math and code. Overall, a quiet day with significant model releases and performance insights.

a quiet day.

AI News for 5/7/2025-5/8/2025. We checked 9 subreddits, 449 Twitters and 29 Discords (215 channels, and 3981 messages) for you. Estimated reading time saved (at 200wpm): 396 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

OpenAI launched both Reinforcement Finetuning and Deep Research on GitHub repos, which many are comparing to Cognition's DeepWiki.

But it is a quiet day otherwise.


AI Twitter Recap

Models, Benchmarks, and Performance

Tools and Frameworks

AI Agents and Robotics

AI Education, Research and Investment

Industry and Business

Humor/Memes


AI Reddit Recap

/r/LocalLlama Recap

1. Qwen3-30B-A3B Quantization Benchmark Comparisons

2. NVIDIA OpenCodeReasoning Nemotron Model Launches

3. Best Practices in Building Reliable LLM Workflows

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. AI Industry Leadership Changes and Predictions

2. Generative AI Agents and Their Expanding Capabilities

3. New AI Model and Tool Announcements


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.5 Pro Exp

Theme 1: Model Mania: Performance Peaks, Puzzling Personalities, and Popularity Contests

Theme 2: Tooling Upgrades & User Experiences: New Features, Frustrations, and Fixes

Theme 3: Hardware & Kernels: GPU Optimizations, Benchmarks, and Low-Level Crafting

Theme 4: API Antics: New Endpoints, Costly Calls, and Integration Quirks

Theme 5: Advanced Techniques, Research Frontiers, and Community Buzz


Discord: High level Discord summaries

Unsloth AI (Daniel Han) Discord


LMArena Discord


Perplexity AI Discord


Cursor Community Discord


OpenAI Discord


OpenRouter (Alex Atallah) Discord


aider (Paul Gauthier) Discord


GPU MODE Discord


LM Studio Discord


Manus.im Discord


HuggingFace Discord


MCP (Glama) Discord


Nous Research AI Discord


Yannick Kilcher Discord


Eleuther Discord


Notebook LM Discord


Latent Space Discord


Modular (Mojo 🔥) Discord


DSPy Discord


Cohere Discord


Torchtune Discord


LlamaIndex Discord


tinygrad (George Hotz) Discord


LLM Agents (Berkeley MOOC) Discord


Codeium (Windsurf) Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Nomic.ai (GPT4All) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


Discord: Detailed by-Channel summaries and links

Unsloth AI (Daniel Han) ▷ #general (554 messages🔥🔥🔥):

Qwen3-14B, Mistral vs Gemma vs Phi-4, AMD GPU, Model quantization


Unsloth AI (Daniel Han) ▷ #off-topic (22 messages🔥):

AI Project Hiring, LLM for text punctuation, LLM Recommendations, Qwen vs Gemma3 Model, IBM Granite 4.0 Mamba Model


Unsloth AI (Daniel Han) ▷ #help (92 messages🔥🔥):

phi4-mini-instruct training issues, Qwen3 model compatibility with vLLM, Kaggle notebook using multiple GPUs, Tokenizer configuration differences, Qwen3 model not thinking


Unsloth AI (Daniel Han) ▷ #research (24 messages🔥):

Gemma3 27b hooking input/output layers, Process Reward Model (PRM) training challenges, Finetuning Audio Understanding Models, DeepSeek-R1 vs other reasoning models for COT reasoning


LMArena ▷ #general (642 messages🔥🔥🔥):

Grok 3.5 release, Grok 3.5 never coming, EMBERWING model, LLM and Politics, Gemini 2.5 Pro nerf


Perplexity AI ▷ #announcements (1 message):

Perplexity AI, Reddit AMA, Deep Research, Live Q&A


Perplexity AI ▷ #general (568 messages🔥🔥🔥):

Stripe Customer Login, Attachment Support, Code Copy Button, Continuing Code, Gemini 2.5 Pro vs Claude


Perplexity AI ▷ #pplx-api (5 messages):

Sonar API response, Perplexity API


Cursor Community ▷ #general (415 messages🔥🔥🔥):

Cursor Pro Fast Prompts, MCPs not being called, Gemini model quality, Student discount problems, Discord community value


OpenAI ▷ #ai-discussions (141 messages🔥🔥):

GPT-4o Personality, Gemini vs GPT, Grok 3.5, OpenAI's Image Generator API Cost, AI Model Benchmarks


OpenAI ▷ #gpt-4-discussions (8 messages🔥):

Placebo Upvote Buttons, Discord Bot Stagnation


OpenAI ▷ #prompt-engineering (59 messages🔥🔥):

Custom GPT Creation, HyperTree prompting, Trihydrogen, Atomic Theory Book


OpenAI ▷ #api-discussions (59 messages🔥🔥):

Custom GPT creation tips, atomic theory book using ChatGPT features, HyperTree planning prompting, Trihydrogen existence, Arc Encoding Shapes


OpenRouter (Alex Atallah) ▷ #announcements (5 messages):

Activity Export Feature, CSV Export, Data Truncation Request


OpenRouter (Alex Atallah) ▷ #app-showcase (2 messages):

local proxy to fwd requests to openrouter, completions extend out of the mouse cursor


OpenRouter (Alex Atallah) ▷ #general (260 messages🔥🔥):

OlympicCoder 32B Availability, OpenRouter API Cost Retrieval, OpenRouter API Outage, OpenRouter Image Prompt Support, Gemini Free Version on OpenRouter


aider (Paul Gauthier) ▷ #general (149 messages🔥🔥):

Gemini 2.5 Pro Exp, Copilot Proxy, Aider web search, Aider use mcpm-proxy, Gemini models


aider (Paul Gauthier) ▷ #questions-and-tips (36 messages🔥):

Claude CLI vs Aider cost, Aider with web search, Perplexity API with Aider, aider-desk with search MCP, Aider repomaps


GPU MODE ▷ #general (1 message):

tilelang, DSL for GPU/CPU kernels


GPU MODE ▷ #triton (17 messages🔥):

Atomic addition and non-determinism, fp16 vs bf16 sensitivity, Triton kernel helper function


GPU MODE ▷ #cuda (12 messages🔥):

GMEM tensor data copy to SMEM, Decltype errors with make_tensor, Vast.ai data security, Project algorithms use same data from text file


GPU MODE ▷ #torch (1 message):

Torch Compile Overhead, Kernel Fusion Benchmarking, A100 Performance Tuning


GPU MODE ▷ #announcements (1 message):

New Working Group, Agentic Systems Optimization, Open Eval Task


GPU MODE ▷ #beginner (19 messages🔥):

Tiled Reduction Auto-tuning, PyTorch Internals Guide, Mojo vs CUDA for AI Compute


GPU MODE ▷ #torchao (2 messages):

Release Date for 0.11, New Features in 0.11


GPU MODE ▷ #off-topic (2 messages):

Speed of light in fiber, Networking Distance, Chip performance


GPU MODE ▷ #irl-meetup (1 message):

random.oof: Anyone at the vllm meet up in nyc?


GPU MODE ▷ #rocm (1 message):

Tilelang, Docker container support, Nightly Iterations


GPU MODE ▷ #liger-kernel (1 message):

chiwanpark: I've sent a PR for Qwen 3 MoE models. https://github.com/linkedin/Liger-Kernel/pull/706


GPU MODE ▷ #self-promotion (2 messages):

PTX MMA Programming, NVIDIA Tensor Cores, Float8 Datatype, SASS Machine Code, H100 QMMA vs QGMMA


GPU MODE ▷ #submissions (54 messages🔥):

MI300, amd-fp8-mm, amd-mixture-of-experts, leaderboard submissions


GPU MODE ▷ #factorio-learning-env (45 messages🔥):

Steam Cloud Reinstallation, FLE Agent Integration, Docker File Issue, PR Import Bugs, Factorio Performance Issues


GPU MODE ▷ #amd-competition (6 messages):

MOE Leaderboard CLI, CLI Mean Time Output, GPU Access Heuristic


GPU MODE ▷ #cutlass (7 messages):

CUTLASS DistributedGEMM integration, Compact GMEM layout, TMA Load with packed layout


GPU MODE ▷ #mojo (2 messages):

Modular GPU Kernel Hackathon, AGI House, Dylan Patel


LM Studio ▷ #general (110 messages🔥🔥):

AnythingLLM with LM Studio Errors, CORS enabling, Rewriting SQL database code to pure graph, Gemini changing code, Qwen vs Gemini


LM Studio ▷ #hardware-discussion (31 messages🔥):

AMD 3D V-Cache benchmark, Mac Studio M2 Ultra, Intel Data Center GPU Max, swappa.com, AMD D700


Manus.im Discord ▷ #general (133 messages🔥🔥):

Cringe definition, Manus launch date, Manus credit costs, AI tools for scraping businesses on Google Maps, Manus LLM source


HuggingFace ▷ #general (57 messages🔥🔥):

GSoC, HF dev environment, AI agent course, Face detection model in Inference API, Cleaning HF repo


HuggingFace ▷ #i-made-this (11 messages🔥):

ACE-STEP SOTA, Alpha-Root, Entropy engine tests, AI Billing Dashboard, UQLM


HuggingFace ▷ #computer-vision (4 messages):

FlashAttention, OCR for Newspaper Data


HuggingFace ▷ #NLP (2 messages):

Dropwise module release, Emotion classification model questions, Token max length understanding, Production deployment of HF models


HuggingFace ▷ #agents-course (18 messages🔥):

Agent Testing File, Final Project Metadata, LlamaIndex Framework vs smolagents, RAG Cheating, API request limits


MCP (Glama) ▷ #general (56 messages🔥🔥):

Claude Plotly Charts, MCP Max Tokens, LLM Restrictions, Remote MCP Servers on Cloudflare, Java MCP Server Custom Args


MCP (Glama) ▷ #showcase (33 messages🔥):

MCP Client for STDIO, OpenLink Software AI Layer (OPAL), MCP Holster, AiraHub2, Sampling in MCP


Nous Research AI ▷ #general (58 messages🔥🔥):

Deepmind RL Robots vs China RL Robots, Linux Laptop vs Apple Macbook, Llama 4 disappoints, Automatic chat-moderation system blocks emojis


Nous Research AI ▷ #research-papers (1 message):

ifeq: I gotta learn mandarin


Nous Research AI ▷ #interesting-links (5 messages):

Entropy Engine, Quantum-Native Randomness, LLM Sensitivity to Randomness, Importance of Randomness for AGI


Yannick Kilcher ▷ #general (35 messages🔥):

Grok's apprehension of reality, Cloudflare serving fake content to agents, Third party filters for LLM output, Personal access to university resources via AI, KL Divergence Minimization


Yannick Kilcher ▷ #paper-discussion (7 messages):

Paper Presentations, Causality, CVPR, Proper Investiture, Daily Paper Discussion


Yannick Kilcher ▷ #ml-news (14 messages🔥):

Zed compilation on Windows, Biological brains vs backpropagation, LLM beats Factorio == ASI?


Eleuther ▷ #general (41 messages🔥):

Cursor Advertising, Slurm Memory Requests, Job Posting Channel, Linguistics Channel, Cursor primary IDE Correlation


Eleuther ▷ #research (7 messages):

MTurk vs. Prolific, RWKV's token shift


Eleuther ▷ #interpretability-general (2 messages):

The Pizza and the Clock


Eleuther ▷ #lm-thunderdome (3 messages):

LocalCompletionsAPI, loglikelihood tasks, bos token, HF model generation_config settings


Notebook LM ▷ #announcements (1 message):

NotebookLM, Mobile App, Trusted Tester Program


Notebook LM ▷ #use-cases (9 messages🔥):

NotebookLM PDF Processing, NotebookLM Knowledge Base for Sales, Audio length limitations


Notebook LM ▷ #general (22 messages🔥):

NotebookLM failing to answer questions, Video Uploads, Audio Overview Functionality, Podcast Length, AI 'Humanic' Behavior


Latent Space ▷ #ai-general-chat (22 messages🔥):

X-Ware, Netflix Recommendation Model, Gemini Image Generation, aider postmortems, Suno Music


Latent Space ▷ #ai-announcements (2 messages):

Claude code pod, AI Engineer conference, Early Bird Tickets, AI Engineer conference speakers


Modular (Mojo 🔥) ▷ #general (15 messages🔥):

Fields in traits vs properties, Modular Hackathon at AGI House, Hardware Agnostic ML Systems Survey Paper, Zotero and bibtex for citations


Modular (Mojo 🔥) ▷ #mojo (4 messages):

Mojo roadmap, GPU programming puzzles, Colab Integration, New requires keyword


DSPy ▷ #general (13 messages🔥):

Collab and partnership, ReAct module signature, DSPy Caching Mechanism, RL experiment with GRPO on a Qwen 1.7B


Cohere ▷ #💬-general (7 messages):

Cohere Embedding Model, Cohere Rerank Model, Cohere Embed 4


Cohere ▷ #💡-projects (1 message):

AI Cost Tracking, Multi-Platform AI Service Management, AI Expense Justification, AI Tool Frustrations


Cohere ▷ #🤝-introductions (3 messages):

Collaborations, Introductions


Cohere ▷ #🟢-status-updates (1 message):

Embedding Models Degraded, embed-english-v2.0, embed-english-v3.0


Cohere ▷ #🎯-private-deployments (1 message):

GPU Requirements, On-Premise Deployment of Command A


Torchtune ▷ #general (5 messages):

Tokenizer Automation, HuggingFaceBaseTokenizer Limitations, Custom Autotokenizer, ModelTokenizer Wrapper


Torchtune ▷ #dev (8 messages🔥):

Cosine Scheduler with Warmup, PyTorch NaN bug with compiled Adam, Torchtune's get_cosine_schedule_with_warmup function, Torchtitan LR Scheduler Implementation, LR Warmup scaling


LlamaIndex ▷ #blog (3 messages):

Anthropic API web search tool, LlamaParse improvements, VoyageAI multi-modal embeddings and MongoDB indexes


LlamaIndex ▷ #general (4 messages):

Medical LLM Bot, Fine-tuning vdr-2b-multi-v1 with math formulas, Writer's Palmyra X5 and X4 in Bedrock


tinygrad (George Hotz) ▷ #general (4 messages):

tinygrad CUDA, tinygrad IR, tinygrad docs, tinygrad uops


tinygrad (George Hotz) ▷ #learn-tinygrad (3 messages):

CACHEDB environment variable


LLM Agents (Berkeley MOOC) ▷ #hackathon-announcements (1 message):

AgentX Workshop, Lambda Inference API, Agentic AI


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (4 messages):

HF Credits, Course Content, MOOC Iterations


LLM Agents (Berkeley MOOC) ▷ #mooc-lecture-discussion (2 messages):

AI Engineer Courses, LLM Agents MOOC


Codeium (Windsurf) ▷ #announcements (1 message):

JetBrains Plugin Updates, Windsurf Editor UX Improvements, Wave 8 Release