not much happened today

**Gemini 2.5 Flash** shows a **12-point increase** in the Artificial Analysis Intelligence Index but costs **150x more** than Gemini 2.0 Flash due to **9x more expensive output tokens** and **17x higher token usage** during reasoning. **Mistral Medium 3** competes with **Llama 4 Maverick**, **Gemini 2.0 Flash**, and **Claude 3.7 Sonnet** with better coding and math reasoning at a significantly lower price. **Alibaba's Qwen3** family supports reasoning and multilingual tasks across **119 languages** and includes a **Web Dev** tool for app building. **Huawei's Pangu Ultra MoE** matches **DeepSeek R1** performance on Ascend NPUs, with new compute and upcoming V4 training. **OpenAI's o4-mini** now supports **Reinforcement Fine-Tuning (RFT)** using chain-of-thought reasoning. **Microsoft's X-REASONER** enables generalizable reasoning across modalities after post-training on general-domain text. ChatGPT's deep research integration with GitHub repos enhances codebase search and reporting. The AI Engineer World's Fair is offering an Early Bird discount on tickets.
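
As a rough sanity check on the Gemini 2.5 Flash figures above, the two quoted factors multiply out to about the quoted 150x. This is a minimal sketch. It assumes total cost is simply the output-token price times the number of tokens used, and the variable names are illustrative rather than taken from Artificial Analysis.

```python
# Illustrative check of the cost multiplier quoted in the summary.
# Assumption: total cost scales as (price per output token) x (tokens used),
# so the overall ratio is the product of the two quoted ratios.
output_price_ratio = 9   # output tokens are 9x more expensive (from the summary)
token_usage_ratio = 17   # 17x higher token usage during reasoning (from the summary)

cost_ratio = output_price_ratio * token_usage_ratio
print(f"Estimated cost ratio: {cost_ratio}x")  # 153x, roughly the quoted 150x
```

The product comes to 153x, which the summary rounds to 150x.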

a quiet day.

AI News for 5/8/2025-5/9/2025. We checked 9 subreddits, 449 Twitters and 29 Discords (215 channels and 4687 messages) for you. Estimated reading time saved (at 200wpm): 486 minutes. Our new website is now up with full metadata search and a beautiful vibe-coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

It's a pretty quiet weekend, so we'll plug our AI Engineer World's Fair writeup — last chance for the Early Bird discount for those who haven't yet got tickets!


AI Twitter Recap

Large Language Models (LLMs) and Model Performance

AI Applications and Tools

AI Safety and Alignment

People and Companies

General AI Discussions and Insights

Humor/Memes


AI Reddit Recap

/r/LocalLlama Recap

1. Advanced Local LLM Inference Optimization Tips

2. Local and Open LLMs for Web Development and Accessibility

3. Upcoming OpenAI Open-Source Model Announcements

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. AI-Generated Content Trends on Reddit

2. New AI Models, Benchmarks, and Open-Source Releases

3. Advances and Industry Movement in Robotics and Embodied AI


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.5 Pro Exp

Theme 1: The Bleeding Edge of LLMs: Performance Rollercoasters, Bug Hunts, and Emerging Capabilities

Theme 2: Fine-Tuning Follies & Framework Fixes: Navigating the Developer Toolchain

Theme 3: GPU Jockeys & Hardware Hustles: Squeezing Every FLOP for AI

Theme 4: API Acrobatics & Integration Ills: Making Models Play Nice

Theme 5: Multimodal Marvels & Output Oddities: Beyond Just Text


Discord: High level Discord summaries

LMArena Discord


Perplexity AI Discord


Manus.im Discord


Unsloth AI (Daniel Han) Discord


Yannic Kilcher Discord


Notebook LM Discord


Cursor Community Discord


GPU MODE Discord


LM Studio Discord


aider (Paul Gauthier) Discord


OpenAI Discord


OpenRouter (Alex Atallah) Discord


HuggingFace Discord


MCP (Glama) Discord


Modular (Mojo 🔥) Discord


Nous Research AI Discord


Torchtune Discord


Cohere Discord


Nomic.ai (GPT4All) Discord


LlamaIndex Discord


tinygrad (George Hotz) Discord


LLM Agents (Berkeley MOOC) Discord


DSPy Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Codeium (Windsurf) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


Discord: Detailed by-Channel summaries and links

LMArena ▷ #general (671 messages🔥🔥🔥):

Grok 3.5 release hype, Qwen 3 analysis, Gemini 2.5 Pro nerfed, Veo 3 and Imagen 4, GPT-4o performance


Perplexity AI ▷ #general (475 messages🔥🔥🔥):

Deep Research API, Deep Search, Grok loop, High Quality GPT Image Gen, Image based search


Perplexity AI ▷ #pplx-api (11 messages🔥):

Perplexity API, API image metadata, Domain Filtering Upgrade


Manus.im Discord ▷ #general (458 messages🔥🔥🔥):

Manus credits, Gemini Advance, Germy Remix, Music generation


Unsloth AI (Daniel Han) ▷ #general (240 messages🔥🔥):

Embeddings resizing fix, BFloat11 finetuning, Qwen2.5 chat template, Synthetic data notebooks, Unsloth support BERT


Unsloth AI (Daniel Han) ▷ #off-topic (16 messages🔥):

IBM Granite 4.0, Mamba models, hf install from source, agentic behaviour finetune, vending-bench


Unsloth AI (Daniel Han) ▷ #help (124 messages🔥🔥):

ORPO finetuning, Tokenizer resizing issues, SageMaker installation errors, Qwen 2.5 SQL finetuning, Unsloth and Whisper


Unsloth AI (Daniel Han) ▷ #research (27 messages🔥):

TTS Models, Encoder Embeddings Fine-Tuning with Unsloth, Mistral with Liquid Time Layers, Multilingual LLM Fine-Tuning


Yannic Kilcher ▷ #general (328 messages🔥🔥):

Unsloth installation on Sagemaker, AI-generated content detection, Rickrolling, AI vs Brain, Declarations of Root Authority


Yannic Kilcher ▷ #paper-discussion (3 messages):

ARXIV 2305.13673, Allen-zhu papers


Yannic Kilcher ▷ #ml-news (33 messages🔥):

AGI and Game-Playing AI, DeepMind's SIMA Agent, Kerbal Space Program as a Tough AI Test, LLMs and Advertising, Bias in LLM Recommender Systems


Notebook LM ▷ #use-cases (19 messages🔥):

Handwritten Notes in NotebookLM, Hallucinations in NotebookLM, Mind Map Feature, Obsidian Integration


Notebook LM ▷ #general (287 messages🔥🔥):

NotebookLM Mobile App, Audio Podcast voices, Beta Testers, Access Issues, Source Preview Bug


Cursor Community ▷ #general (257 messages🔥🔥):

Cursor Pro Subscription Issue, Gemini's Slow Requests, Student Discount Availability, githubRepo Tool, Gemini Model


GPU MODE ▷ #general (36 messages🔥):

Bitonic Sort for Shaders, Flag array for storing intersection state, BVH, octrees/KD-trees, File submit failure, GPUMODE Youtube account


GPU MODE ▷ #triton (1 message):

Triton Usage Survey


GPU MODE ▷ #cuda (33 messages🔥):

Vast.ai data security, nsys profiling issues, CUDA memory copy errors, Array-of-Structs-of-Arrays design antipattern


GPU MODE ▷ #torch (9 messages🔥):

torch.compile performance degradation, Tensor Parallel hangs on dist.broadcast, LLM Deploy Project Debugging, Seeding and Deterministic Algorithms in PyTorch


GPU MODE ▷ #cool-links (1 message):

Multiplayer World Model, World Model


GPU MODE ▷ #beginner (34 messages🔥):

CUDA Prerequisites, Torch Internals, Mojo adoption, Copy-on-write memory access in CUDA, NVCC generating 128-wide loads and stores


GPU MODE ▷ #torchao (1 message):

PyTorch Autotuning, TorchAO Release


GPU MODE ▷ #off-topic (3 messages):

Chip Networking Latency, Router Slowdown, Speed of Light calculation, Culinary Photo, Internal Chip Latency


GPU MODE ▷ #irl-meetup (2 messages):

Modular Hackathon, IRL Meetup Planning


GPU MODE ▷ #rocm (1 message):

ROCm, nvbench, hipbench, googlebench


GPU MODE ▷ #self-promotion (8 messages🔥):

SASS code generation, GPU simulation in Kubernetes, Voxel raytracing engine


GPU MODE ▷ #🍿 (1 message):

Competition Organization, KernelBench


GPU MODE ▷ #thunderkittens (2 messages):

ThunderKittens, Cutlass, Live Stream


GPU MODE ▷ #reasoning-gym (1 message):

artnoage: Thanks for the answer 🙂


GPU MODE ▷ #submissions (74 messages🔥🔥):

MI300 Leaderboard Updates, AMD-FP8-MM Performance, µs and ms benchmarks


GPU MODE ▷ #hardware (3 messages):

Nvidia L40S GPU Upgrade, Nvidia Thor architecture, Nvidia Blackwell RTX Pro, Nvidia B300 and DGX Spark


GPU MODE ▷ #factorio-learning-env (25 messages🔥):

Good First Issues, Claude vs Gemini, Blender Agents with Gemini Pro, Agents craft their own observation state, Twitch stream


GPU MODE ▷ #amd-competition (13 messages🔥):

CLI submission mean time, Triton compile times, Fused MoE Github Repo, Warmup runs, Speed of light benchmark FP8


GPU MODE ▷ #cutlass (3 messages):

ThunderKittens vs Cutlass, Blackwell MMA, CuTe Implementations


GPU MODE ▷ #mojo (3 messages):

Mojo GPU kernels, PTX code


LM Studio ▷ #general (213 messages🔥🔥):

LM Studio Hub Page, MCP Server Security, Open WebUI Integration, duckduckgo searxng, kokoro-onnx in rust


LM Studio ▷ #hardware-discussion (37 messages🔥):

Refurbished Hardware, M2 vs Intel, B500 Series Speculation, Inference on AMD D700, HWINFO


aider (Paul Gauthier) ▷ #announcements (1 message):

Gemini 2.5 Pro, Qwen3-235b, OCaml repo-map, Knight Rider spinner animation, Co-author trailer commits


aider (Paul Gauthier) ▷ #general (146 messages🔥🔥):

Claude Code vs Aider, Copilot Proxy, Gemini 2.5 performance, Qwen 3 Cost-Performance, Aider and Read-Only files


aider (Paul Gauthier) ▷ #questions-and-tips (53 messages🔥):

Discord Matrix Bridge, Gemini 2.5 Flash, DeepSeek R1, Aider with LM Studio on Linux, Architect Mode


OpenAI ▷ #ai-discussions (176 messages🔥🔥):

ChatGPT UI History, Image Generation on Google Colab, DeepSeek Server Issues, Blue Dot in ChatGPT, GPT-4o Iterations


OpenAI ▷ #gpt-4-discussions (2 messages):

Structured outputs with OpenAI Assistants, PyTorch loss output meme


OpenAI ▷ #prompt-engineering (6 messages):

GPT deep search prompts, Style/subject transfer in concept art, WonderScholar meta-prompt


OpenAI ▷ #api-discussions (6 messages):

GPT deep search, Style transfer prompts, WonderScholar meta-prompt


OpenRouter (Alex Atallah) ▷ #announcements (28 messages🔥):

Gemini 2.5 Pro Implicit Caching, AI Studio, TTL and Refresh, Token count for 2.5 Pro, Gemini 2.5 Flash


OpenRouter (Alex Atallah) ▷ #general (148 messages🔥🔥):

Gemini 2.5 Flash, OpenRouter + AI, Activity Page Bug, Claude 2.1 & 2 dead?, OpenRouter Rate Limits


HuggingFace ▷ #general (64 messages🔥🔥):

Hugging Face Pro B200, HF Inference API, Zero GPU, AI Agent Frameworks, OPEA 1.3 Release


HuggingFace ▷ #today-im-learning (2 messages):

TensorFlow Binary Conversion, TensorFlow.js Converter


HuggingFace ▷ #i-made-this (2 messages):

LLM Uncertainty Quantification (UQ), Multilingual dataset README


HuggingFace ▷ #computer-vision (4 messages):

PaddleOCR for text extraction, Dynamic form processing, PCA Foot Mask, Shoe rendering on foot


HuggingFace ▷ #agents-course (28 messages🔥):

429 Errors, Youtube transcript size, GAIA leaderboard, Ollama in HF space, Dummy agent library


MCP (Glama) ▷ #general (80 messages🔥🔥):

Postgres MCP Server, Sampling discussion, VSCode becoming AI IDE, Public MCP Server Options, Redis room for every chat ID


MCP (Glama) ▷ #showcase (7 messages):

MCP Orchestration Layer, Daily Briefing Automation, Notion CRM Updates, MCP Server Development, Local Log Viewer Update


Modular (Mojo 🔥) ▷ #general (4 messages):

MLPerf benchmarks, AMD MI300x, Mojo Benchmarks


Modular (Mojo 🔥) ▷ #mojo (48 messages🔥):

Memoization/Caching with Dictionaries in Mojo, Rationale Behind 'out' Argument in Mojo Functions, Implicit vs Explicit Trait Conformance in Mojo, Static Optional Type Proposal, Trait Composition


Modular (Mojo 🔥) ▷ #max (5 messages):

Modular package installation with Pixi, Alternatives to 'magic' wrapper, Using pip or uv for Modular, max-pipelines conda package


Nous Research AI ▷ #general (29 messages🔥):

Forbidden Emojis, Vatican Compute Power, Telegram Bots for Nous AI, Remote Access for Computing, Mac vs PC for AI


Nous Research AI ▷ #ask-about-llms (9 messages🔥):

Nous Hermes Uncensored, Uncensored LLMs, System Prompts


Nous Research AI ▷ #research-papers (1 message):

burnytech: https://fxtwitter.com/AndrewZ45732491/status/1919920459748909288


Nous Research AI ▷ #interesting-links (5 messages):

AI Language Model on Windows 98, New AI model


Torchtune ▷ #general (3 messages):

Tool Use, apply_chat_template, Jinja


Torchtune ▷ #dev (24 messages🔥):

Optimizer in backward removal, Distributed recipes, Memory savings, FSDP CPU offload, Gradient memory


Cohere ▷ #💬-general (7 messages):

NORTH platform, Paying for API key, Rate Limit Exceeded, Trial Key, VPN issue


Cohere ▷ #🔌-api-discussions (14 messages🔥):

Azure AI SDK, Cohere Embeddings, Azure AI Inference, Cohere SDK


Cohere ▷ #🤝-introductions (3 messages):

IIT Kharagpur student introduction, GenAI and Voice Agents, Python3, Vite, TS, AI R&D collaboration


Nomic.ai (GPT4All) ▷ #general (6 messages):

Jinja Template for Nous-Hermes-2-Mistral-7B-DPO, GPT4All Custom API, PrivateGPT, Qwen3 support


LlamaIndex ▷ #blog (1 message):

VoyageAI Multi-Modal Embeddings, MongoDB Atlas Vector Store, Multi-Modal Retrieval


LlamaIndex ▷ #general (2 messages):

.edu email access, Qwen2.5-VL-7B-Instruct-AWQ memory usage, VLLM memory allocation


LlamaIndex ▷ #ai-discussion (1 message):

NERDAi's vector institute


tinygrad (George Hotz) ▷ #general (4 messages):

codegen, UOp, kernel-per-level, webgpu demo


LLM Agents (Berkeley MOOC) ▷ #hackathon-announcements (1 message):

Lambda, AgentX


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (2 messages):

Certificate Timeline, AgentX judging, Coursework deadline