Frozen AI News archive

not much happened today

**Meta** released **KernelLLM 8B**, which outperforms **GPT-4o** and **DeepSeek V3** on KernelBench-Triton Level 1. **Mistral Medium 3** debuted strongly in multiple benchmarks. **Qwen3** models introduced a unified framework with multilingual support. **DeepSeek-V3** features hardware-aware co-design. The **BLIP3-o** family was released for multimodal tasks using diffusion transformers. **Salesforce** launched **xGen-Small** models that excel in long-context and math benchmarks. **Bilibili** released **AniSORA** for anime video generation. **Stability AI** open-sourced **Stable Audio Open Small**, optimized for Arm devices. Google’s **AlphaEvolve** coding agent improved on **Strassen's algorithm** for the first time since 1969. Research shows that **chain-of-thought reasoning** can harm instruction-following ability; among mitigation strategies, classifier-selective reasoning is the most effective, but reasoning techniques show high variance and limited generalization. *"Chain-of-thought (CoT) reasoning can harm a model’s ability to follow instructions"* and *"Mitigation strategies such as few-shot in-context learning, self-reflection, self-selective reasoning, and classifier-selective reasoning can counteract reasoning-induced failures"*.

a quiet day.

AI News for 5/16/2025-5/19/2025. We checked 9 subreddits, 449 Twitters and 29 Discords (215 channels, and 11148 messages) for you. Estimated reading time saved (at 200wpm): 947 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

It's an open secret that Google will be launching a lot of stuff at I/O tomorrow, and it is already starting to roll out Jules. There have been other launches, including Amazon's Strands Agents and Anthropic's Claude Code SDK, but nothing quite hitting title-story status.

Expo Explorer tickets for AI Engineer went live over the weekend. If you love the hallway track, expo sessions, and meeting every top cloud/startup/employer in AI Eng, join us.

There's a limited number of discounts available here for the first 50 AINews readers.


AI Twitter Recap

AI Model Releases and Performance

AI Safety, Reasoning and Instruction Following

AI Tools and Applications

AI Business and Strategy

Infrastructure, Tools and Datasets

Humor


AI Reddit Recap

/r/LocalLlama Recap

1. Intel Arc Pro GPUs and Project Battlematrix Workstation Launches

2. Offline and Open Source AI Productivity and Speech Tools (Clara, Kokoro-JS, OuteTTS)

3. ParScale Model Launch and Parallel Scaling Paper

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Major AI-Driven Corporate Layoffs and Workforce Restructuring

2. Upcoming and Spotted AI Model Releases (Gemini, Claude, o1-pro)

3. AI Progress, Automation Impact, and SWEs Replacement Discourse


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.5 Flash Preview

Theme 1. AI Agent Development and Orchestration Tools

Theme 2. LLM Performance, Evaluation, and Model Behavior

Theme 3. Hardware Performance and Low-Level Optimization

Theme 4. New AI Models, Research, and Emerging Concepts

Theme 5. AI Tooling and Platform Updates


Discord: High level Discord summaries

Perplexity AI Discord


Cursor Community Discord


Unsloth AI (Daniel Han) Discord


OpenAI Discord


LMArena Discord


OpenRouter (Alex Atallah) Discord


LM Studio Discord


HuggingFace Discord


GPU MODE Discord


Eleuther Discord


Latent Space Discord


aider (Paul Gauthier) Discord


MCP (Glama) Discord


Notebook LM Discord


Nous Research AI Discord


Yannic Kilcher Discord


LlamaIndex Discord


Modular (Mojo 🔥) Discord


tinygrad (George Hotz) Discord


DSPy Discord


Cohere Discord


Nomic.ai (GPT4All) Discord


LLM Agents (Berkeley MOOC) Discord


Torchtune Discord


MLOps @Chipro Discord


AI21 Labs (Jamba) Discord


The Codeium (Windsurf) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.



Discord: Detailed by-Channel summaries and links

Perplexity AI ▷ #general (1047 messages🔥🔥🔥):

Cunnyx links, MCP SuperAssistant extension, Grok Deepsearch vs Perplexity Pro, Firefox promoting Perplexity, Yandex browser security


Perplexity AI ▷ #pplx-api (16 messages🔥):

API Credits, Sonar API vs UI Discrepancy, Sonar API Tweaking, Playground Outputs vs API Outputs


Cursor Community ▷ #general (1551 messages🔥🔥🔥):

Convex vs Supabase for real-time apps, MCP for AI agent communication, DeepSeek-R1T-Chimera model, Cursor speed issues, Document Navigation Within Cursor


Unsloth AI (Daniel Han) ▷ #general (422 messages🔥🔥🔥):

imatrix calibration dataset, Qwen3 GGUF issues, Estimate Cost of Running LLMs, Gemma 3 fine tune, AlphaEvolve the Google AI


Unsloth AI (Daniel Han) ▷ #off-topic (13 messages🔥):

Downloading adapters from Google Colab, Private Hugging Face Models, Modern LLMs besides Qwen


Unsloth AI (Daniel Han) ▷ #help (659 messages🔥🔥🔥):

TPU support, GGUF Saving Errors, Torch and Cuda errors, Unsloth Documentation, Continued Pretraining vs Lora


Unsloth AI (Daniel Han) ▷ #research (19 messages🔥):

PTS and DPO for Fine-Tuning, Beam Search with Trainable Permutations, Tokenizer Training and Embedding Research, Entropix GitHub Project


OpenAI ▷ #ai-discussions (569 messages🔥🔥🔥):

ASI Lab Plagiarism Accusations, Codex Experience, GPT-5, Gemini 2.5 Pro Math Performance, ChatGPT Memory Feature


OpenAI ▷ #gpt-4-discussions (414 messages🔥🔥🔥):

Gemini 2.5 Pro vs 4.1, Rate Limits, GPT Lying, ChatGPT for Education, 4o Mini


OpenAI ▷ #prompt-engineering (48 messages🔥):

HyperEnglish Prompting, AI Custom Instructions via Python, ProtoMind Semantic Mapping, Image Prompting Workflow, Learning Prompt Engineering


OpenAI ▷ #api-discussions (48 messages🔥):

HyperEnglish, Meta-prompt generator complexity, Loading Custom Instructions, Weighted procedural responses, ProtoMind_001


LMArena ▷ #general (900 messages🔥🔥🔥):

O3 Pro, GPT-5 speculation, Claude 4, DeepSeek's fate, Codex's potential


LMArena ▷ #announcements (2 messages):

Mistral Medium 3, Claude 3 Sonnet, Amazon Nova Pro, Command-a-03


OpenRouter (Alex Atallah) ▷ #announcements (2 messages):

Gemini 2.5 Pro Experimental deprecation, DeepSeek V3 maintenance


OpenRouter (Alex Atallah) ▷ #app-showcase (3 messages):

Chess Tournament, Stockfish Implementations, Lichess Accuracy Ratings, Openrouter models


OpenRouter (Alex Atallah) ▷ #general (790 messages🔥🔥🔥):

API key identification, Gemini 2.5 Deprecation Fallout, Qwen3 Tool Calling Troubles, Low Latency LLMs, Gemini API Updates


LM Studio ▷ #general (174 messages🔥🔥):

VRAM Usage, LM Studio conversations export, LM Studio blurry UI, Vulkan runtime troubleshooting, Prompt formatting issues


LM Studio ▷ #hardware-discussion (341 messages🔥🔥):

Intel Arc Pro B60 GPU, macOS vs Windows, AMD vs Nvidia GPU for LM Studio, Resizable BAR Impact, Multi GPU setup


HuggingFace ▷ #general (225 messages🔥🔥):

MCP Course Channel, AI Integration on ERP System, ACE Step Quality, AI Model for Design-to-Code Conversion, Hugging Face Pro Benefits


HuggingFace ▷ #today-im-learning (6 messages):

MCP for Atlassian, ChatApp AI App


HuggingFace ▷ #cool-finds (4 messages):

Strands Agents SDK, AI for OS Funding


HuggingFace ▷ #i-made-this (14 messages🔥):

EcoArt Cellular Automaton, Browser.AI with Tool Calls, tome: Local LLM Client with MCP Servers, Lazarus Small LLM, datatune Open Source Tool


HuggingFace ▷ #reading-group (1 message):

arpitbansal.: By any chance recording available for the recent session??


HuggingFace ▷ #computer-vision (13 messages🔥):

WANVideo Lora Training, Invoice Extraction, Computer Vision Roadmap, Object Outlines, CS231n Lectures


HuggingFace ▷ #NLP (3 messages):

Modern Approaches to DDS, Inference Differences in BERT-style Models


HuggingFace ▷ #smol-course (11 messages🔥):

Claude "overloaded" issues, GPT's 30% success rate, Meta Llama access denied, Multiple agents setup, Questions endpoint problem


HuggingFace ▷ #agents-course (46 messages🔥):

Agents Course Certifications, MCP Course Confusion, Unit 4 Project File Retrieval, GAIA Formatting Issues, Hugging Face space stuck


GPU MODE ▷ #general (8 messages🔥):

Kernel development, AMD challenge, FSDP and Flash Attention 2 with trl


GPU MODE ▷ #triton (4 messages):

Triton Runtime Shared Memory, Triton on CPU, TRITON_INTERPRET API


GPU MODE ▷ #cuda (11 messages🔥):

Tensor Cores, CUDA Brute Forcer, GPU Usage Reporting, Neural Net Mutate Function, CUDA Errors


GPU MODE ▷ #torch (1 message):

Triton Kernels, Dynamic Shapes, Batch Size in PyTorch


GPU MODE ▷ #announcements (1 message):

CuTe, Cris Cecka


GPU MODE ▷ #algorithms (43 messages🔥):

Power Consumption for MCMC, Pbit Hardware, Analog Annealing Circuit, Quantum vs Pbit, Hardware for MCMC


GPU MODE ▷ #cool-links (2 messages):

External CUDA Allocator, MAXSUN Arc Pro B60 Dual


GPU MODE ▷ #beginner (1 message):

Threading APIs, cuper alternative


GPU MODE ▷ #torchao (3 messages):

QAT with Llama3.2 3B, prepare_model_for_qat, prepare_model_for_ptq, bf16 vs int4, axolotl-ai-cloud/axolotl


GPU MODE ▷ #rocm (3 messages):

kpack argument in rocm triton, AMD Triton Performance Optimization


GPU MODE ▷ #lecture-qa (4 messages):

CuTe Tensors, Lecture Slides Availability


GPU MODE ▷ #self-promotion (3 messages):

SWEBench, CuTeDSL, AI Efficiency with Pruna AI


GPU MODE ▷ #🍿 (6 messages):

KernelLLM, PyTorch backend, RL baseline, leaderboard competitions, pass@k evals


GPU MODE ▷ #reasoning-gym (3 messages):

LLMs, Qwen 2.5 3B, Llama 3.2 3B, GSM8K and MATH benchmarks


GPU MODE ▷ #submissions (111 messages🔥🔥):

amd-fp8-mm Leaderboard, amd-mixture-of-experts Leaderboard, Submission Errors on MoE, amd-identity Leaderboard, hipcc Arguments


GPU MODE ▷ #status (1 message):

Problem #3 Released, MLA crash course, Bot Timeouts reduced, Due Dates Extended


GPU MODE ▷ #ppc (1 message):

llvm-mca usage, CPU execution estimation


GPU MODE ▷ #factorio-learning-env (14 messages🔥):

Space Age Compatibility, Long Term Vision for Work Group, vllm and openai client classes, Understanding use-cases and evaluation areas


GPU MODE ▷ #amd-competition (85 messages🔥🔥):

Mixture-of-experts Submission Issues, HIP Submissions, Popcorn CLI Output, Leaderboard Run Slower than Benchmark, Composable Kernel Library Error


GPU MODE ▷ #cutlass (11 messages🔥):

CUTLASS DSL 4.0, CuTeDSL blogpost, CuTeDSL examples, Layout Function


GPU MODE ▷ #mojo (2 messages):

CUTLASS 4.0, DSL, cuTile, Python, Triton


Eleuther ▷ #general (235 messages🔥🔥):

Document AI and NLP, Sociology in biomedical AI, AI_WAIFU's views on AGI, OpenWorldLabs, Diffusion Limitations


Eleuther ▷ #research (34 messages🔥):

Approximating LM Finetuning, ReLU Activation, Smooth Activation Function, Muon expensive operations, Grand Research Ideas


Eleuther ▷ #lm-thunderdome (8 messages🔥):

SpinQuant Llama-2-7b Reproduction, lm eval harness, HFLM modification


Eleuther ▷ #gpt-neox-dev (3 messages):

PolyPythia Materials, Pretraining Data, Random Seeds, GPT-NeoX hash


Latent Space ▷ #ai-general-chat (120 messages🔥🔥):

Freeplay.ai Feedback, Absolute Zero Reasoner (AZR), AI Agent Frameworks, OpenAI Codex Experiment, Perplexity Free Tier Costs


Latent Space ▷ #ai-announcements (1 message):

swyxio: codex pod https://x.com/latentspacepod/status/1923532303327953295?s=46


Latent Space ▷ #ai-in-action-club (142 messages🔥🔥):

Meta's Maverick LLM, Agent as Judge, Economic disaster, Context Switching, Home Rolled context sharing


aider (Paul Gauthier) ▷ #general (173 messages🔥🔥):

Codex o4-mini model, Gemini 2.5 vs coding models, Aider settings, Aider's agent-like direction, Model preferences for Aider


aider (Paul Gauthier) ▷ #questions-and-tips (42 messages🔥):

Model iteration, Workplan document, Base prompt setup, Edit format issues, UI theme available


MCP (Glama) ▷ #general (155 messages🔥🔥):

Agent for MCP news, Application vs LLM driven, MCP client tool selection, Streaming HTTP crawl4ai MCP server, MCP vs OpenAPI


MCP (Glama) ▷ #showcase (25 messages🔥):

MCP UI SDK release, MCPControl server update, MCP SuperAssistant browser extension, Google Chat + LLM Agents via MCP, Sherlog Canvas (Alpha) release


Notebook LM ▷ #announcements (1 message):

NotebookLM, Mobile App, I/O, MVP


Notebook LM ▷ #use-cases (28 messages🔥):

NotebookLM for Olympiad Prep, NotebookLM Fails to Upload Materials, Custom Instructions in NotebookLM, NotebookLM as Language Editor, Senpai meaning


Notebook LM ▷ #general (119 messages🔥🔥):

Audio generation issues, Mind-map conversion to Markdown, Formatting recognition in NotebookLM, Debate podcast creation, NotebookLM API availability


Nous Research AI ▷ #general (101 messages🔥🔥):

AWQ INT4 calculations, Gemini generating comments, Open Source AI vs Big Tech Oligopoly, Google's AI Code Agent Jules, Hermes models


Nous Research AI ▷ #ask-about-llms (9 messages🔥):

LLM Editing for Autonomous Bots, Fine-tuning vs RL vs Control Vectors, Nous Hermes Model


Nous Research AI ▷ #research-papers (3 messages):

LLMs Conventions, LLMs Collective Biases, LLMs Adversarial Agents


Nous Research AI ▷ #interesting-links (2 messages):

Gemini, MoE, Long Context Window, Sub-global attention blocks


Nous Research AI ▷ #research-papers (3 messages):

Decentralized LLM Populations, Emergence of Social Conventions, Collective Biases in LLMs, Adversarial LLM Agents


Yannic Kilcher ▷ #general (38 messages🔥):

CNN diagrams, matplotlib for diagrams, DiagramVIS-for-computervis, Gemini 2.5 Pro, geometric deep learning


Yannic Kilcher ▷ #paper-discussion (29 messages🔥):

AlphaEvolve Whitepaper, Physics of Language Models Discussion, Ecology of LLMs and Social Conventions, Loss Clamping


Yannic Kilcher ▷ #ml-news (34 messages🔥):

Leadership issues, Open Source AI, Sam Altman Strategies, Attention seeking transformers


LlamaIndex ▷ #announcements (1 message):

LlamaIndex office hours, LlamaIndex Agents with Long-Term and Short-Term Memory, Multi-Agent Workflow with Weaviate QueryAgent, LlamaExtract for Structured Data


LlamaIndex ▷ #blog (4 messages):

LlamaParse updates, Azure AI Foundry Agent Service, LlamaIndex Discord office hours


LlamaIndex ▷ #general (58 messages🔥🔥):

COBOL code splitting, Claude desktop file drops, AgentWorkflow streaming with Anthropic, LlamaIndex and Ollama integration, Agent state persistence with DB


Modular (Mojo 🔥) ▷ #mojo (45 messages🔥):

NDBuffer deprecation and alternatives, ArcPointer and Atomic structs, Importing mojo code in notebooks, LSP issues and workarounds, Documentation issues in GPU basics tutorial


Modular (Mojo 🔥) ▷ #max (8 messages🔥):

Mojo kernel registration in PyTorch, Max vs. Fireworks.ai, Together.ai, and Groq.com for serving LLMs, register_custom_ops removal


tinygrad (George Hotz) ▷ #announcements (1 message):

georgehotz: see tinygrad performance get better here https://stats.tinygrad.win


tinygrad (George Hotz) ▷ #general (50 messages🔥):

tinygrad use GCC instead of Clang, porting a model or weights to TinyGrad, tinygrad 1.0 plan, quantize onnx bug, get torch.index_put to work


DSPy ▷ #show-and-tell (1 message):

AI Agent Engineering, LLMs & Foundation Models, Full-Stack & Backend Systems, Automation & Agent Ops, Vector DBs & Memory Storage


DSPy ▷ #general (39 messages🔥):

Assert/Suggest replacement, VS Code theme settings, AI coding agent with DSPy, DSPy latency with large system prompts, DSPy 3.0 port to Elixir


Cohere ▷ #💬-general (7 messages):

Fine-tuning embedding models, Unwanted embeddings from embed-v4.0


Cohere ▷ #🔌-api-discussions (5 messages):

Embed 4 Pricing, Vision RAG, Chat API Stalling, Agent API Calls


Cohere ▷ #💡-projects (1 message):

Vitalops datatune, Open source data transformation


Cohere ▷ #🤝-introductions (3 messages):

Game Development, AI/ML Engineering, 3D Game Development, AI-powered NPCs, Skills in Game Engines


Nomic.ai (GPT4All) ▷ #general (13 messages🔥):

Swarm UI, Linus Tech Tips $1M Server for Pi Calculation, Customer Success career advice, Models for text interpretation and formatting


LLM Agents (Berkeley MOOC) ▷ #hackathon-announcements (1 message):

AgentX Competition, Submission Forms, Judging Panel, Entrepreneurship Track, Research Track


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (2 messages):

OpenAI API keys, Alternative approaches to API calls, Trailblazer Tier Certificate


Torchtune ▷ #general (1 message):

Model Evaluation, Finetuning, Continuous Integration


Torchtune ▷ #dev (1 message):

Torchtune cfg.get


MLOps @Chipro ▷ #general-ml (1 message):

DataTune, Data transformation, Vitalops


AI21 Labs (Jamba) ▷ #general-chat (1 message):
