Frozen AI News archive

Prime Intellect's INTELLECT-2 and PRIME-RL advance distributed reinforcement learning

**Prime Intellect** released **INTELLECT-2**, a decentralized GPU training and RL framework with a vision for distributed AI training overcoming colocation limits. **ByteDance** launched **DreamO**, a unified image customization model on Hugging Face. **Qwen** released models optimized for GPTQ, GGUF, and AWQ quantization. **Gemma** surpassed 150 million downloads on Hugging Face. **Meta** released weights for the **Dynamic Byte Latent Transformer** and the **Collaborative Reasoner** framework to improve language model efficiency and reasoning. **RunwayML** introduced **Gen-4 References**, a near-realtime model requiring no fine-tuning. **Mistral AI** released **Mistral Medium 3**, a strong multimodal model, and **Le Chat Enterprise**, an agentic AI assistant for business. **Google** updated **Gemini 2.5 Pro Preview** with video understanding and UI improvements. *"Airbnb for spare GPUs from all over the world"* highlights the ongoing challenges and potential of distributed GPU training.

Distributed GPUs are all you need?

AI News for 5/9/2025-5/12/2025. We checked 9 subreddits, 449 Twitters and 29 Discords (215 channels, and 12925 messages) for you. Estimated reading time saved (at 200wpm): 1292 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

The Dream: "Airbnb for spare GPUs from all over the world"

The Reality: colocation for GPUs has been so important that calls for trillion dollar clusters have actually materialized.

In this age of accelerating progress, the optimist's trap lies in areas where the promise far exceeds practical reality, especially where that reality runs into hard constraints like the speed of light. It has generally been very difficult to know which of the many attempts at "federated learning" or "distributed training" will stick around long enough to actually get traction. For reasons like these (as well as the simpler reason of lack of understanding), we have so far steered away from covering similar attempts like Nous Research's work on DisTrO, despite a lot of excitement from an excitable community. Furthermore, since the AI Engineer focus is very inference-oriented, it really doesn't matter what GPU cluster a given model was trained on, further limiting practical industry interest.

However, Prime Intellect's work feels a little different.

INTELLECT-2's release isn't just a paper, or a QwQ finetune, or an RL framework, or opaquely blockchainy techniques, or Yet Another GRPO variant. It's all that and more - a proof of concept, a vision statement, and perhaps a baby-steps first articulation of why decentralization has any place in the default-centralizing world of AI.

Model trainers should look at Prime-RL, but the paper also contains interesting insights into some very valid frontiers in both post-training and inference-during-training (which they correctly observe will scale a lot in the RL era).


AI Twitter Recap

AI Model Releases and Updates

AI Engineering and Tooling

Agent Based Systems and Multi-Agent Systems

LLM Evaluation and Benchmarking

Key Ideas and Research Directions

Academia and Papers

Vision Language Models (VLMs)

Career and Industry

Humor/Memes


AI Reddit Recap

/r/LocalLlama Recap

1. Major LLM and Transformer Model Launches (Qwen3, INTELLECT-2, Meta 8B BLT)

2. Microsoft ARTIST Framework for Agentic Tool-augmented LLMs

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Recent Model and Feature Launches (Manus AI, JoyCaption, Continuous Thought Machines)

2. Major Model and Industry Trend Analysis (Microsoft/LLMs, Copyright Office, AI Researcher on ChatGPT issues)

3. Community Dissatisfaction and Behavioral Shifts in ChatGPT Usage


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.5 Pro Exp

Theme 1: The Model Gauntlet: New Releases, Performance Showdowns, and Lingering Quirks

Theme 2: Rise of the Agents: Frameworks, Finetuning, and Interoperability Efforts

Theme 3: Powering Up: Hardware Hustles, Local LLM Deployments, and Optimization Frontiers

Theme 4: Framework Frontiers: Innovations in DSPy, LlamaIndex, and Specialized Tooling

Theme 5: Reality Check: Benchmarking Battles, Hallucination Headaches, and Ethical Enigmas


Discord: High level Discord summaries

Perplexity AI Discord


Unsloth AI (Daniel Han) Discord


LMArena Discord


LM Studio Discord


Cursor Community Discord


Yannick Kilcher Discord


Manus.im Discord


OpenRouter (Alex Atallah) Discord


GPU MODE Discord


OpenAI Discord


aider (Paul Gauthier) Discord


HuggingFace Discord


Notebook LM Discord


Latent Space Discord


MCP (Glama) Discord


Nous Research AI Discord


Eleuther Discord


Modular (Mojo 🔥) Discord


DSPy Discord


Nomic.ai (GPT4All) Discord


tinygrad (George Hotz) Discord


LlamaIndex Discord


LLM Agents (Berkeley MOOC) Discord


Cohere Discord


MLOps @Chipro Discord


Torchtune Discord


The Codeium (Windsurf) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.



Discord: Detailed by-Channel summaries and links

Perplexity AI ▷ #general (1055 messages🔥🔥🔥):

Legendary Smurfs, Gemini Multistep Search, Perplexity Text Mode, AI Watermark Detection, Qwen Performance


Perplexity AI ▷ #sharing (3 messages):

rocket evolution, gunrunning, corruption in war


Perplexity AI ▷ #pplx-api (17 messages🔥):

Image URL bug, Image Metadata Missing, Enhanced Domain Filtering, JSON Output Issues with API, API vs Web UI Results


Unsloth AI (Daniel Han) ▷ #general (799 messages🔥🔥🔥):

GGUF Quantization, Unsloth's Dynamic 2.0 Quantization, Lora sharing platform, DeepSeek R2 Rumors, Qwen3 finetuning


Unsloth AI (Daniel Han) ▷ #off-topic (251 messages🔥🔥):

Agentic behavior finetuning, Training data secret sauce, Tool calling implementation, Memory scoping for chatbots, Qwen3 incompatibility


Unsloth AI (Daniel Han) ▷ #help (455 messages🔥🔥🔥):

Optimizer State in Unsloth, AMD Max+ 365 SoC with ROCm Support, Qwen 2.5 3B GRPO Notebook, Synthetic Data for Knowledge Distillation, Deepseek R1 model IQ_M on LM Studio


Unsloth AI (Daniel Han) ▷ #showcase (7 messages):

CodeFIM Model, Rust, Unsloth, Hugging Face, CodeFIM dataset


Unsloth AI (Daniel Han) ▷ #research (37 messages🔥):

Memory Layers in Models, QLora and Lora for Pretraining, ModernBERT Notebook, Gemma 3 vs Qwen 0.6B, Absolute Zero Reasoning


LMArena ▷ #general (1168 messages🔥🔥🔥):

Grok 3.5, Gemini 2.5 Ultra, Drakesclaw performance, o3 pro release date, AI-undetectable essays


LM Studio ▷ #general (432 messages🔥🔥🔥):

lm studio db, lm studio web search, absolute zero reasoner, Qwen-3 models, DRY Sampler requests


LM Studio ▷ #hardware-discussion (760 messages🔥🔥🔥):

M3 Ultra mac studio, AMD Ryzen AI Max 395 Mini, NVidia RTX 5090 Pricing and Performance, GPU Temp monitoring, LLama Performance


Cursor Community ▷ #general (804 messages🔥🔥🔥):

Cursor v0.50 Rollout, Stagewise Integration, Pricing Model Confusion, Context Window Limitations, Gemini 2.5 Pro Issues


Yannick Kilcher ▷ #general (451 messages🔥🔥🔥):

Emergent Properties, LLM Reasoning, Transformers Limitations, Turing Completeness, RL Training


Yannick Kilcher ▷ #paper-discussion (23 messages🔥):

Global Optimization, Cultural Optimization, Sakana AI CTM, Video Summaries for Guiding Reading, Paper Discussion Postponed


Yannick Kilcher ▷ #agents (1 message):

Sakana, Time Importance, Maze examples and ARC


Yannick Kilcher ▷ #ml-news (59 messages🔥🔥):

RL for Truthfulness, Claude.ai Web UI, Trump Fires Copyright Office Head, Confident Prompts Cause Hallucinations


Manus.im Discord ▷ #general (758 messages🔥🔥🔥):

Elevenlabs TTS, Britney Spears Parody, Manus AI Agent Training, Open Source Models, Manus Subscription Model


OpenRouter (Alex Atallah) ▷ #announcements (2 messages):

OpenRouter, Google AI Studio rate limits, Gemini 2.5 Pro Experimental


OpenRouter (Alex Atallah) ▷ #general (658 messages🔥🔥🔥):

Claude 3.7 Caching on Vertex, GPTs Agent Training, Open Empathic Project Assistance, Gemini 2.5 Pro's BYOK Issues, Grok 3.5 Release


GPU MODE ▷ #general (25 messages🔥):

NVIDIA 50 series, Model Optimization Libraries, Intel GPU drivers, Local Testing Configurations


GPU MODE ▷ #triton (6 messages):

Triton user survey, tl.make_block_ptr, gemlite fp16xfp4 support


GPU MODE ▷ #cuda (78 messages🔥🔥):

Array-of-Structs design antipattern, Sparse Matrix Formats, Multi-GPU programming with CUDA Streams, Thread Indexing Struggles


GPU MODE ▷ #torch (5 messages):

Torch export specializes batch size, torch.manual_seed redundancy, debugging specialized batch size in torch.export


GPU MODE ▷ #jobs (4 messages):

TII AI Infrastructure Engineer, nScale Staff AI Engineer, Isomorphic Labs Performance Engineer, C-Gen AI Senior Software Engineer


GPU MODE ▷ #beginner (15 messages🔥):

Statistics for GPU Performance, PC vs GPU Architecture, Ways to Lie With Statistics


GPU MODE ▷ #jax (1 message):

XLA HLO file comparison, Op fusion identification, Performance improvement analysis, HLO graph analysis tools, JAX optimization verification


GPU MODE ▷ #torchao (22 messages🔥):

ao installation, pip version, virtual env, pyproject.toml


GPU MODE ▷ #off-topic (5 messages):

Hacksat Development, Unikernels vs Microkernels, Plov meal


GPU MODE ▷ #irl-meetup (5 messages):

MLSys Conference, j4orz's research and hacking, Work Group


GPU MODE ▷ #rocm (4 messages):

ROCm Benchmarking, NVBench Alternatives, GEMM Benchmarking Frameworks, memcpyPeer Benchmarks, Cache Clearing in Benchmarks


GPU MODE ▷ #self-promotion (1 message):

Mobicham presentation


GPU MODE ▷ #🍿 (1 message):

hj1231121: How do I request access to this?


GPU MODE ▷ #thunderkittens (4 messages):

TK 4 hour livestream, TK intro video


GPU MODE ▷ #gpu模式 (1 message):

eclouder: re-register


GPU MODE ▷ #submissions (210 messages🔥🔥):

amd-fp8-mm leaderboard, MI300 performance, vectoradd benchmarks, amd-mixture-of-experts leaderboard


GPU MODE ▷ #hardware (12 messages🔥):

SM Architecture Speculation, H100 vs B200, CUTLASS Tutorial for Blackwell


GPU MODE ▷ #factorio-learning-env (11 messages🔥):

UV lock file, Factorio setup, Contribution Documentation


GPU MODE ▷ #amd-competition (68 messages🔥🔥):

Fused MoE, MI300 Access, Kernel Timeouts, GPU Page Faults, IR Dump Triton


GPU MODE ▷ #cutlass (2 messages):

Triton performance, cutlass register/shared memory


GPU MODE ▷ #mojo (14 messages🔥):

Mojo GPU PTX Dumping, Python and Mojo Interop Layer for MAX, Modular Hackathons Future Plans, Mojo+PyTorch Integration, Dot product Mojo


OpenAI ▷ #annnouncements (1 message):

HealthBench, Evaluation Benchmark


OpenAI ▷ #ai-discussions (377 messages🔥🔥):

Gemini 2.5 Pro vs OpenAI Models, Grok 3.5 Release Delay, Local LLM Setup with LM Studio, ChatGPT's Memory Management, GPT-4's Self-Referential Identity ('Quill')


OpenAI ▷ #gpt-4-discussions (10 messages🔥):

PyTorch Loss Output, Chat AI Bot Identification, ChatGPT 4o IT Errors


OpenAI ▷ #prompt-engineering (24 messages🔥):

Bridger Palmer Clone, Financial Advice Prompts, Deep Research Prompts, Triggers and Addiction Prompts, Economic Correlations in Brazil


OpenAI ▷ #api-discussions (24 messages🔥):

Bridger Palmer Clone GPT, Financial Advice from GPT, Prompts for Deep Research, Prompt Engineering Basics, Counterintuitive Economic Correlations in Brazil


aider (Paul Gauthier) ▷ #announcements (1 message):

Gemini 2.5 Pro, Qwen3, OCaml, OpenRouter, Playwright


aider (Paul Gauthier) ▷ #general (289 messages🔥🔥):

Azure OpenAI Model Routing, Aider's Production vs Development Features, Aider's auto-test output stall, Gemini 2.5 Pro issues, Aider's Potential for Multi-Agent Framework Integration


aider (Paul Gauthier) ▷ #questions-and-tips (71 messages🔥🔥):

Aider Prompting Modification, Architect Mode, Aider File Changes, Repo-Map, Agentic AI


HuggingFace ▷ #announcements (1 message):

Gradio ImageSlider, DeepSeek Prover v2, Tiny Agents Local, LeRobot Hackathon, Mellum Open Source


HuggingFace ▷ #general (161 messages🔥🔥):

H200 serverless spaces, HF Discord Alerts, Training models from HF datasets, Lipsync AI tools, Training foundation models professionally


HuggingFace ▷ #today-im-learning (23 messages🔥):

Tensorflow to binary conversion, safetensors to .bin conversion, GGUF format for models, Ollama curriculum generator, Knowledge graphs and agentic AI


HuggingFace ▷ #i-made-this (18 messages🔥):

Huggingface Desktop app, Agentle AI agent framework, Cyberdesk virtual desktop control for AI, SlashML Gradio app hosting, OpenGoody LLM


HuggingFace ▷ #computer-vision (8 messages🔥):

ControlNet shoe generation, PCA for shoe design, Foot mask video creation, Image coordinate systems, Alpha blending for video


HuggingFace ▷ #smol-course (6 messages):

Discord integration, JSON files processing, AI Agents course help


HuggingFace ▷ #agents-course (96 messages🔥🔥):

Agent Debugging, Final Project Cheating, Rate Limit Errors, Chess Puzzle Solution


Notebook LM ▷ #use-cases (35 messages🔥):

NotebookLM Agents, Zundamon video generation, CraigBot Integration, HTML sources SEC.gov filings


Notebook LM ▷ #general (275 messages🔥🔥):

NotebookLM Logo Explanation, Audio File Duration Reduction, PDF Reading within NotebookLM, Source Preview Bug, GitHub Repositories and Overviews


Latent Space ▷ #ai-general-chat (91 messages🔥🔥):

AI Automation Freelancers, AI Demos for non-techies, AI Wrappers, LLM Long-Term Memory, Sakana AI


Latent Space ▷ #ai-in-action-club (126 messages🔥🔥):

AnswerHQ, Supabase, LLM as judge, Windsurf vs Cursor, Revenue Driven Development


MCP (Glama) ▷ #general (167 messages🔥🔥):

MCP Client TypeScript SDK, FastMCP with Python, Goose MCP client, Claude Desktop MCP client, Publicly available SSE MCP servers


MCP (Glama) ▷ #showcase (17 messages🔥):

Square MCP Architecture, AiraHub MCP/A2A Network, fabric-mcp-server, mcp-v8 JavaScript MCP Server, MCP-S Platform


Nous Research AI ▷ #announcements (1 message):

RL Environments Hackathon, Speakers, Judges


Nous Research AI ▷ #general (139 messages🔥🔥):

LlamaCPP control vectors, Atropos artifact, AlphaZero and Absolute Zero paradigm trend, Daoist principles applied to machine learning, Unsloth Dynamic 2.0 GGUF quants


Nous Research AI ▷ #ask-about-llms (1 message):

VL-Rethinker


Nous Research AI ▷ #research-papers (1 message):

RLVR, Absolute Zero Reasoner, Self-play Reasoning


Nous Research AI ▷ #interesting-links (14 messages🔥):

JakeABoggs benchmark, MTG AI models, Gradient Descent Local Minima, Zed Editor Founder Ethos, Facebook Byte Latent Transformer


Nous Research AI ▷ #research-papers (1 message):

RLVR, Absolute Zero, AZR, Self-play Reasoning, Reinforcement Learning


Eleuther ▷ #general (26 messages🔥):

4o-mini-preview-03-05 LLM performance, AI in Education and Ethics, LLMs for RL, AI Governance and Regulation, AI Parent phone app legal hurdles


Eleuther ▷ #research (63 messages🔥🔥):

Transfer steering vectors, ReLU problems, Multi-index models of feature learning, Continuity, Distributed neural architectures


Eleuther ▷ #interpretability-general (18 messages🔥):

Physics of LLMs, ICML tutorial, Interpretability for AI safety, Interpretable-by-design architecture


Eleuther ▷ #lm-thunderdome (5 messages):

o3 Performance Degradation, Global MMLU Inconsistencies


Modular (Mojo 🔥) ▷ #general (4 messages):

Disable Telemetry, H100 Backend, GPU/CPU Info


Modular (Mojo 🔥) ▷ #mojo (59 messages🔥🔥):

Autotuning removal, Post-hoc trait conformance, BigInt support, Mojo DataFrames, Mojo JIT compilation


Modular (Mojo 🔥) ▷ #max (15 messages🔥):

Modular meta-package, MAX graph in Mojo, Custom ops documentation, MAX Mojo APIs open-sourced, Progressively larger tutorials for max graph


DSPy ▷ #papers (1 message):

complete: Curious on thoughts on implementing DSPy with this https://arxiv.org/abs/2505.03335v2


DSPy ▷ #general (39 messages🔥):

DSPy Doctrine, RL based on embedding models with DSPY, Prompts as weights, AI in Insurance conference presentation using DSPy, DSPy and LangGraph


Nomic.ai (GPT4All) ▷ #general (35 messages🔥):

Qwen3 Support, LLM applications, Nvidia & AMD hardware pricing, Image generation model in GPT4ALL, GPT4ALL installation help


tinygrad (George Hotz) ▷ #general (13 messages🔥):

ROCm build on Mac, Tinybox Sales Internship in San Diego, Tinygrad backend for AI, LeetGPU adds Tinygrad support, Optimal kernel block size calculation


tinygrad (George Hotz) ▷ #learn-tinygrad (5 messages):

Tinygrad performance on T4, tinypilot chatbot, Max tensor numel query


LlamaIndex ▷ #announcements (1 message):

PapersChat, Deep Research Agent, Multilingual RAG, Invoice Reconciliation Agent, LlamaParse updates


LlamaIndex ▷ #blog (1 message):

LlamaIndex, Finance, NYC Workshop


LlamaIndex ▷ #general (4 messages):

LlamaIndex Data Loaders vs Data Movement Tools, Customized Data Loaders, Fine-tuning mdcdse-2b


LLM Agents (Berkeley MOOC) ▷ #mooc-announcements (1 message):

MOOC Deadlines, Certificate Requirements


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (3 messages):

Coursework Deadline, AgentX Judging, Homework Verification


LLM Agents (Berkeley MOOC) ▷ #mooc-lecture-discussion (2 messages):

AI Learning Resources, Best AI Courses


Cohere ▷ #🔌-api-discussions (2 messages):

Token Prepending, Azure SDK Ticket


Cohere ▷ #🤝-introductions (3 messages):

Product Evolve, Canadian-hosted models, RAG capabilities, GenAI experiences, voice and chat agents


MLOps @Chipro ▷ #events (3 messages):

Anthropic, Claude, Updates


Torchtune ▷ #dev (3 messages):

OptinBwd, Llama Tokenizer