Frozen AI News archive

Mistral's Agents API and the 2025 LLM OS

**The LLM OS** concept has evolved since 2023, with **Mistral AI** releasing a new **Agents API** that includes code execution, web search, persistent memory, and agent orchestration. **LangChainAI** introduced the **Open Agent Platform (OAP)**, an open-source no-code platform for intelligent agents. **OpenAI** plans to develop **ChatGPT** into a super-assistant by H1 2025, competing with **Meta**. Discussions around **Qwen** models focus on reinforcement learning effects, while **Claude 4** performance is also noted. The AI Engineer World's Fair is calling for volunteers.


The LLM OS is all you need.

AI News for 5/26/2025-5/27/2025. We checked 9 subreddits, 449 Twitters and 29 Discords (217 channels, and 11775 messages) for you. Estimated reading time saved (at 200wpm): 1148 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

Since the original LLM OS discussion in Nov 2023, people have been hard at work figuring out what goes into the "standard stack" of tooling around LLM APIs, aka the LLM OS. Mistral's (second) crack at the Agent Platform problem prompted Simon Willison to list the current "LLM OS" stack:

If you were to update the 2023 chart for the 2025 consensus you'd get something like:

(this is our quick mock of it)

Indeed, we've marked "memory" and "orchestrator" as the less established areas of the LLM OS, though of course orchestrators like Temporal and LangGraph have been around for a while, and Simon misses that Mistral shipped cross-chat memory.

Checking their blog homepage is also a nice reminder of where Mistral's priorities currently lie, as a leading lab for Open Source AI.


AIEWF CALL FOR VOLUNTEERS

This is the once-a-year call for volunteers for the AI Engineer World's Fair next week; our need has again doubled since last year. Please apply here if you cannot afford a ticket!


AI Twitter Recap

Agent Frameworks, Multi-Agent Systems, and Tool Use

Model Performance, Benchmarks, and Datasets

Vision Models, Image Generation, and Multimodal Learning

Software Development and Coding

Industry and Company Specific Announcements

Meme/Humor


AI Reddit Recap

/r/LocalLlama Recap

1. Claude 4 Benchmark Comparisons and Community Reactions

2. New Audio Model Applications and Open Source Tools

3. Enterprise GPU Pricing Discussion 2024

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Google Veo3 and Next-Gen Video Generation Models

2. AI Model and Platform Progress: Benchmarks, Accessibility & Infrastructure

3. AI-Driven Scientific and Research Breakthroughs


AI Discord Recap

A summary of Summaries of Summaries by o1-preview-2024-09-12

Theme 1: AI Model Showdowns and Performance Debates

Theme 2: AI Tools and Platforms Face Glitches and Upgrades

Theme 3: AI Security Concerns Surface

Theme 4: AI Community Events Ignite Excitement

Theme 5: Cutting-Edge AI Research Emerges


Discord: High level Discord summaries

Perplexity AI Discord


LMArena Discord


Cursor Community Discord


LM Studio Discord


Unsloth AI (Daniel Han) Discord


Manus.im Discord Discord


OpenAI Discord


Yannick Kilcher Discord


OpenRouter (Alex Atallah) Discord


Nous Research AI Discord


GPU MODE Discord


HuggingFace Discord


Notebook LM Discord


Eleuther Discord


Latent Space Discord


LlamaIndex Discord


MCP (Glama) Discord


Nomic.ai (GPT4All) Discord


Modular (Mojo 🔥) Discord


Cohere Discord


Torchtune Discord


DSPy Discord


tinygrad (George Hotz) Discord


LLM Agents (Berkeley MOOC) Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Codeium (Windsurf) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.




Discord: Detailed by-Channel summaries and links

Perplexity AI ▷ #general (1140 messages🔥🔥🔥):

Gemini vs OpenAI models, Veo 3 vs Sora, AI rights, AI workout plan generation, OpenAI sidebar UI changes


Perplexity AI ▷ #sharing (1 message):

i_795: https://www.perplexity.ai/page/tropical-storm-alvin-forms-in-al1_tmLJQr2h9bzFrk.wJA


Perplexity AI ▷ #pplx-api (4 messages):

Video Submission Length, API websearch vs web UI


LMArena ▷ #general (833 messages🔥🔥🔥):

Deepseek V3 Release, Unsloth's insider information, Claude 4 Opus performance, GPT-4.5 release predictions, Google vs OpenAI AI lead


LMArena ▷ #announcements (1 message):

LMArena relaunch, New LMArena UI, LMArena Seed Funding, AI Model Evaluation


Cursor Community ▷ #general (643 messages🔥🔥🔥):

Sonnet 4 pricing and performance, Codebase indexing issues, Figma MCP tool limitations, Slow requests problems, Double Subscriptions


Cursor Community ▷ #background-agents (6 messages):

Pre-commit hooks, Background agents errors, Remote extension host server errors, Dockerfile generation


LM Studio ▷ #general (148 messages🔥🔥):

LM Studio Model Visibility, Chain of Draft Model, lmstudio API cancel function, Deepseek Update, Gemma 3 memory footprint reduction


LM Studio ▷ #hardware-discussion (470 messages🔥🔥🔥):

AMD ROCm Updates, Jedi Survivor RT Lighting, Qwen 30B A3B, eGPUs for Inference, Nvidia Marketing Tactics


Unsloth AI (Daniel Han) ▷ #general (210 messages🔥🔥):

Unsloth Arch Name Change, GGUF Conversion Issues, Masking on Unsloth, RAFT Implementation, Multi-GPU Training


Unsloth AI (Daniel Han) ▷ #off-topic (7 messages):

Avoiding politics, AI paper, Algorithm search


Unsloth AI (Daniel Han) ▷ #help (137 messages🔥🔥):

GPU Too Old Error, GRPO trainer loss calculation, Talk like a Pirate, Fine-tuning specific layers, Qwen3 training issues


Unsloth AI (Daniel Han) ▷ #showcase (6 messages):

Making friends, API response


Unsloth AI (Daniel Han) ▷ #research (16 messages🔥):

Multi-GPU Progress, ColBERT vs Cross Encoder, Adaptive Reasoning AutoThink, Nemotron Architecture Search, Depth vs Shallow Models


Manus.im Discord ▷ #general (276 messages🔥🔥):

Flowith AI, Manus Network Errors, Claude 4.0 Integration, Student Accounts and Unlimited Credits, Skywork.AI as an alternative


OpenAI ▷ #ai-discussions (244 messages🔥🔥):

AI's Math Superhumanity, Emoji Overload, Lovable AI, Gemini 2.5 Pro, AI Replacing Contractors


OpenAI ▷ #gpt-4-discussions (12 messages🔥):

Codex for Plus Users, ChatGPT Memory Continuity Fix, Assistant API Throttling, GPT-4.1 Advantages


OpenAI ▷ #prompt-engineering (9 messages🔥):

GPT o3 Model Refusal, AI Understanding, GPT-4o Reasoning


OpenAI ▷ #api-discussions (9 messages🔥):

GPT o3 testing, Palisade Research, AI Disobedience, Я∆³³'s interaction with AI, GPT-4o resonating


Yannick Kilcher ▷ #general (227 messages🔥🔥):

ICOM's personal experience, RL for LLMs, GodOS project


Yannick Kilcher ▷ #paper-discussion (22 messages🔥):

Unsupervised Embedding Translation, Universal Latent Representation, Geometric/Semantic Properties, Model Backbone Similarity, Fragility of Neural Networks


Yannick Kilcher ▷ #ml-news (10 messages🔥):

Windows 7 Nostalgia, AI Hallucinations, China's AI Orbital Supercomputer, Vanilla i3, Huawei AI CloudMatrix


OpenRouter (Alex Atallah) ▷ #announcements (1 message):

GPT-4 32k Deprecation, GPT-4o


OpenRouter (Alex Atallah) ▷ #app-showcase (2 messages):

ComfyUI custom node for OpenRouter, gac command line utility


OpenRouter (Alex Atallah) ▷ #general (188 messages🔥🔥):

Subscription Implementation, Gemini 2.5 Pro, LLM Leaderboard, Coinbase Payments, Mistral Document OCR


Nous Research AI ▷ #general (69 messages🔥🔥):

AI voice cloning, AI event for oss ai devs, Hermes 3 dataset release, Demis Hassabis musing of the evolution of RL, Mechanistic interpretability for language models


Nous Research AI ▷ #research-papers (20 messages🔥):

Atropos Implementation, MCQA Environment, AutoThink Blogpost


Nous Research AI ▷ #interesting-links (6 messages):

Rick Rubin, Anthropic, Vibe Coding, QAT, Quest


Nous Research AI ▷ #research-papers (20 messages🔥):

Atropos integration, Axolotl integration, RL for Axolotl, Gemini 2.5 Pro


GPU MODE ▷ #triton (1 message):

arshadm: Ignore, should have read the README on the branch rather than on main on github 😦


GPU MODE ▷ #torch (1 message):

CUBLAS_WORKSPACE_CONFIG, deterministic algorithms, triton kernel


GPU MODE ▷ #cool-links (4 messages):

Low-Latency Megakernel, Llama-1B performance


GPU MODE ▷ #jobs (4 messages):

Kog.ai, GPU optimization, Inference engine, AMD MI300X, vLLM, SGLang, TensorRT-LLM


GPU MODE ▷ #beginner (9 messages🔥):

Ninja Build Tool, Ubuntu 24.04, venv


GPU MODE ▷ #torchao (6 messages):

CUDA tensors, axolotl vs torchtune


GPU MODE ▷ #liger-kernel (3 messages):

Fused Neighborhood Attention, Cutlass Implementation, Triton Implementation


GPU MODE ▷ #self-promotion (3 messages):

Async TP, AutoThink, CUDA education, NVIDIA event


GPU MODE ▷ #reasoning-gym (1 message):

Reasoning without External Rewards


GPU MODE ▷ #general (9 messages🔥):

Unexpected Error Reporting, Github API Limitations, Non-deterministic Bugs


GPU MODE ▷ #submissions (42 messages🔥):

MI300, H100, Leaderboard updates, Personal best, amd-fp8-mm


GPU MODE ▷ #status (1 message):

`/leaderboard` command fix, Bug Reporting


GPU MODE ▷ #factorio-learning-env (3 messages):

Factorio 2.0, Vision Integration


GPU MODE ▷ #amd-competition (19 messages🔥):

AMD competition details, Kernel leaderboard for backpropagation, RoPE computation correction, HIP support


GPU MODE ▷ #cutlass (2 messages):

cute-dsl, Tensor memory, sgemm_05


HuggingFace ▷ #general (73 messages🔥🔥):

Real-time video generation with LCM, HuggingChat Android app, Fine-tuning video models, AI Agent observability library, Smol LM2 Engineers


HuggingFace ▷ #today-im-learning (1 message):



HuggingFace ▷ #i-made-this (9 messages🔥):

SweEval, NIST Tooling, AutoThink, Langchain


HuggingFace ▷ #reading-group (1 message):

Cross-posting, Staying on topic


HuggingFace ▷ #computer-vision (2 messages):

trocr tuning, length collapse issue, computer vision reading group


HuggingFace ▷ #NLP (2 messages):

Multi-Agent System, Medical Project, Langgraph, drug discovery research agent, treatment protocol agent


HuggingFace ▷ #gradio-announcements (1 message):

Agents & MCP Hackathon, Model Context Protocol, AI Agents, SambaNova Systems


HuggingFace ▷ #agents-course (9 messages🔥):

Llama 3.2 Errors, GAIA Submission Issues, Agent Security Measures


Notebook LM ▷ #use-cases (21 messages🔥):

Notebook LM usage tips, Summarizing technical chapters, Podcast generation in Spanish, Legal use case for document analysis, Voice variation in Notebook LM


Notebook LM ▷ #general (74 messages🔥🔥):

Notebook Organization, Embedding NotebookLM, Interactive Mode issues on iOS, Podcast Generator issues, Gemini Deep Research integration


Eleuther ▷ #general (44 messages🔥):

compute and comms overlap, TP+SP using Async TP, matrix multiplications, RL vs diffusion


Eleuther ▷ #research (27 messages🔥):

Quantization for ACL papers, Static n-gram heads, Buggy RWKV7, Spurious Rewards


Eleuther ▷ #lm-thunderdome (1 message):

lm eval harness, gguf models, python-llama-cpp, local model evaluation


Latent Space ▷ #ai-general-chat (60 messages🔥🔥):

Claude 4 GitHub MCP exploit, Sesame Speech-to-Speech models, Launching AI products in Europe, Gemini Ultra access in Europe, Qwen RL results


Latent Space ▷ #ai-announcements (4 messages):

AI Engineer conference, Volunteer Opportunity, Speaker announcements


LlamaIndex ▷ #blog (3 messages):

LlamaCloud Updates, LlamaParse & AnthropicAI Sonnet 4.0, Multimodal Embedder for LlamaIndex, Enhanced Structured Output for OpenAI


LlamaIndex ▷ #general (23 messages🔥):

Form Filling Agents, Workflow-based agents, Multi-modal Agents, ReactJS with LlamaIndex, HITL Workflow with React


MCP (Glama) ▷ #general (19 messages🔥):

Asynchronous tools, Isolated Mistral instance, Architect Tool, MCP Server, MCP Clients


MCP (Glama) ▷ #showcase (4 messages):

MCP Inspector, Ship Lean MCP, UI issues


Nomic.ai (GPT4All) ▷ #general (22 messages🔥):

Synthesizing sentence meaning into a single token, Faiss index creation, Local LLama Interface, GPT4All version 4, Nomic burned $17M Series A funds?


Modular (Mojo 🔥) ▷ #general (2 messages):

Metaprogramming in Mojo, Go generics in Mojo


Modular (Mojo 🔥) ▷ #mojo (5 messages):

Streaming parsing, Structured parsing, Magic to Pixi migration


Cohere ▷ #💬-general (1 message):

serotweak: hi everyone


Cohere ▷ #🔌-api-discussions (2 messages):

API Usage, Error 400, Token Length


Cohere ▷ #🤝-introductions (2 messages):

Student from East London, CS, graphics, and games development, Hardware and software aspects of technology, Building PCs as a side hustle, Learn how to code and build software


Torchtune ▷ #general (5 messages):

LORA finetuning, Generation Script, Merging Weights, Adapter usage


DSPy ▷ #show-and-tell (2 messages):

Self Improving Vibe Coding Template, Using DSPy


DSPy ▷ #general (2 messages):

ReAct vs Custom Code, Trajectory Nudge in LLM


tinygrad (George Hotz) ▷ #general (4 messages):

tinygrad.org hyperlink, Optypes hyperlink


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (2 messages):

Future Cohorts, Course Scheduling