Frozen AI News archive

Cursor @ $9b, OpenAI Buys Windsurf @ $3b

**OpenAI** is reportedly close to closing a deal with Windsurf, coinciding with **Cursor's** $900M funding round at a $9B valuation. **Nvidia** launched the **Llama-Nemotron series** featuring models from 8B to 253B parameters, praised for reasoning and inference efficiency. **Alibaba** released the **Qwen3 family** with MoE and dense models up to 235B parameters, ranking highly in coding and math benchmarks. **DeepSeek** introduced **Prover-V2**, an open-source AI for math reasoning with an 88.9% pass rate on MiniF2F-test. **Microsoft** released reasoning-focused **Phi-4 models**, outperforming OpenAI's **o1-mini**. **Baidu** debuted turbo versions of **ERNIE 4.5 and X1** for faster, cheaper inference. **Suno v4.5** added advanced AI music generation features, while **Runway Gen-4 References** enable placing characters into scenes with high consistency. **KerasRS**, a new recommender system library optimized for TPUs, was released by **François Chollet**.


VSCode forks are all you need.

AI News for 5/2/2025-5/5/2025. We checked 9 subreddits, 449 Twitters and 29 Discords (214 channels, and 10768 messages) for you. Estimated reading time saved (at 200wpm): 1105 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

The Windsurf-OpenAI talks have been under way for a few weeks, but after an appearance on the o3 livestream and some interesting sartorial banter, Bloomberg is now reporting that OpenAI has agreed to the deal, though it has not yet closed. This comes just as Cursor closes its $900m round at a $9b valuation, and as OpenAI announces that the nonprofit is staying in control of the for-profit.

A lot of notable takes like this one, and we'll have to wait a while to learn the full blow by blow, but the first "AI Wrapper" unicorn exit is certainly newsworthy.


AI Twitter Recap

Model Releases, Updates, and Features

Agent Based Frameworks and Workflows

Benchmarks, Evaluations and Interpretability

Robotics and Embodied AI

AI and Code

ASR Models

Discussion and Commentary

Memes/Humor


AI Reddit Recap

/r/LocalLlama Recap

1. Qwen3 235B Model Benchmarks and Performance Metrics

2. Multi-Model GPU Orchestration and Hardware for Local LLMs

3. Community Feedback on Model Features and Open Source Licensing

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. ByteDance UI-TARS-1.5, FramePack F1, and Robotics Model/Benchmark Releases

**[An open-source SOTA multi modal agent built upon a powerful vision-language model. It Surpass OPENAI operator on ALL benchmarks and achieves 42.5% on OSWORLD](https://v.redd.it/pyup2qq3gxye1)** (Score: 202, Comments: 12): ByteDance released UI-TARS-1.5-7B, a state-of-the-art open-source multimodal agent, on Hugging Face. The model is claimed to outperform OpenAI's Operator on all tracked benchmarks and achieves 42.5% on the OSWORLD test, as well as a reported 100% on several game environments. The model is available for research and commercial purposes; see the Hugging Face repository for weights and documentation. Debate in the comments centers on whether the model can play complex or popular games (e.g., Pokémon), indicating interest in real-world generalization and entertainment applications, though no technical evaluation or benchmarks for those domains are cited in the discussion.

- The original post highlights that UI-TARS-1.5 is an open-source, state-of-the-art multi-modal agent developed by ByteDance, built on a vision-language model architecture. It claims to outperform the OpenAI Operator across all reported benchmarks and achieves a notable `42.5%` on the `OSWORLD` evaluation, implying robust capabilities in tasks requiring both vision and language understanding.

2. Uncanny, Notable and Controversial ChatGPT Behavior and Human Impact

3. Societal, Economic, and Existential Anxiety from AI Acceleration


AI Discord Recap

A summary of Summaries of Summaries by o1-preview-2024-09-12

Theme 1. AI Model Releases and Rivalries Heat Up

Theme 2. AI Tools and Code Generation Evolve

Theme 3. AI Ethics and Censorship Debates Intensify

Theme 4. AI in Medicine and Specialized Fields

Theme 5. AI Research Breakthroughs and Learning

Discord: Detailed by-Channel summaries and links

Unsloth AI (Daniel Han) ▷ #general (801 messages🔥🔥🔥):

BitNet Optimization, IaC Model Fine-Tuning, Qwen3 Notebook Adaptation, GGUF and Unsloth Compatibility, GRPO Memory Leak


Unsloth AI (Daniel Han) ▷ #off-topic (69 messages🔥🔥):

XDNA driver issues on Arch Linux, GLM4 and updates with Z1, Vast vs Runpod Pricing, Liquid Neural Networks for flappy bird, Gemma3 12b vs Qwen3 14b


Unsloth AI (Daniel Han) ▷ #help (728 messages🔥🔥🔥):

max_grad_norm, scheduler, synthetic examples, OpenAI's sidebars, bleu


Unsloth AI (Daniel Han) ▷ #showcase (4 messages):

U.N. Open-Source Conference and Hackathon, Optimal Value Neural Network project, GroqStreamChain release


Unsloth AI (Daniel Han) ▷ #research (31 messages🔥):

GANs for LLM Fine-tuning, New Physics of LM Paper, Text Classification Notebook Update, Qwen Omni 3B Model Support, Unsloth BERT Model Support


Perplexity AI ▷ #announcements (2 messages):

Claude Sonnet routing, Perplexity WhatsApp, Perplexity Finance, Perplexity Spaces


Perplexity AI ▷ #general (836 messages🔥🔥🔥):

you.com vs gemini, Grok vs Gemini for deepsearch, Perplexity PDF editing, Perplexity not showing reasoning, Gemini 2.5 vs Grok vs ChatGPT for deep research


Perplexity AI ▷ #sharing (2 messages):

lebron, starbase texas


Perplexity AI ▷ #pplx-api (15 messages🔥):

Sonar vs OpenAI web search, Retrieval pipeline integration, Citations Mapping and Titles via Chat Completion API, API Token Creation Issues


LMArena ▷ #general (1197 messages🔥🔥🔥):

Gemini 2.5 Pro Ultra, Grok 3.5, Qwen 3, AI's translation ability, Meta's data collection


LM Studio ▷ #general (675 messages🔥🔥🔥):

IK_llamacpp quants, LM Studio Voice to Voice Implementation, YaRN context stretching, LM Studio API usage, Qwen 3 vs Gemini 2.5


LM Studio ▷ #hardware-discussion (187 messages🔥🔥):

Qwen3 235B A22B /MOE Q3_K_L, Geometric Mean of Total and Active Parameters, Strix Halo, M1 Ultra, MoE Model Performance


OpenAI ▷ #annnouncements (1 messages):

OpenAI board, Public Benefit Corporation, Nonprofit control


OpenAI ▷ #ai-discussions (465 messages🔥🔥🔥):

ChatGPT Word Count, Agents SDK vs Langgraph, Computer Vision Agent, SuperAGI Installation, GPT finetuning


OpenAI ▷ #gpt-4-discussions (39 messages🔥):

o4-mini usage, ethical companion GPT, GPT moderation, Turn off filters


OpenAI ▷ #prompt-engineering (126 messages🔥🔥):

ChatGPT API usage, Amplitude of light wavelengths, Image generation and personalization, Roleplay chatbot prompts, Analyzing scanned books with ChatGPT


OpenAI ▷ #api-discussions (126 messages🔥🔥):

API usage with ChatGPT Free, Amplitude of light wavelengths, Image generation and personalization, Semantic shift modeling, Prompt engineering resources and techniques


OpenRouter (Alex Atallah) ▷ #announcements (1 messages):

Gemini Flash 2.5 Preview, Thinking tokens


OpenRouter (Alex Atallah) ▷ #app-showcase (6 messages):

Toy.new website builder, AI toggler alternative AI interface, Answerhq.co


OpenRouter (Alex Atallah) ▷ #general (611 messages🔥🔥🔥):

O3 gibberish, Thinking tokens not returned, O3 borked, TPUs, Mistral OCR


Cursor Community ▷ #general (425 messages🔥🔥🔥):

Openlitespeed Cursor issue, GPTs agents file uploads, Claude 3.7 Sonnet Max Cost, Windsurf AI vs Cursor, Memory bank


HuggingFace ▷ #general (170 messages🔥🔥):

Model Serving Frameworks, 3D scene generation from text, Local ML setup, Running LLMs on mobile devices, AI for editing words in a song


HuggingFace ▷ #today-im-learning (13 messages🔥):

AI learning resources, ML study advice, Debugging AI models


HuggingFace ▷ #i-made-this (16 messages🔥):

MiniSetPT Dataset, SimpleFriendlyMath, Ingest-Anything v1.0.0, Rust Transformers Crate, Logcai for VSCode


HuggingFace ▷ #computer-vision (4 messages):

Image Restoration Model, Document Extraction Workflow, Football Player Detection Model, Virtual Try-On Project


HuggingFace ▷ #NLP (6 messages):

Sentiment Analysis for Social Media Filtering, Zero-Shot Classification Challenges, CUDA OOM Errors & Optimization Techniques, FP16 vs BF16 Precision, Model Reliability & Efficiency


HuggingFace ▷ #smol-course (20 messages🔥):

SmolAgents Channels, Gemini API, Claude API, MCP Tools, Qwen3 and Gemma3 models


HuggingFace ▷ #agents-course (156 messages🔥🔥):

Web Search Packages, Youtube question, Submission Issues, Langgraph stuck in recursion, HF Pro Plan


GPU MODE ▷ #general (18 messages🔥):

GB200 NVL72, vLLM GUIs, OpenWebUI, vast.ai compute pricing, A100 vs V100


GPU MODE ▷ #cuda (17 messages🔥):

Cutlass Tutorials, Profiling Kernels on Cloud GPUs, NVIDIA SASS Latency Tables, Upgrading GPU for Unreal Engine 5


GPU MODE ▷ #torch (4 messages):

torch.compile and dynamic=True, FunctionalTensorMode and syncing tensors, Deterministic submodules in compiled modules, Multi-GPU training with YOLO


GPU MODE ▷ #jobs (1 messages):

Play.ai, Inference Engineers, Conversational Voice Interface, Groq LPU partnership


GPU MODE ▷ #beginner (24 messages🔥):

open source medical imaging projects, CUDA programming industry direction, GPU architecture/C++ interview preparation, CUDA certifications, easiest way to rent/access GPU


GPU MODE ▷ #jax (5 messages):

StableHLO to custom IR, JAX Frontend, CUDA Kernels, dlpack Usage


GPU MODE ▷ #torchao (2 messages):

Torch Quantization, LSTM model quantization, Performance Differences GPU vs CPU, TorchAO vs torch.quantization


GPU MODE ▷ #webgpu (4 messages):

WGPU multi-sampling limits, WGSL file


GPU MODE ▷ #self-promotion (4 messages):

MOSS, Minimal On-Device Semantic Search, Affordable GPU Sharing, ComputerUseAgents Reddit


GPU MODE ▷ #edge (3 messages):

Real Time Translation Latency


GPU MODE ▷ #general (1 messages):

ace1984: Hey eveerybody!


GPU MODE ▷ #submissions (173 messages🔥🔥):

MI300 AMD-FP8-MM Leaderboard Submissions, Histogram Leaderboard Submissions, AMD-Identity Leaderboard Submission, Matmul Leaderboard Submissions, AMD-Mixture-of-Experts Leaderboard


GPU MODE ▷ #status (1 messages):

MoE baseline slowness, Pre-computing reference results


GPU MODE ▷ #hardware (3 messages):

DGX Spark, N1X ARM SoC, Blackwell Ultra Compute Capability, RTX Pro Blackwell


GPU MODE ▷ #amd-competition (55 messages🔥🔥):

composable-kernel compilation, MI300 GitHub job failures, Triton kernels for MoE, AI coding assistants, FP16 instability in MoEGate


GPU MODE ▷ #mojo (49 messages🔥):

Mojo Kernels, GPU Module, Colab Environments for Mojo, Mojo on Arch Linux, MAX Serve Model Serving Framework


aider (Paul Gauthier) ▷ #general (240 messages🔥🔥):

aider New-to-Aider documentation, Gary leaving the chat, Gemini's verbosity, Code compression feature request, Claude Code with unlimited usage


aider (Paul Gauthier) ▷ #questions-and-tips (109 messages🔥🔥):

Gemini 2.5 pro, GPTs Agents, OpenAI's sidebars, aider llm history, Copilot Support


Nous Research AI ▷ #announcements (1 messages):

Nous RL Environments Hackathon, Atropos RL framework, Hackathon prize pool, Hackathon partners, Hackathon channel


Nous Research AI ▷ #general (234 messages🔥🔥):

Model understanding of intent, Concept of time in AI, Ilya Sutskever's views on LLMs, Alternatives to Unsloth, Quantizing Qwen3-32b


Nous Research AI ▷ #ask-about-llms (20 messages🔥):

Worldsim vs. Nous Portal, Reinforcement Learning Resources, Scientific research literature and synthesis of knowledge


Nous Research AI ▷ #research-papers (4 messages):

Canon layers, 2D/3D convolution, DiT architecture, quantization quality, speech modality for duplex models


Nous Research AI ▷ #interesting-links (5 messages):

AnySphere, Fundraising


Nous Research AI ▷ #research-papers (4 messages):

Canon Layers, Convolutional Architectures, DiT Architecture for image generation, Duplex Models in Speech Modality


Manus.im Discord ▷ #showcase (2 messages):



Manus.im Discord ▷ #general (253 messages🔥🔥):

AI detection tools, digital watermarks, Manus invitation codes, Free credits, LATAM


Yannic Kilcher ▷ #general (201 messages🔥🔥):

LLMs in medicine, implicit vs explicit model learning, American sign language model training, Qwen 3 and QwQ Model, Grok 3.5 is fake


Yannic Kilcher ▷ #paper-discussion (12 messages🔥):

DEoT, AI Text Normalization, ChatGPT o3


Yannic Kilcher ▷ #ml-news (27 messages🔥):

Granite 4.0 Tiny Preview, Mamba-2/Transformer, Adblocker models, California's AI regulation SB-1047, Apple-Anthropic AI coding platform


Notebook LM ▷ #announcements (1 messages):

User Experience Research, Feedback on NotebookLM, Google products feedback, Opportunities


Notebook LM ▷ #use-cases (28 messages🔥):

Podcast length limits, Mind Map feature requests, Audio Overviews, Using NotebookLM for Research, Prompting techniques for NotebookLM


Notebook LM ▷ #general (138 messages🔥🔥):

Gemini 2.5 Flash, Sycophantic AI Behavior, Gemini Upgrade, NotebookLM Audio Generation


Latent Space ▷ #ai-general-chat (98 messages🔥🔥):

MCP Auth Spec, Xcode AI Anthropic, AI Salesperson, Deep Research Reports, Decagon ARR


Latent Space ▷ #ai-in-action-club (49 messages🔥):

A2A vs MCP, Discord Stream Issues, Google's Protocol Background


Modular (Mojo 🔥) ▷ #general (32 messages🔥):

Installing Mojo, Mojo and MAX Bundling, Mojo file extension, UV, Pip, and Mojo Projects, Traits and Fields in Mojo


Modular (Mojo 🔥) ▷ #mojo (115 messages🔥🔥):

constexpr in Mojo, consteval, Function which doesn't exist at runtime, Globals


MCP (Glama) ▷ #general (131 messages🔥🔥):

Claude resources as attachments, Claude limitations on pinning and subscribing, Open Source models vs OpenAI models, Testing Streamable HTTP with MCP Inspector's CLI Mode, PM2 for managing MCP servers


MCP (Glama) ▷ #showcase (16 messages🔥):

MCP Language Server, Biothings MCP, FastMCP Tool Timeouts, Langchain App via SSE, MCP Task Scheduler


Eleuther ▷ #general (37 messages🔥):

LLM Hallucinations, Efficient Jailbreaks, ML Subreddit, Independent Research in AI/ML, Deepseek-R1 GPUs


Eleuther ▷ #research (82 messages🔥🔥):

Weight Decay and Learning Rate Coupling, Catastrophic Forgetting, Softpick Attention, New Physics of LM Paper, LLMs and Knowing


Eleuther ▷ #interpretability-general (21 messages🔥):

RoPE in Transformers, Early Layers in Transformers, Mechanistic Interpretability of Abstract Reasoning


Eleuther ▷ #lm-thunderdome (3 messages):

lm_eval issues, DeepSeek-R1-Distill-Qwen-32B, vllm vs hf inference, gsm8k, mmlu


DSPy ▷ #general (71 messages🔥🔥):

AI Developer Survey, Code Conversion Tool, HuggingFace LM Support, Web3 Game Beta Testers, DSPy.GRPO Release


DSPy ▷ #examples (1 messages):

dbreunig: Nice end-to-end example: https://duarteocarmo.com/blog/evals-are-all-you-need.html


Torchtune ▷ #dev (47 messages🔥):

Agent tests PR, Mask computation bug, Tokenizer support, LLMs


Torchtune ▷ #papers (1 messages):

Physics of LLMs


Torchtune ▷ #rl (3 messages):

CI improvements, New features


LlamaIndex ▷ #blog (5 messages):

O3 vs Claude 3.7 Evaluation, AI SDRs with LlamaParse, RAG production lessons, LlamaIndex Pull Request Agent, Big MCP Hackathon


LlamaIndex ▷ #general (29 messages🔥):

RAG accuracy, NLP API, LlamaIndex Gemini bug, Legacy mainframe code to Cobol, Lovabe Cursor Expert


Nomic.ai (GPT4All) ▷ #general (28 messages🔥):

VRAM vs RAM, PDF upload issues, LaTeX support, Qwen 3 integration, LocalDocs feature


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (15 messages🔥):

Lab Deadlines, Lean-lang.org Issues, Wayback Machine, Network Issues


LLM Agents (Berkeley MOOC) ▷ #mooc-lecture-discussion (4 messages):

Lecture 6, Multimodal Autonomous AI Agents, AgentX, MCP protocol, LM finetuning


tinygrad (George Hotz) ▷ #general (4 messages):

Meeting #69, get_rewrites_for_renderer, MLPerf submissions, Scheduler Fusion, Driver


tinygrad (George Hotz) ▷ #learn-tinygrad (10 messages🔥):

contiguous method of Tensor, devectorization, Gradient Accumulation with JIT


Cohere ▷ #💬-general (7 messages):

Internal Server Error, Coral and Chat Redirects


Cohere ▷ #🔌-api-discussions (2 messages):

Embed V4, command-r latency


Cohere ▷ #🤝-introductions (3 messages):

AI agent tools, LLM workflows, Full stack AI development, GPT-4o and Claude 3, Collaboration opportunities