Frozen AI News archive

SmolLM3: the SOTA 3B reasoning open source LLM

**HuggingFace** released **SmolLM3-3B**, a fully open-source small reasoning model with open pretraining code and data, marking a high point in open source models until **Olmo 3** arrives. **Grok 4** was launched with mixed reactions, while concerns about **Claude 4** nerfs and an imminent **Claude 4.1** surfaced. **Gemini Nano** is now shipping in **Chrome 137+**, enabling local LLM access for **3.7 billion** users. **Tencent** introduced **Hunyuan-A13B**, an 80B parameter model with a 256K context window running on a single **H200** GPU. The **Gemini API** added a batch mode with 50% discounts on **2.5 models**. **MatFormer Lab** launched tools for custom-sized **Gemma 3n** models. Open source OCR models like **Nanonets-OCR-s** and **ChatDOC/OCRFlux-3B** derived from **Qwen2.5-VL-3B** were highlighted, with licensing discussions involving **Alibaba**.


Open source AI is deeply needed.

AI News for 7/7/2025-7/8/2025. We checked 9 subreddits, 449 Twitters and 29 Discords (223 channels, and 5116 messages) for you. Estimated reading time saved (at 200wpm): 491 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

HuggingFace's small model work is underrated, but it had its day in the sun today with SmolLM3, a very capable small reasoning model with its own "upper left triangle" graphic that works if you don't squint too hard at the y axis:

A more normalized view of evals is just below, giving Qwen 3 more credit:

But where Qwen is just open weights, SmolLM is truly open source, pretraining code, data and all:

The data section is particularly impressive given how HuggingFace (with collaborators) has had to slowly build this up over the last 2 years:

making this possible:

This is likely the high water mark in fully open source models until Olmo 3 comes out next.


AI Twitter Recap

AI Model Releases, Performance, and Benchmarking

AI Agent & Developer Tooling

Infrastructure, Efficiency, and Hardware

New AI Techniques & Research

Industry, Companies, and Broader Implications

Humor & Memes


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. Recent Small-Scale and Reasoning-Oriented LLM Model Releases

2. AI Tools and Local Model Deployment Experiences (LM Studio, Mac Studio, Gemma)

3. Model Integration, Security Benchmarks, and AI Hardware Announcements

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Claude Code and AI Workflow Adoption Experiences

2. Wan2.1 Model Usages and Workflow Innovations

3. Humor and Memes on AI Model Interactions


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.5 Pro Exp

Theme 1. The AI Model Horse Race Intensifies

Theme 2. Dev Tools See Both Growing Pains and Gains

Theme 3. The Relentless Pursuit of Performance

Theme 4. The Model Context Protocol (MCP) Ecosystem Matures

Theme 5. Pushing the Theoretical and Ethical Boundaries of AI


Discord: High level Discord summaries

OpenAI Discord


Unsloth AI (Daniel Han) Discord


Cursor Community Discord


LMArena Discord


OpenRouter (Alex Atallah) Discord


LM Studio Discord


Eleuther Discord


Yannick Kilcher Discord


Latent Space Discord


GPU MODE Discord


HuggingFace Discord


Modular (Mojo 🔥) Discord


Notebook LM Discord


MCP (Glama) Discord


tinygrad (George Hotz) Discord


aider (Paul Gauthier) Discord


Cohere Discord


DSPy Discord


Nous Research AI Discord


LlamaIndex Discord


Torchtune Discord


Manus.im Discord


Nomic.ai (GPT4All) Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Codeium (Windsurf) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.




Discord: Detailed by-Channel summaries and links

OpenAI ▷ #ai-discussions (683 messages🔥🔥🔥):

GPTs Agents, OpenAI's sidebars, Model Merging, Open Empathic, AI Image Generation


OpenAI ▷ #gpt-4-discussions (8 messages🔥):

Patent data for pre-training, GPT script generation consistency, Image upload channels, Research depth comparison between GPT models, GPT for fixing product spreadsheets


OpenAI ▷ #prompt-engineering (11 messages🔥):

AI Affirmation in Prompt Engineering, LLM Prompt Optimisation Loops, Symbolic Reasoning in LLMs, Character Creation Agent Design


OpenAI ▷ #api-discussions (11 messages🔥):

Prompt Engineering, LLM Memory, Character Creation, Prompt Optimization


Unsloth AI (Daniel Han) ▷ #general (580 messages🔥🔥🔥):

Local LLM for game characters, Light novel translation, Training Llama on Unsloth, GPTs agents training, 4GB VRAM LLMs


Unsloth AI (Daniel Han) ▷ #off-topic (3 messages):

MI50 Comparison, B580 Performance


Unsloth AI (Daniel Han) ▷ #help (101 messages🔥🔥):

vLLM Serving Gemma 3 GGUF, SFT Input Token Truncation, GROTrainer AttributeError, Qwen2.5-7B Fine-tuning on A100, Gemma 3N on CPU


Unsloth AI (Daniel Han) ▷ #showcase (1 message):

SmolLM3, Unsloth finetuning


Unsloth AI (Daniel Han) ▷ #research (2 messages):

OpenCodeReasoning-Nemotron-1.1-32B, coding dataset, Nvidia models


Unsloth AI (Daniel Han) ▷ #unsloth-bot (63 messages🔥🔥):

group_by_length, ZeroDivisionError, conda environment setup, early stopping, A100 GPU training efficiency


Cursor Community ▷ #general (620 messages🔥🔥🔥):

Cursor Pro Plan Limits, Claude Code vs Cursor, Cursor indexing issues, Max Mode Rate Limits


Cursor Community ▷ #background-agents (28 messages🔥):

CLI installation, .devcontainer/Dockerfile reuse, Background agent install script failure, Secret environment variables, GitHub access permissions


LMArena ▷ #general (586 messages🔥🔥🔥):

Grok 4, Polymarket manipulation, Google Gemini app vs AI Studio, LLM model collapse, O3


LMArena ▷ #announcements (1 message):

July Contest, Image Edit Leaderboard, Out of Place Objects in Space Theme, June Contest Winner


OpenRouter (Alex Atallah) ▷ #general (263 messages🔥🔥):

Automod bypass role, Grok 4 arrival, Vertex-AI integration, Cerebras context length increase, Free money methods


OpenRouter (Alex Atallah) ▷ #new-models (2 messages):



OpenRouter (Alex Atallah) ▷ #discussion (1 message):

soflowsen: UwU


LM Studio ▷ #announcements (1 message):

LM Studio Licensing, Local AI Access, Privacy Focus


LM Studio ▷ #general (117 messages🔥🔥):

Token Speed Decrease with Larger Context, Coding Help Model Recommendations, LM Studio Docker Image, Adding Web Search API to LM Studio, Therapy Models


LM Studio ▷ #hardware-discussion (24 messages🔥):

GPU upgrade, RTX 3090 for AI, RTX 5060 Ti release, Multiagent framework testing


Eleuther ▷ #general (18 messages🔥):

GLoVE Symmetry, Self-hood in LLMs, Stack Overflow AI Training Survey, Emergent Misalignment in LLMs


Eleuther ▷ #research (25 messages🔥):

LLaDa vs MaskGIT, Predictive Coding, Nvidia's OpenCodeReasoning, ByteDance Image VQ


Eleuther ▷ #interpretability-general (5 messages):

Sparse Autoencoder Expansion Factor, Video Prediction Models


Eleuther ▷ #lm-thunderdome (3 messages):

lm_eval command, Sample processing with seed and limit


Eleuther ▷ #gpt-neox-dev (21 messages🔥):

TransformerEngine with FA3, NVIDIA's TransformerEngine as Apex Replacement, Dataset Tooling with TokenSmith


Yannick Kilcher ▷ #general (30 messages🔥):

Orthogonal Context Vectors, Monarch Attention, Legendre Memory Unit (LMU), Test Time Training, LLM Bug Pattern


Yannick Kilcher ▷ #paper-discussion (29 messages🔥):

Quaternion products in LLMs, HRM discussion TLDR, Astro's Paper


Yannick Kilcher ▷ #ml-news (5 messages):

ChatGPT Fake Feature, Mistral Large 3


Latent Space ▷ #ai-general-chat (63 messages🔥🔥):

Cursor Pricing Model, xmcp TypeScript Framework, Grok 4 Livestream, AI Mandate of Heaven Tier List, Veo 3 Image-to-Video


GPU MODE ▷ #general (1 message):

Job search, Cool use of time


GPU MODE ▷ #triton (1 message):

NVFP4 support on RTX 5090, MXFP4 functionality, Debugging Crashes with NVFP4


GPU MODE ▷ #cuda (8 messages🔥):

VSCode debugging, TMA descriptor, Volatile variables, Cutlass repo


GPU MODE ▷ #beginner (14 messages🔥):

CUDA kernel performance, CUDA book recommendations, CS fundamentals for coding, PMPP Book


GPU MODE ▷ #rocm (12 messages🔥):

Make Flags for AMD GPUs, LLVM optimization flags, Loop Unrolling, ISA for Instinct GPUs


GPU MODE ▷ #lecture-qa (1 message):

LMCache, GPU MODE Discussions


GPU MODE ▷ #self-promotion (1 message):

Deep Infra, B200 instances


GPU MODE ▷ #🍿 (4 messages):

LLM Kernel Generation, KernelBot Data


GPU MODE ▷ #submissions (3 messages):

H100 Results, B200 Results, trimul, grayscale_py_b200-dev


GPU MODE ▷ #status (3 messages):

Triton Leaderboard Templates, grayscale py b200


GPU MODE ▷ #factorio-learning-env (6 messages):

Multi-agent FLE, Local Factorio Models, Automatic Design of Factorio Blueprints


GPU MODE ▷ #amd-competition (1 message):

GPUMODE dataset, kernelbot-data


GPU MODE ▷ #cutlass (1 message):

soniczun: Hi! When I debug in vscode, some variables show , how to solve it?🙋‍♂️


HuggingFace ▷ #general (22 messages🔥):

A100/H100 Rental, Runpod Experiences, Lambda Labs experiences, Finding study partners, YOLOv11 deployment


HuggingFace ▷ #today-im-learning (2 messages):

GPT-2 Kernel, GPT-NeoX Port to C


HuggingFace ▷ #i-made-this (5 messages):

Arena-RLHF Open Source Release, MCP YouTube Analysis Kit, PsychKG: Psychology Knowledge Graph


HuggingFace ▷ #computer-vision (1 message):

SmolVLM2, AI2D, Model Performance Variations


HuggingFace ▷ #NLP (1 message):

GLoVE Model Symmetry


HuggingFace ▷ #agents-course (16 messages🔥):

Axon chatbot, Meta Llama 3.2 pending access, Hacking expert offers services, AI Agents Certification, CS50 Harvard & Stanford courses


Modular (Mojo 🔥) ▷ #general (17 messages🔥):

Nabla Github Repo, Modular Release Cadence, Mojo Roadmap Update, Mojo on Windows, MAX on Intel Ultra 7 GPUs/NPUs


Modular (Mojo 🔥) ▷ #mojo (14 messages🔥):

GPU programming model, 3D Block Fitting, CUDA


Notebook LM ▷ #use-cases (12 messages🔥):

Space Repetition, Podcast for B1, Conversation function, Youtube for Learning, NotebookLM Tripping


Notebook LM ▷ #general (15 messages🔥):

Customize Audio Button Gone, Transforming YouTube into Learning System, Missing Customization Options, Deleting Multiple Sources, Official API Release


MCP (Glama) ▷ #general (17 messages🔥):

MCP Server Utility, Paid MCP Servers, Tracking MCP Usage, Influencing ChatGPT with MCP, API Routing Layer for MCP


MCP (Glama) ▷ #showcase (9 messages🔥):

AI Agents with MCP Early Release, Framework Recommendations for End-to-End Agents, Tree Sitter MCP Rewrite in TypeScript, Typed MCP SDK Types & Zod Schema


tinygrad (George Hotz) ▷ #general (24 messages🔥):

Tinygrad's Edge, Halide vs MLIR vs TVM, Exo-lang alternative


aider (Paul Gauthier) ▷ #general (22 messages🔥):

AI Model Benchmarking Realities, Affordable and Fast Models for Aider, Aider Dataset for Training, Claude Code's Hooks


aider (Paul Gauthier) ▷ #questions-and-tips (2 messages):

git subrepos, aider limitations, Hugo websites, git submodules, vendor sub repository


Cohere ▷ #🧵-general-thread (7 messages):

AI safety program, ML Understanding, Cohere Labs server, Open Science Initiative Application


Cohere ▷ #🔌-api-discussions (8 messages🔥):

Embed v4, Image tokens, Negative feedback


Cohere ▷ #👋-introduce-yourself (4 messages):

AI Consultant Introduction, ML Learning Path Guidance


DSPy ▷ #general (17 messages🔥):

DSPy 3.0, Data and AI summit, Fast Inverse Square Root origin, SIMBA


DSPy ▷ #examples (1 message):

hammer_mt: I think just poking around signature.py


Nous Research AI ▷ #general (5 messages):

Node Shifting, Result Interpretation, Reverse Commonality


Nous Research AI ▷ #ask-about-llms (5 messages):

Temperature and Token Count, r1-0528 weak influence


Nous Research AI ▷ #research-papers (3 messages):

Arxiv Papers, Image analysis, PDFs


Nous Research AI ▷ #research-papers (3 messages):

Image Analysis, Arxiv papers


LlamaIndex ▷ #announcements (1 message):

MCP Office Hours


LlamaIndex ▷ #blog (2 messages):

LlamaCloud MCP Servers, Agent Workflows as MCP, MCP Tools


LlamaIndex ▷ #general (10 messages🔥):

LlamaParse Text Field Removal, Django Prompt Management, Haystack Prompt Implementation, Langfuse API for Prompt Metadata


Torchtune ▷ #papers (5 messages):

MoE training, Linear scaling drawbacks


Manus.im Discord ▷ #general (5 messages):

Discord Bot, Manus Access


Nomic.ai (GPT4All) ▷ #general (2 messages):

English Language, tenor.com