Frozen AI News archive

not much happened today

**Apple** released three real-time vision-language models (**FastVLM**, **MobileCLIP2**) on Hugging Face with significant speed and size improvements, supporting WebGPU and Core ML. Their MLX framework now supports the **MXFP4** format, which competes with **NVFP4** for FP4 quantization. **xAI** launched **grok-code-fast-1**, outperforming Claude for code edits, while **OpenAI** integrated **GPT-5** into Xcode 26 and released a new **Responses API** on **Groq** hardware. CLI-first agent workflows advanced with tools like **SemTools**, an **MLX** local runner for Apple Silicon, and **llama.vim**, which recommends **Qwen 3 Coder 30B A3B**. Retrieval research highlights the limitations of single-vector embeddings and promotes ColBERT-style late interaction.
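
To make the single-vector versus late-interaction contrast concrete, here is a minimal sketch under stated assumptions. The function names (`single_vector_score`, `late_interaction_score`) and the 2-dimensional toy token embeddings are hypothetical and not taken from any cited paper or library. Mean pooling stands in for whatever pooling a real single-vector model uses, and the late-interaction score is the MaxSim sum popularized by ColBERT.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    # Scale a vector to unit length so dot products are cosine similarities.
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def mean_pool(vectors):
    # Collapse a list of token vectors into one vector (assumed pooling choice).
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def single_vector_score(query_tokens, doc_tokens):
    # One embedding per text: cosine similarity of the pooled vectors.
    q = normalize(mean_pool(query_tokens))
    d = normalize(mean_pool(doc_tokens))
    return dot(q, d)

def late_interaction_score(query_tokens, doc_tokens):
    # MaxSim: each query token keeps its best-matching document token,
    # and the per-token maxima are summed.
    doc_unit = [normalize(d) for d in doc_tokens]
    return sum(
        max(dot(normalize(q), d) for d in doc_unit)
        for q in query_tokens
    )

# Hypothetical toy data: the query has two distinct "concept" tokens.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_exact = [[1.0, 0.0], [0.0, 1.0]]   # contains both concepts as separate tokens
doc_blurry = [[1.0, 1.0], [1.0, 1.0]]  # only contains a blend of the two

for name, doc in [("exact", doc_exact), ("blurry", doc_blurry)]:
    print(name, single_vector_score(query, doc), late_interaction_score(query, doc))
```

In this toy setup the pooled vectors of both documents point in the same direction, so the single-vector score cannot separate them, while the MaxSim sum is higher for the document whose tokens match each query token individually. Real systems differ in encoder, pooling, and indexing details.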

a quiet day

AI News for 8/28/2025-8/29/2025. We checked 12 subreddits, 544 Twitters and 22 Discords (185 channels, and 7366 messages) for you. Estimated reading time saved (at 200wpm): 574 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

This is not yet publicly announced, but if you are interested in Enterprise AI or Coding Agents, AI News readers can apply to attend the first AI Engineer Code Summit, returning to NYC Nov 20-22 and focused on how coding agents and LLMs are changing (or failing to change) software development at all scales. Speaker and sponsor applications also open.


AI Twitter Recap

Apple’s on-device VLM push (FastVLM, MobileCLIP2) and MLX upgrades

Agentic coding stacks: Grok Code Fast, Codex/Xcode 26, and CLI-native workflows

Retrieval, indexing, and memory: beyond single-vector embeddings

Agent and reasoning evals: multi-hour horizons, tool-use, and environments

Notable model releases and papers (audio, search, vision, reasoning)

Policy, platforms, and ecosystem notes

Top tweets (by engagement)


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. Apple FastVLM/MobileCLIP2 WebGPU Demo + Step-Audio 2 Mini Release

2. Qwen3-Coder Local Coding Tutorial + Qwen September Teaser

3. Alibaba Nvidia-Alternative AI Chip + Meta Cancels Behemoth Public Release

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. AI-Generated Trailers and Lip-Sync Workflows

2. Consumer Robotics and Autonomous Vehicle Announcements

3. Realtime Assistant Demos and AI Fitness Tracking Apps


AI Discord Recap

A summary of Summaries of Summaries by gpt-5

1. New Model & Capability Releases

2. Open-Source Releases & Local Tooling

3. Video Generation: New Tools & Constraints

4. OpenRouter Ecosystem: Performance & Costs

5. GPU & Systems Engineering for LLMs


Discord: High level Discord summaries

Perplexity AI Discord


Unsloth AI (Daniel Han) Discord


OpenRouter Discord


LMArena Discord


LM Studio Discord


HuggingFace Discord


Cursor Community Discord


OpenAI Discord


Eleuther Discord


GPU MODE Discord


Nous Research AI Discord


Latent Space Discord


Yannick Kilcher Discord


DSPy Discord


Moonshot AI (Kimi K-2) Discord


tinygrad (George Hotz) Discord


Modular (Mojo 🔥) Discord


Manus.im Discord


aider (Paul Gauthier) Discord


LLM Agents (Berkeley MOOC) Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Windsurf Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.



Discord: Detailed by-Channel summaries and links

Perplexity AI ▷ #general (1232 messages🔥🔥🔥):

bird emotes, Brave Browser's ad blocker, OpenAI Deep Research Cost, Sonar Deep Research, Comet browser invite codes


Perplexity AI ▷ #sharing (4 messages):

Perplexity AI, Free Perplexity AI


Perplexity AI ▷ #pplx-api (9 messages🔥):

Perplexity Pro, Free Pro Access, Search-Only API Testing


Unsloth AI (Daniel Han) ▷ #general (1164 messages🔥🔥🔥):

6090 VRAM expectations, GLM & Deepconf, Local 'Claude Code CLI' Emulator, MoLA: Mixture of LoRAs, Gemma's VRAM usage


Unsloth AI (Daniel Han) ▷ #off-topic (275 messages🔥🔥):

IQ tests, qwen3-8b, Qwen 3 instruct, Mistral struggles, llama3-8b


Unsloth AI (Daniel Han) ▷ #help (33 messages🔥):

Qwen 2.5 reasoning capabilities, Aya Vision 8B fine-tuning with Unsloth, Training OSS GPT on self-created dataset, Prompt-completion dataset for SFT, Image token count mismatch during inference


Unsloth AI (Daniel Han) ▷ #research (48 messages🔥):

Tokenization Woes, Crawling Discord Channels, Data Requirements for Fine-tuning Sesame_CSM, AI Brainrot Detox App Research, Latent Space Transformation in Language Models


OpenRouter ▷ #announcements (1 message):

Sonnet 4, 1 Million Context length


OpenRouter ▷ #app-showcase (6 messages):

Dashboard Code Release, Screenshot Attention, AI Roleplay Site


OpenRouter ▷ #general (938 messages🔥🔥🔥):

stripe refund, Deepseek v3 performance issues, Inference provider onboarding, GPT OSS 120B, GLM 4.5 Air


OpenRouter ▷ #discussion (27 messages🔥):

Defining a turn, OpenAI responses API, Multi-turn chats, Gemini 2.5 Pro, Grok 3


LMArena ▷ #general (761 messages🔥🔥🔥):

oAI streaming, gpt4o update, gpt-realtime, Microsoft AI 1 (MAI-1), Grok Code Fast 1


LM Studio ▷ #announcements (1 message):

LM Studio 0.3.24, ByteDance/Seed-OSS, markdown tables, markdown code blocks, lmstudio.ai


LM Studio ▷ #general (314 messages🔥🔥):

LM Studio latest update, Token probability offline, MCP agent guide with LMStudio, Finding Model Quantization on HF, Simulating Quantum Computing


LM Studio ▷ #hardware-discussion (49 messages🔥):

M1 Max vs M3 Ultra for LLMs, LM Studio on Windows 7, Using Servers with LM Studio, Intel ARC B60, CPU-Only gpt-oss 120B Performance


HuggingFace ▷ #general (299 messages🔥🔥):

Pytorch Lightning Porting, Wandb vs Tensorboard, RAG setup questions, HF DOS attack, Chinese Reasoning LLM


HuggingFace ▷ #today-im-learning (2 messages):

Torch Audio, Supervised Learning, Confusion Matrix, Logistic Regression, Hyperparameter Tuning


HuggingFace ▷ #i-made-this (11 messages🔥):

Small models and GGUF downloads, Google AIStudio prompt for luanti, MBTI PocketFlow, DeepFX Studio


HuggingFace ▷ #computer-vision (1 message):

Visual Entailment, VLLM as judge alternative


HuggingFace ▷ #smol-course (1 message):

AI/ML Engineer Introduction, Freelancer Expertise, AI Solutions Delivered


HuggingFace ▷ #agents-course (1 message):

ailinndev: thank you!!


Cursor Community ▷ #general (300 messages🔥🔥):

GPT-5 High vs Opus, Cursor billing and usage, Codex CLI vs Cursor, AGENTS.md standardization, Sonnet 3.5 deprecation


Cursor Community ▷ #background-agents (1 message):

tecnobrat: Hmmmm I don't think BAs use the AGENTS.md file


OpenAI ▷ #ai-discussions (126 messages🔥🔥):

Nano Banana naming origin, Prestashop and AI integration, Image generation issues in GPT Chat, AI in healthcare, Reasoning tokens


OpenAI ▷ #gpt-4-discussions (19 messages🔥):

Emergent Alignment, AGI vs. Advanced NLP, Rogue Agents in Discord, Longform Testing


OpenAI ▷ #prompt-engineering (23 messages🔥):

Stop follow up suggestions, Enhancing prompting for article-style writing, Sora limitations with ISS cupola, Benchmark-class prompt


OpenAI ▷ #api-discussions (23 messages🔥):

Turn off setting, Prompt Enhancements, Sora limitations, ISS cupola, Benchmark-class prompt


Eleuther ▷ #general (14 messages🔥):

NeMo v1 vs v2, IIT Madras research, Cracked people


Eleuther ▷ #research (104 messages🔥🔥):

LLMs in Scientific Research, Neurosymbolic Approaches, Connectionism vs Neurosymbolism, Symmetry in Search Algorithms, Discrete vs Continuous Reasoning


GPU MODE ▷ #general (12 messages🔥):

Colab GPUs, Quantization and Inference optimization, Andrej Karpathy's nanoGPT, GPU programming for frontier models, ThunderKittens DSL


GPU MODE ▷ #cuda (3 messages):

CUDA version recommendations, CUDA 12.8 vs 13.0


GPU MODE ▷ #torch (26 messages🔥):

TorchTitan Base Repo, Graph Neural Network Code, Flex Attention, Mask Generation Cost, Block Mask Sparsity


GPU MODE ▷ #beginner (10 messages🔥):

NVIDIA Interview Prep, CUDA & Java, AMD Competition entry barrier


GPU MODE ▷ #youtube-recordings (1 message):

tomeone.a: Hi


GPU MODE ▷ #off-topic (4 messages):

VLM ML systems papers, Prefill-Decoding Disaggregation, Metallica reggae cover


GPU MODE ▷ #rocm (10 messages🔥):

omniprobe, llvm integration, stochastic PC sampling, mi300x+


GPU MODE ▷ #intel (3 messages):

ANV, Luck


GPU MODE ▷ #self-promotion (3 messages):

pequegrad DL framework, CUDA Streams, Voxel Ray Tracing


GPU MODE ▷ #general-leaderboard (1 message):

xiaodouzi666: thanks🫡


GPU MODE ▷ #submissions (3 messages):

B200 Speed, MI300 Speed


GPU MODE ▷ #factorio-learning-env (5 messages):

Karpathy Tweet, Will Brown's verifiers, Steam Install, Twitch Stream Preview


GPU MODE ▷ #amd-competition (3 messages):

Registration Confirmation Delays, GEMM Matrix Details


GPU MODE ▷ #general (30 messages🔥):

v2 submissions active, website issues, infra errors, discord bot errors, run info and result


Nous Research AI ▷ #general (103 messages🔥🔥):

Hermes-4, Terminals.tech, Dynamic 2.0 gguf, llama.cpp-toolbox, Flash attention


Nous Research AI ▷ #ask-about-llms (2 messages):

Model Cuteness, Model Personality


Nous Research AI ▷ #research-papers (1 message):

CODA framework, GUI agents for scientific computing, Cerebrum and Cerebellum, ScienceBoard benchmark


Nous Research AI ▷ #interesting-links (2 messages):

Long Now, Large Scale EP


Nous Research AI ▷ #research-papers (1 message):

GUI Agents, Scientific Computing, CODA Framework, ScienceBoard Benchmark


Latent Space ▷ #ai-general-chat (48 messages🔥):

Claude's privacy updates, Parsed for custom LLMs, XAI's Grok Model Card, Microsoft MAI Models, Anthropic inference costs


Latent Space ▷ #genmedia-creative-ai (8 messages🔥):

Parsed launch, Krea AI real time video beta


Latent Space ▷ #ai-in-action-club (2 messages):

Password Finding


Yannick Kilcher ▷ #general (38 messages🔥):

Paper Discussions, Text Classification Models, ModernBERT Fine-tuning, Kernel Methods vs. Neural Nets, Nvidia's Nemotron Paper


Yannick Kilcher ▷ #paper-discussion (2 messages):

USO Model Release, Bytedance Research


Yannick Kilcher ▷ #ml-news (10 messages🔥):

GPT-OSS 20b, Promptlock, Ollama API, GPT Realtime, ESET's observations


DSPy ▷ #general (31 messages🔥):

DSPy and MLflow, DSPy program to make sense of a website, DSPy vs prompt optimization, Context7 supports DSPy, Generalizable signatures


Moonshot AI (Kimi K-2) ▷ #general-chat (22 messages🔥):

Kimi TestFlight, Z.AI AMA, GLM-4.5 Air, AI BYD, Qwen Chat


tinygrad (George Hotz) ▷ #general (4 messages):

Numpy Removal, AMD Performance


tinygrad (George Hotz) ▷ #learn-tinygrad (11 messages🔥):

buffer ID change in debugger, BEAM OOM tricks, multiprocessing memory leaks, kernel saving and offline BEAM search


Modular (Mojo 🔥) ▷ #general (2 messages):

Modular Meetup


Modular (Mojo 🔥) ▷ #mojo (4 messages):

Async memory allocation, Mojo async functions


Modular (Mojo 🔥) ▷ #max (2 messages):

Bazel Cache, Pipelines Script, PermissionError


Manus.im Discord ▷ #general (6 messages):

Mail Manus Feature on Zapier, Alternatives Better Trial Systems and Fair Prices, Pricing and Credit System Unfair, Rating +100 Credits Feature Removed


aider (Paul Gauthier) ▷ #general (5 messages):

Local Gemini Emulation, Aider benchmark merge failure, Migration path alwaysn8n