Frozen AI News archive

Grok 4 Fast: xAI's distilled, 40% more token-efficient, 2M-context, 344 tok/s frontier model

**xAI** announced **Grok 4 Fast**, a highly efficient model running at **344 tokens/second**, offering reasoning and nonreasoning modes and free trials on major platforms. **Meta** showcased its neural band and Ray-Ban Display with a live demo that experienced hiccups but sparked discussion on live hardware demos and integration challenges. **Meta** is also developing a first-party "Horizon Engine" for AI rendering and released Quest-native Gaussian Splatting capture tech. New model releases include **Mistral's Magistral 1.2**, a compact multimodal vision-language model with improved benchmarks and local deployment; **Moondream 3**, a 9B-parameter MoE VLM focused on efficient visual reasoning; **IBM's Granite-Docling-258M**, a document VLM for layout-faithful PDF to HTML/Markdown conversion; and **ByteDance's SAIL-VL2**, a vision-language foundation model excelling at multimodal understanding and reasoning at 2B and 8B parameter scales.

xAI is all you need?

AI News for 9/18/2025-9/19/2025. We checked 12 subreddits, 544 Twitters and 23 Discords (192 channels, and 4967 messages) for you. Estimated reading time saved (at 200wpm): 415 minutes. Our new website is now up with full metadata search and a beautiful vibe-coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

Setting aside some fake news today that would have put xAI at a higher valuation than Anthropic, xAI announced Grok 4 Fast, the second of its Fast models, and the keyword is efficiency.

Per Artificial Analysis testing, it is a good deal faster than the big frontier models at 344 tok/s and just about as capable.

Grok 4 Fast has reasoning and nonreasoning modes and is free to try now on all major routers and AI IDEs.


AI Twitter Recap

Meta’s neural band + Ray‑Ban Display launch: live demo hiccups, engine bets, and capture tech

New models: compact VLMs, reasoning video, doc VLMs, and open video editing

Competitions, coding, and evaluations

Infra, determinism, and training at scale

Open science: DeepSeek‑R1 in Nature; AI for math/physics; compute‑as‑teacher

Agents and developer tooling

Top tweets (by engagement)


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. Wan2.2-Animate MoE and Moondream 3 Preview

2. Local AI Tools & Release Roundup (Memori SQL Memory + Sep 19 Weekly List)

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Wan2.2 Animate and Lucy Edit: Open-Source Video Animation Releases

2. Anthropic/Dario Amodei Coverage and xAI Grok 'Survival Mode' Update

3. Classic Film Color Qwen LoRA and AI Photo Generation Showcase


AI Discord Recap

A summary of Summaries of Summaries by gpt-5

1. New Multimodal & Visual GenAI Models

2. Agentic Coding Models and Knowledge-Work Agents

3. Quantization & Edge Inference: From Labs to Low Orbit

4. Research Highlights: Reasoning, Memorization, and Fluids

5. Open-Source Architectures & Developer Tooling


Discord: High level Discord summaries

Perplexity AI Discord


LMArena Discord


Unsloth AI (Daniel Han) Discord


Cursor Community Discord


OpenRouter Discord


LM Studio Discord


GPU MODE Discord


Eleuther Discord


HuggingFace Discord


OpenAI Discord


Latent Space Discord


Modular (Mojo 🔥) Discord


Yannick Kilcher Discord


aider (Paul Gauthier) Discord


DSPy Discord


Nous Research AI Discord


Moonshot AI (Kimi K-2) Discord


tinygrad (George Hotz) Discord


MCP Contributors (Official) Discord


Windsurf Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.



Discord: Detailed by-Channel summaries and links

Perplexity AI ▷ #general (1105 messages🔥🔥🔥):

Perplexity Deep Research exhausted, Image Upscaling Tools, Comet and Google Drive connector issues, Perplexity's Sam AI assistant, Comet and Canvas quizzes


Perplexity AI ▷ #pplx-api (3 messages):

Grounded Responses, Bitcoin News, Memory Management


LMArena ▷ #general (872 messages🔥🔥🔥):

Seedream 4 High Res Removal, Gemini 3 Speculation, Brute Force Discussion, LM Arena Login Issues


Unsloth AI (Daniel Han) ▷ #general (484 messages🔥🔥🔥):

Exploding Gradients, Data Cleaning, Aphantasia vs Visual Imagination, Titans Architecture, Google AI Pro


Unsloth AI (Daniel Han) ▷ #off-topic (81 messages🔥🔥):

Meta Horizon Worlds Mobile Competition, AI Minister in Albania, RAM Overclocking Stability Tests, LLM Uncanny Valley, GPTs roleplay


Unsloth AI (Daniel Han) ▷ #help (37 messages🔥):

GLM-4.5V finetuning, NeMo 2.0 framework, GPT-oss errors, Gemma3 27b memory requirements, Swiss German audio datasets


Unsloth AI (Daniel Han) ▷ #research (21 messages🔥):

SLED potential for brain damage prevention, LLM Layer Usage in Inference, Server Supporter Role, Early Exit Logits, SLED Technique


Cursor Community ▷ #general (402 messages🔥🔥):

Auto Model, Cursor CLI, Cursor Terminal issues, MCP for managing projects, Windsurf vs Cursor


Cursor Community ▷ #background-agents (3 messages):

Background Agents, GitHub repository access, configuration issues


OpenRouter ▷ #app-showcase (3 messages):

Voicera Audio Search Engine, SillyTavern iOS Clone


OpenRouter ▷ #general (305 messages🔥🔥):

Responses API Benefits, Kimi K2 0711 Downtime, GPT-4o Alternative, Deepseek V3 429 Errors, Chutes Pricing


OpenRouter ▷ #discussion (7 messages):

Claude popularity, code-supernova model


LM Studio ▷ #general (179 messages🔥🔥):

qwen3-next 8bit on MacOS, Apple MLX and qwen-next, Granite 3.3 8b fine tuning, LM Studio Hub Navigation, gpt-oss 20b performance


LM Studio ▷ #hardware-discussion (65 messages🔥🔥):

Intel ARC demise?, ARM's viability, Xeon Gold prices, VRAM Frequency Boost, DDR5 as VRAM


GPU MODE ▷ #general (5 messages):

Qwen3-Next Architecture, Gated Delta Net, EVGA Software


GPU MODE ▷ #triton (4 messages):

MLIR, Triton, NVVM, NVGPU, GPU Code Generation


GPU MODE ▷ #cuda (26 messages🔥):

TMA Descriptor Modification, ILP in GPUs, Shared Memory Matrix Read, cuTensorMapEncodeTiled API, wgmma usage


GPU MODE ▷ #announcements (1 messages):

Discord milestone


GPU MODE ▷ #cool-links (1 messages):

HUAWEI CONNECT 2025, SuperPoD Interconnect, AI Infrastructure


GPU MODE ▷ #jobs (8 messages🔥):

Nvidia Interview, Byte Pair Encoding, DSA for ML, CUDA for Round 1


GPU MODE ▷ #beginner (11 messages🔥):

OpenSHMEM Optimization, Parallel Programming Resources, Learning GPU Programming with LLMs


GPU MODE ▷ #torchao (1 messages):

TorchAO, PyTorch, Quantization, Phi4-mini-instruct, Qwen3


GPU MODE ▷ #off-topic (1 messages):

Arabic language models, Hala Technical Report


GPU MODE ▷ #rocm (1 messages):

bghira: can't wait for pytorch 2.8


GPU MODE ▷ #metal (2 messages):

Kernel Timeout, Driver-Level Timeout


GPU MODE ▷ #self-promotion (3 messages):

Together AI, Blackwell Deep Dive, Semianalysis, NVIDIA, GPU accelerated compiler


GPU MODE ▷ #edge (14 messages🔥):

NVIDIA Jetson Orin AGX, Earth Observation, Docker in Space, YOLOX, TensorRT


GPU MODE ▷ #reasoning-gym (4 messages):

Reasoning Gym, pass@3, average_mean_score, visualize_results.py, kakurasu and survo


GPU MODE ▷ #gpu模式 (1 messages):

Forwarded messages, Bad English


GPU MODE ▷ #submissions (80 messages🔥🔥):

amd-gemm-rs Leaderboard Updates, MI300x8 Performance, amd-all2all Leaderboard Updates


GPU MODE ▷ #factorio-learning-env (3 messages):

User inactivity, Meeting attendance


GPU MODE ▷ #mojo (1 messages):

Hybrid GPU/CPU inference, Mojo for AMX and CUDA, Mojo/MLIR AMX instruction emission


GPU MODE ▷ #singularity-systems (4 messages):

Vertical Pipeline Setup, Eager vs Lazy Semantics, Tinygrad Autograd, Tensor Fusion Compilers


GPU MODE ▷ #general (1 messages):

krypton_lebg: hi


GPU MODE ▷ #irl-accel-hackathon (25 messages🔥):

Context-Parallel Gated DeltaNet, Hackathon Logistics, Kernel Competitions, Team Approvals


Eleuther ▷ #general (9 messages🔥):

Accessibility solutions for the blind, Screen readers on Windows and macOS, leandojo


Eleuther ▷ #research (186 messages🔥🔥):

Ray + vLLM patching for RL with TorchTitan, Gated Delta Net, TokenSwap at NeurIPS 2025, Fluid Equations, Atlas vs NIAH


Eleuther ▷ #lm-thunderdome (2 messages):

trust_remote_code, downloading dataset


HuggingFace ▷ #general (138 messages🔥🔥):

Local LLM Hardware Advice, Transformers Training Loop Fix, SpikingBrain-7B, Captcha Solving AI, HF API Model Listing


HuggingFace ▷ #i-made-this (3 messages):

Embedder Collection, SmartTaskTool


HuggingFace ▷ #reading-group (1 messages):

olyray: Hello guys. When is the next reading group discussion?


HuggingFace ▷ #smol-course (1 messages):

Kaggle Notebooks, Unit 1 Content, Exercise Notebooks


HuggingFace ▷ #agents-course (4 messages):

Starting the Agents Course, New members joining


OpenAI ▷ #ai-discussions (120 messages🔥🔥):

Codex Use Cases, AI-Assisted Coding Interviews, Ethical Implications of AI in Hiring, Role of AI in Software Development, GPT-4o Mini vs GPT-5


OpenAI ▷ #gpt-4-discussions (7 messages):

GPT-5-chat follow-up questions, ChatGPT memory limitations, Suppressing trailing questions


OpenAI ▷ #prompt-engineering (9 messages🔥):

ChatGPT5 vs Grok, Prompt Generation, API Usage for GPTs


OpenAI ▷ #api-discussions (9 messages🔥):

GPT Prompt Generation, ChatGPT-Grok Style, Custom GPTs actions in standard ChatGPT context


Latent Space ▷ #ai-general-chat (96 messages🔥🔥):

Mistral Small & Medium 1.2, OSS framework for AI Agents, Notion 3.0, Moondream 3, OpenAI job ads


Latent Space ▷ #genmedia-creative-ai (9 messages🔥):

Decart Lucy Edit, Luma AI Ray3, Wan-Animate


Modular (Mojo 🔥) ▷ #mojo (51 messages🔥):

Mojo VS Code Extension, Zed Support, Vim/Neovim Support, Mojo LSP Instability, Exporting Mojo code


Yannick Kilcher ▷ #general (18 messages🔥):

Enterprise redaction solutions, ClaudeAI announcement, Causal inference on Markov chains, Positional encoding with sin and cos, blog platforms Notion vs Jekyll


Yannick Kilcher ▷ #paper-discussion (10 messages🔥):

Ethics Dataset, Qwen3 Next Architecture, Gated Delta Nets, Gated Linear Attention, Paper Recommendation


Yannick Kilcher ▷ #agents (3 messages):

Agent Optimization, Image Superimposition, Background Replacement


Yannick Kilcher ▷ #ml-news (2 messages):

Fluid Dynamics, Google DeepMind


aider (Paul Gauthier) ▷ #general (13 messages🔥):

Coding Agents, Aider as primary coding tool, Fullstack Blockchain Dev


aider (Paul Gauthier) ▷ #questions-and-tips (13 messages🔥):

tree-sitter library in aider, mlx-community/Qwen3-Next-80B-A3B-Instruct-4bit with aider, mlx-lm server with Aider, Expanding context size with mlx-lm


aider (Paul Gauthier) ▷ #links (1 messages):

Graph Usability, Outlier deselect, Data Visualization Improvements


DSPy ▷ #show-and-tell (3 messages):

Human Review Tooling, GEPA integration, Model improvement


DSPy ▷ #papers (1 messages):

batmanosama: https://yifei-he.github.io/mergebench/


DSPy ▷ #general (19 messages🔥):

DSPy ChainOfThought, GEPA Optimization Models, MLFlow Integration, DSPy Server Tag


DSPy ▷ #colbert (2 messages):

ColBERT Long Context, CLS Token Chunking


Nous Research AI ▷ #general (20 messages🔥):

Moondream release, Multimodal models from China, Qwen3-VL


Nous Research AI ▷ #interesting-links (2 messages):

QSilver Quantum Workshop, Quantum Computing Education, Qiskit and Cirq


Moonshot AI (Kimi K-2) ▷ #general-chat (18 messages🔥):

Kimi Researcher, Kimi Dart performance, Kimi free sessions


tinygrad (George Hotz) ▷ #general (1 messages):

Hotz new company, TinyGrad update


tinygrad (George Hotz) ▷ #learn-tinygrad (5 messages):

Stable Diffusion Model, ModuleNotFoundError, PYTHONPATH, extra package


MCP Contributors (Official) ▷ #general (1 messages):

Contributor Server Purpose, Drive-by Surveys, Server Misuse


MCP Contributors (Official) ▷ #general-wg (1 messages):

EmbeddedResource metadata, EmbeddedResource structure