Frozen AI News archive

Voxtral - Mistral's SOTA ASR model in 3B (mini) and 24B ("small") sizes beats OpenAI Whisper large-v3

**Mistral** surprises with the release of **Voxtral**, a transcription model outperforming **Whisper large-v3**, **GPT-4o mini Transcribe**, and **Gemini 2.5 Flash**. Voxtral models (3B and 24B) support **32k token context length**, handle audio up to **30-40 minutes** long, offer built-in **Q&A and summarization**, are **multilingual**, and enable **function-calling** from voice commands, powered by the **Mistral Small 3.1** language model backbone. Meanwhile, **Moonshot AI**'s **Kimi K2**, a non-reasoning **Mixture of Experts (MoE)** model built by a team of around **200 people**, gains attention for blazing-fast inference on **Groq** hardware, broad platform availability including **Together AI** and **DeepInfra**, and local inference on an **M4 Max 128GB** Mac. Developer tool integrations include **LangChain** and Hugging Face support, highlighting Kimi K2's strong tool use capabilities.

Mistral back in open models land!

AI News for 7/14/2025-7/15/2025. We checked 9 subreddits, 449 Twitters and 29 Discords (226 channels, and 5884 messages) for you. Estimated reading time saved (at 200wpm): 486 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!
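
As a rough sanity check on the reading-time figure above, the estimate can be inverted to get the word count it implies. This is a minimal sketch that assumes the estimate is simply total words divided by reading speed; the function name `implied_word_count` is hypothetical and is not part of any AI News tooling.

```python
def implied_word_count(minutes_saved: float, words_per_minute: int = 200) -> int:
    """Invert 'reading time = words / wpm' to recover the implied word count."""
    return round(minutes_saved * words_per_minute)


# Figures quoted in this issue: 486 minutes saved at 200 wpm.
words = implied_word_count(486, 200)
hours = 486 / 60

print(f"Implied source text: {words:,} words (about {hours:.1f} hours of reading)")
```

Under that assumption, the quoted 486 minutes corresponds to 97,200 words of source material.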

While Mira's $2b Thinking Machines fundraise was relatively well telegraphed, Mistral came from out of nowhere to drop Voxtral, their new transcription model that "comprehensively outperforms Whisper large-v3" and "beats GPT-4o mini Transcribe and Gemini 2.5 Flash across all tasks":

We love a good no-qualifications-necessary beating, and even better when it is an open model.

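Both Voxtral 3B and Voxtral 24B models go beyond transcription, with capabilities that include a 32k token context length, built-in Q&A and summarization, multilingual support, and function-calling from voice commands.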

Very exciting. We skipped reporting on their Magistral reasoning model (which turned out to have an EXCELLENT paper) but we're pretty sure Voxtral will be in production almost immediately...


AI Twitter Recap

Kimi K2's Emergence and Performance

New Models: Speech, Motion Capture, and AI Companions

Tooling, Infrastructure, and Development

Research, Evaluation, and AI Safety

Company Strategy and the Industry Landscape

Humor, Memes, and Culture


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. Kimi K2 Model Benchmarks, API Access, and Community Memes

2. AI Model Launches and Infrastructure Milestones (Meta, EXAONE, Voxtral, Llama 4)

3. AI Usage Trends, Community Analysis, and Local Inference Memes

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Grok 4 and xAI Waifu/NSFW Controversy & Satire

2. Recent AI Model Benchmarks, Leaderboards & Comparisons

3. Glow in the Dark Fruits Meme Evolution


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.0 Flash Thinking

Theme 1. LLM Performance, Comparisons, and Quirks

Theme 2. Model Training, Fine-tuning, and Deployment Challenges

Theme 3. AI Development Tools and Platform Integrations

Theme 4. Hardware and GPU Optimization for AI

Theme 5. The Evolving Landscape of Open Source AI


Discord: High level Discord summaries

Perplexity AI Discord


Unsloth AI (Daniel Han) Discord


OpenAI Discord


LMArena Discord


Cursor Community Discord


LM Studio Discord


Nous Research AI Discord


Eleuther Discord


HuggingFace Discord


GPU MODE Discord


Torchtune Discord


Yannick Kilcher Discord


Notebook LM Discord


LLM Agents (Berkeley MOOC) Discord


MCP (Glama) Discord


LlamaIndex Discord


tinygrad (George Hotz) Discord


Manus.im Discord


Modular (Mojo 🔥) Discord


DSPy Discord


Nomic.ai (GPT4All) Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Codeium (Windsurf) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


Discord: Detailed by-Channel summaries and links

Perplexity AI ▷ #general (1057 messages🔥🔥🔥):

Token limits, Context Window Discussion, RAG Models, Grok 4 Model, Comet Browser Features and Issues


Perplexity AI ▷ #sharing (3 messages):

Perplexity AI spaces, garbage collection


Perplexity AI ▷ #pplx-api (2 messages):

API Search, Web UI Search, search_domain_filters parameter


Unsloth AI (Daniel Han) ▷ #general (526 messages🔥🔥🔥):

Kimi K2 performance, LLM VRAM Calculator, Synthetic Datasets, GGUF vision support, Huawei chips


Unsloth AI (Daniel Han) ▷ #off-topic (9 messages🔥):

Latent Space Voice Encoding, Coding Pre-Training Models


Unsloth AI (Daniel Han) ▷ #help (92 messages🔥🔥):

Consumer GPU recommendations for fine-tuning, Multi-GPU setup for 70B LLMs, Unsloth VLM deployment options, GGUF quantization differences, vLLM cache directory


Unsloth AI (Daniel Han) ▷ #research (7 messages):

3D spatial representations for image understanding, Limitations of current models in spatial reasoning, Depth estimation research, New benchmark


Unsloth AI (Daniel Han) ▷ #unsloth-bot (54 messages🔥):

GGUF file download issues, Unsloth framework, LoRA finetuned models deployment, LoRA training for vLLM, Unsloth compatibility with PyTorch


OpenAI ▷ #annnouncements (1 message):

Chain of Thought (CoT) Monitoring, Future AI Systems Oversight, Research Paper on CoT Monitoring


OpenAI ▷ #ai-discussions (543 messages🔥🔥🔥):

Gemini vs GPT vs Grok, Midjourney 'plagiarism' accusations, AI surveillance on Discord, N8N for AI agent building, AI's role in education


OpenAI ▷ #gpt-4-discussions (51 messages🔥):

GPT-4.1 latency variation, Discord bot performance issues, Operator issues, AI coding libraries


OpenAI ▷ #prompt-engineering (1 message):

Cross-Model Validation, Declarative Prompts, Zero Shot Prompting


OpenAI ▷ #api-discussions (1 message):

Declarative Prompts, Cross-Model Validation, Zero-Shot Prompting


LMArena ▷ #general (504 messages🔥🔥🔥):

Grok 4 Performance, Kimi Model, OpenAI Model Retraining, New Models in Arena, Style Control Impact


Cursor Community ▷ #general (430 messages🔥🔥🔥):

Microsoft Extensions forks, New pricing, Grok 4 issues, Kimi K2, Kiro Features


Cursor Community ▷ #background-agents (15 messages🔥):

Background Agent Context Loss, Bugbot Organization Repo Visibility, Web Agent Opening Issues, Secrets in Background Agents, Background Agent Costs


LM Studio ▷ #general (58 messages🔥🔥):

Change Download Directory, CrewAI Tutorial, Gemma 3 12b Vision Capability, Model Recommendations, Artifacting RX580


LM Studio ▷ #hardware-discussion (49 messages🔥):

Local Models vs API Models, LG's EXAONE 4 Licensing, AMD395 Mini-PC, M4 MBP vs AI 395 Platform, ROCm vs MLX Support


Nous Research AI ▷ #announcements (1 message):

Hermes-3-Dataset


Nous Research AI ▷ #general (97 messages🔥🔥):

GPU for AI Training, Meta's Open Source Shift, Windsurf Gutted, Kimi K2 Hype, Quantizing Models


Nous Research AI ▷ #ask-about-llms (1 message):

Fine-tuning Multimodal Models on Text, Mistral 3, Gemma 3, Qwen 3, ForCausalLM


Nous Research AI ▷ #research-papers (1 message):

ee.dd: https://arxiv.org/abs/2507.08794


Eleuther ▷ #general (52 messages🔥):

LLM architecture, Formal Languages and Neural Nets, R1-Zero, Voxtral Mini, hyperstition


Eleuther ▷ #research (12 messages🔥):

arXiv Endorsement Request, Image Captioning Framework, Relevance of Image Captioning Paper, Attention Mechanisms vs. Other Architectures, Encoder-Decoder Pipeline Analysis


Eleuther ▷ #scaling-laws (22 messages🔥):

Anthropic's inference costs, Deterministic ML vs. Stochastic, Diffusion language models, K2 Model, Groq's leading indicator


Eleuther ▷ #interpretability-general (1 message):

burnytech: https://fxtwitter.com/HThasarathan/status/1944947772119245210


Eleuther ▷ #lm-thunderdome (5 messages):

`get-answer` filter, Regex filter implementation, Filter pipeline names


HuggingFace ▷ #general (70 messages🔥🔥):

Desktop App for Voice-Controlled Task Automation, Codebase Organization Strategies, Dataset Endpoint Issues, GPU Access on Cloud Providers, Text-to-Text Models Tagging


HuggingFace ▷ #today-im-learning (4 messages):

4 bit training


HuggingFace ▷ #cool-finds (1 message):

geekboyboss: https://github.com/cactus-compute/cactus


HuggingFace ▷ #i-made-this (6 messages):

Hypernetworks for multi-candidate problems, PandasAI and Datatune, Math-focused LLMs, BERT-Diffusion Architecture


HuggingFace ▷ #reading-group (1 message):

Video+Image Understanding, 3D Pixel Representations, VQA Performance Boost


HuggingFace ▷ #computer-vision (1 message):

dlp1843: Is the landing page to opencv.org to opencv what bitcoin.com is to bitcoin?


HuggingFace ▷ #agents-course (1 message):

Inference Providers, Qwen Models, Llama/DeepSeek Models


GPU MODE ▷ #general (19 messages🔥):

NCU profiling on VM GPU, Parallel Radix Sort on GPU, PMPP Radix Sort Tutorial


GPU MODE ▷ #cuda (1 message):

Predicate Registers, SASS Compiler Optimization


GPU MODE ▷ #torch (3 messages):

torch.compile hanging, TORCH_COMPILE_DEBUG, coordinate_descent_tuning


GPU MODE ▷ #algorithms (2 messages):

Parallel Radix Sort, Signed Integers


GPU MODE ▷ #jobs (4 messages):

Job opportunities, WFH positions, Voltage Park Careers


GPU MODE ▷ #beginner (3 messages):

Cloud GPUs, Vast.ai, nsight compute


GPU MODE ▷ #rocm (1 message):

Kernel Tracing, HIP Tracing, HSA Tracing


GPU MODE ▷ #webgpu (10 messages🔥):

MTLReadWriteTextureTier2, WGPU rgba8unorm, dawn code, matrix-chat


GPU MODE ▷ #self-promotion (1 message):

AMD GPU, Containers, Fractional GPUs


GPU MODE ▷ #submissions (1 message):

A100, Leaderboard, trimul benchmark


GPU MODE ▷ #status (2 messages):

Triton reference for grayscale, GPUMODE kernelbot data on HuggingFace


GPU MODE ▷ #factorio-learning-env (4 messages):

Phone Stolen, Inactivity, Back-up


GPU MODE ▷ #cutlass (2 messages):

CuTeDSL, ARM Structure, CUTLASS, CUDA kernels


GPU MODE ▷ #singularity-systems (1 message):

Project Introductions, Giving Talks


Torchtune ▷ #announcements (1 message):

Torchtune project, Future of Torchtune, GitHub issue, Discord and Github support


Torchtune ▷ #general (51 messages🔥):

Torchtune library components for new projects, RL and future of finetuners, Quantum SVM, Ohio headquarters


Torchtune ▷ #dev (2 messages):

Optimizer Compilation


Yannick Kilcher ▷ #general (25 messages🔥):

ML Experiment Trackers, Grafana Large Log Solution, Claude 3 Sonnet, Circuit Tracing, Meta Open Source Betrayal


Yannick Kilcher ▷ #paper-discussion (1 message):

.wavefunction: no discussion from me tonight.


Yannick Kilcher ▷ #ml-news (3 messages):

Yann LeCun, Signulll Sad Post


Notebook LM ▷ #announcements (1 message):

Research Opportunity, User Interviews, Feedback


Notebook LM ▷ #use-cases (5 messages):

NotebookLM, Google Docs, News articles, Analysis notebook


Notebook LM ▷ #general (21 messages🔥):

Notebook limits, Dynamic updates, Audio Overviews, Math/Latex rendering, Pro plan price reduction


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (15 messages🔥):

Multiple Track Registration, Certificate Delivery Issues, Certificate Declaration Form


MCP (Glama) ▷ #general (15 messages🔥):

MCP Server Validation, Open Source LLM Client, Anthropic Connectors Directory


LlamaIndex ▷ #announcements (1 message):

LlamaIndex Meetup in Amsterdam, Office Hours in Discord, NotebookLlaMa - A NotebookLM clone, Context Engineering techniques, Research agent with LlamaIndex & Gemini 2.5 Pro


LlamaIndex ▷ #blog (4 messages):

Research Agent, Google Gemini 2.5 Pro, LlamaIndex workflows, Pydantic models, Snowflake partnership


LlamaIndex ▷ #general (8 messages🔥):

AI Agent Design, LlamaHub Tools, ML Logs storage, AI Showcase Virtual Conf


tinygrad (George Hotz) ▷ #general (4 messages):

Reinforcement Learning for Model Search, Recursive setitem in tensor.py, Large Kernels


tinygrad (George Hotz) ▷ #learn-tinygrad (6 messages):

Memory Allocation Overhead in Tinygrad, GlobalCounters.global_mem vs GlobalCounters.mem_used, Subbuffers and Memory Management in Tinygrad


Manus.im Discord ▷ #general (9 messages🔥):

Manus Fellowship, Scammer Alert, Automating sustainability, ESG Research Workflows, Manus Premium Feature


Modular (Mojo 🔥) ▷ #general (3 messages):

Discord bug, @kap command


Modular (Mojo 🔥) ▷ #mojo (4 messages):

mojo @parameter decorator, capturing keyword, github issue 5020


DSPy ▷ #general (4 messages):

AWS Prompt Optimizer, Nova Models, MIPRO Usage, Enterprise DSPy Wrappers


Nomic.ai (GPT4All) ▷ #general (2 messages):

GPT4All and Raspberry Pi 5, Dataset download error