Frozen AI News archive

Gemini 2.5 Pro/Flash GA, 2.5 Flash-Lite in Preview

**Gemini 2.5 Pro** and **Flash** are now generally available, and the new **Gemini 2.5 Flash-Lite** is in preview. The models are sparse **Mixture-of-Experts (MoE)** transformers with native multimodal support. A detailed 30-page tech report highlights impressive long-horizon planning demonstrated by **Gemini Plays Pokemon**. The **LiveCodeBench-Pro** benchmark reveals that frontier LLMs struggle with hard coding problems, while **Moonshot AI** open-sourced **Kimi-Dev-72B**, achieving state-of-the-art results on **SWE-bench Verified**. Smaller specialized models like **Nanonets-OCR-s**, **II-Medical-8B-1706**, and **Jan-nano** show competitive performance, emphasizing that bigger models are not always better. **DeepSeek-r1** ties for #1 in WebDev Arena, and **MiniMax-M1** sets new standards in long-context reasoning. **Kling AI** demonstrated video generation capabilities.

Gemini gemini gemini. Readers might also enjoy our Karpathy @ Startup School recap.

AI News for 6/16/2025-6/17/2025. We checked 9 subreddits, 449 Twitters and 29 Discords (219 channels, and 6626 messages) for you. Estimated reading time saved (at 200wpm): 547 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

As previewed multiple times in the past 3 months leading up to Google I/O and at the AIE World's Fair, Gemini Product Lead Tulsee Doshi finally announced that the 2.5 models are now generally available (aka with no "preview" or date tag).

Gemini 2.5 also now comes with a 30-page tech report with some notable details on evals, and a teeny tiny bit on architecture.

2.5 Flash Lite, the cheap/fast model, is now in preview with Oriol Vinyals emphasizing the simulative possibilities of >400 tok/s.


AI Twitter Recap

Model Releases, Benchmarks, and Performance

AI Agents, Tooling, and Frameworks

Infrastructure, Hardware, and Efficiency

Research and New Techniques

Industry News, Commentary, and Geopolitics

Humor/Memes


AI Reddit Recap

/r/LocalLlama Recap

1. Local/Open-Source LLM Rigs and Daily Usage

2. AI Model Strategy and Future Plans (Qwen3 and MoE)

3. Comprehensive Tutorials for Building AI Agents

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Advances and Benchmarks in AI Model Releases (Gemini, EG-CFG, OpenAI o3)

2. Innovative Workflows and Tools for AI-Based Image/Video Generation (Flux, ComfyUI, WAN)

3. Reflections on ChatGPT/Claude as Companions, Therapists, and Reality Advisors


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.5 Pro Exp

Theme 1. The Model Gauntlet: New Releases, Performance Showdowns, and Versioning Dramas

Theme 2. Powering the Prompts: Frameworks, Libraries, and Platforms Vie for Developer Hearts

Theme 3. The Rise of the Agents: MCP Ecosystem Matures and New Tools Emerge

Theme 4. Silicon & Kernels: Hardware Battles and GPU Optimization Frontiers

Theme 5. AI in the Wild: Creative Sparks, Community Buzz, and Platform Puzzles


Discord: High level Discord summaries

Cursor Community Discord


Perplexity AI Discord


LMArena Discord


Unsloth AI (Daniel Han) Discord


OpenAI Discord


Eleuther Discord


HuggingFace Discord


OpenRouter (Alex Atallah) Discord


LM Studio Discord


Latent Space Discord


aider (Paul Gauthier) Discord


GPU MODE Discord


Yannic Kilcher Discord


Notebook LM Discord


Torchtune Discord


Nous Research AI Discord


Modular (Mojo 🔥) Discord


tinygrad (George Hotz) Discord


Manus.im Discord


MCP (Glama) Discord


LlamaIndex Discord


DSPy Discord


Cohere Discord


Nomic.ai (GPT4All) Discord


LLM Agents (Berkeley MOOC) Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Codeium (Windsurf) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


Discord: Detailed by-Channel summaries and links

Cursor Community ▷ #general (971 messages🔥🔥🔥):

Gemini 2.5 Pro model, 500 fast requests plan, OpenAI platform sidebar changes, Model Merging, Indexing Bug


Cursor Community ▷ #background-agents (22 messages🔥):

Background Agents setup, Slack Auth issues, Github integration issues, Snapshot sharing issues, Cursor version issues


Perplexity AI ▷ #general (1097 messages🔥🔥🔥):

Perplexity Memory, GPT Image Generation Limits, Decoding Challenge, Gemini 2.5 Pro, Cult vs Religion


Perplexity AI ▷ #sharing (1 message):

i_795: https://www.perplexity.ai/page/beloved-food-network-chef-dies-32re9eILQi6BSMSflRDJzg


Perplexity AI ▷ #pplx-api (5 messages):

AI Project for web search, Perplexity AI, AI startup, Discord bot with sonar api


LMArena ▷ #general (1145 messages🔥🔥🔥):

Blacktooth vs Kingfall, GPTs Agents training, Google Gemini Versioning Issues, Veo 3 video generator, Gemini 2.5 Pro


Unsloth AI (Daniel Han) ▷ #general (258 messages🔥🔥):

Red dots LLM GGUF, GRPO reward model, Gemma 3 12B conversion, Kimi-Dev-72B-GGUF, legal research AI model


Unsloth AI (Daniel Han) ▷ #off-topic (15 messages🔥):

AI model for determining person placement based on images, Discord server question


Unsloth AI (Daniel Han) ▷ #help (152 messages🔥🔥):

Llama 3 finetuning tips for multi-role prompts, OOM Error on saving LoRA finetune, Jira issue finetuning with LLaMA 3.2B, Merging QLoRA finetuned LLaMA 8B with base model, Qwen 2.5 VL 7b Vision Model error


Unsloth AI (Daniel Han) ▷ #research (4 messages):

MoE embedding model, GritLM-8x7B


OpenAI ▷ #annnouncements (1 message):

ChatGPT, Image generation, WhatsApp


OpenAI ▷ #ai-discussions (255 messages🔥🔥):

Gemini 2.5 Pro, Midjourney vs Sora, Imagen 4, AI startup, Codex CLI


OpenAI ▷ #gpt-4-discussions (18 messages🔥):

Electron app performance limits, ChatGPT voice transcription issues, Advanced voice mode usage tracker, Dictate transcribing in the wrong language


OpenAI ▷ #prompt-engineering (51 messages🔥):

Recursive Epistemic Integrity Field, NotebookLM for large files, ChatGPT limited context workaround, Maintaining context with multiple GPTs


OpenAI ▷ #api-discussions (51 messages🔥):

AI Recursive Epistemic Integrity Field, Simulated AI death, ChatGPT file reading, NotebookLM PDF analysis, GPT prompting


Eleuther ▷ #announcements (1 message):

Speaker series, LLM Tokenizers, Embedding glitches


Eleuther ▷ #general (55 messages🔥🔥):

EleutherAI Discord, Torch Compile Model, Loss Curve Expectation, Min-P Sampling, Group Calls


Eleuther ▷ #research (276 messages🔥🔥):

RWKV7 Training, Avey Block, Linear Attention and Normalization, LLM Image Generation


Eleuther ▷ #lm-thunderdome (21 messages🔥):

simple_evaluate() customization, HF_DATASETS_CACHE workaround, TaskManager for Task Configuration


HuggingFace ▷ #general (324 messages🔥🔥):

Qwen 3, 5090 vs H100, Youtube automation for $$$, Multilingual Models, DeepSpeed Universal Checkpoints


HuggingFace ▷ #today-im-learning (5 messages):

HF AI Agents Fundamental Course, Chatbot project using generative AI


HuggingFace ▷ #i-made-this (4 messages):

gary4beatbox gets a new buddy, Dataseeds dataset trending, Chromium extension to speak to any readme


HuggingFace ▷ #reading-group (1 message):

chad_in_the_house: Awesome! Looks very cool. If I can get a date/time I can setup an event


HuggingFace ▷ #computer-vision (1 message):

computer vision mentorship, CV engineer career path


HuggingFace ▷ #NLP (1 message):

cakiki: <@338622066620104704> Please don't cross-post, and keep channels on topic.


HuggingFace ▷ #gradio-announcements (2 messages):

Gradio Agents, MCP Hackathon Winners, Custom Component Track, Special Awards, Innovative Use of MCP


HuggingFace ▷ #smol-course (3 messages):

Gemma 14b, CPU Offloading, Course Length


HuggingFace ▷ #agents-course (5 messages):

403 Errors, Rate Limit Errors, MCP Server Prompts, Smol Agents, Ollama Server


OpenRouter (Alex Atallah) ▷ #announcements (14 messages🔥):

Gemini 2.5 Pro, Gemini Flash, Model Renaming, Pricing Updates


OpenRouter (Alex Atallah) ▷ #general (262 messages🔥🔥):

Gemini 2.5, New Pricing, BYOK, Key Credit Balance


LM Studio ▷ #general (84 messages🔥🔥):

custom stop token for LM studio's chat, LM Studio model directory, Ollama models, Open WebUI, Model Context Protocol


LM Studio ▷ #hardware-discussion (56 messages🔥🔥):

Cloud GPU rental, Multi-GPU setup, RunPod vs AWS/GCP, used 3090, Supermicro boards


Latent Space ▷ #ai-general-chat (114 messages🔥🔥):

OpenAI MCP Support in ChatGPT, MiniMax AI Capabilities, OpenAI Microsoft Tensions, Extend Funding for Document Processing, Gemini 2.5


Latent Space ▷ #ai-announcements (4 messages):

Andrej Karpathy AI Talk, Software 3.0, LLM analogies, LLM Psychology, Partial Autonomy


aider (Paul Gauthier) ▷ #general (77 messages🔥🔥):

Role Prompting, Aider Agents, o1-pro with Aider, Codex Mini, Gemini 2.5 Pro


aider (Paul Gauthier) ▷ #questions-and-tips (5 messages):

Qwen3 30b Moe, VRAM optimization, Selective parameter loading, MoE layer selection


GPU MODE ▷ #general (15 messages🔥):

Groq architecture, DeepSpeed Stage 3 conversion, Groq and HBM absence, Model inference optimizations on GPUs, Model sizing and memory management


GPU MODE ▷ #triton (1 message):

leetgpu problems, reduction kernels, pointwise kernels, cuda limitations


GPU MODE ▷ #cuda (1 message):

cpdurham: I have some guesses but does anyone know why tensor cores are TN?


GPU MODE ▷ #torch (3 messages):

TorchTitan, aot_module, Faketensor, GPU utilization, CPU utilization


GPU MODE ▷ #announcements (1 message):

Dr. Lisa Su shouts out GPU MODE, GPU MODE's humble reading group morphed into a machine


GPU MODE ▷ #beginner (7 messages):

sqrt in distance calculations, wgsl builtin, dot product as an alternative, speculative decoding inference speed, cuTensor map for fp8


GPU MODE ▷ #youtube-recordings (1 message):

debadev: hi


GPU MODE ▷ #off-topic (1 message):

majoris_astrium: Congrats GPU mode


GPU MODE ▷ #irl-meetup (5 messages):

Euro Meetup, Paris Meetup, Kernel Optimization, Public Transport vs Uber, Metro trains


GPU MODE ▷ #rocm (15 messages🔥):

MI300A, MI300X, IOD, Infinity Cache, HBM Stacks


GPU MODE ▷ #liger-kernel (1 message):

Liger Kernel, Debugging Liger Kernel


GPU MODE ▷ #self-promotion (1 message):

Decentralized Training, Google Meet link


GPU MODE ▷ #thunderkittens (3 messages):

Async Instructions, Shared Memory Limitations, TK to AMD Port, Variable Length Attention Kernels, 4090 TARGET Compilation


GPU MODE ▷ #reasoning-gym (1 message):

Chain-of-thought reasoning, CoT on Math and Symbolic Reasoning


GPU MODE ▷ #submissions (2 messages):

VectorAdd Leaderboard, L4 benchmark


GPU MODE ▷ #factorio-learning-env (2 messages):

LLM failure modes, LLMs play Pokemon


GPU MODE ▷ #amd-competition (2 messages):

Open Source Project, GitHub Star


GPU MODE ▷ #cutlass (3 messages):

Mercury API Access, AlphaEvolve Workflow, GEMM Kernels


Yannic Kilcher ▷ #general (32 messages🔥):

Model collapse fearmongering, Lex Fridman interview, Terence Tao interview, DeepSpeed stage 3 checkpoint conversion, Technical YouTube show idea


Yannic Kilcher ▷ #paper-discussion (4 messages):

Gemini v2.5 Technical Report, Predictive Coding


Yannic Kilcher ▷ #ml-news (12 messages🔥):

McKinsey GPT, Cost cutting automation, Sutton's tweet


Notebook LM ▷ #use-cases (7 messages):

NotebookLM capabilities, Website capturing and NBLM import, Podcast creation using NotebookLM, NotebookLM use cases in education


Notebook LM ▷ #general (28 messages🔥):

NotebookLM Access Issues, Podcast Language Adaptation, Notebook Sharing Difficulties, AI for Mechanical Engineering


Torchtune ▷ #general (3 messages):

Torchtune Logo, PyTorch Logo


Torchtune ▷ #dev (32 messages🔥):

Optimizer Design, Muon Optimizer, Packed Batches, Flex Attention, Mistral Tokenizer


Nous Research AI ▷ #general (20 messages🔥):

Kimi-Dev-72B coding LLM, Hyperparameter Determination for Large Models, Custom MCP Tools, Lessons learned dealing with LLMs, Gemini 2.5


Nous Research AI ▷ #ask-about-llms (2 messages):

Reasoning Model Eval Sets, Workload suggestions


Nous Research AI ▷ #research-papers (2 messages):

Real Azure, TNG Technology, Jxmnop X posts


Nous Research AI ▷ #interesting-links (4 messages):

LLM Memory, Resurrection Prompts, Orchestration, Mode Dials, Checkpoints


Nous Research AI ▷ #research-papers (2 messages):

Real Azure, TNG Technology, Jxmnop


Modular (Mojo 🔥) ▷ #general (10 messages🔥):

AVX-512 and Ryzen 7000, Nova Lake, Llama-3.2-Vision-Instruct-unsloth-bnb/4B


Modular (Mojo 🔥) ▷ #mojo (12 messages🔥):

Mojo Open Source, Mojo vs Rust, Mojo Kernel OS, Mojo Classes, Mojo Dynamisity


tinygrad (George Hotz) ▷ #general (10 messages🔥):

tinyxxx GitHub stars, smart-questions FAQ


tinygrad (George Hotz) ▷ #learn-tinygrad (10 messages🔥):

Custom Optimizer, Tensor.assign, TinyJit and .realize(), State Dict Approach


Manus.im Discord ▷ #general (16 messages🔥):

Manus Edu Pass Wishlist, Claude 4 Update, Daily vs Monthly Credits, Webp to PNG conversion, High traffic on Manus


MCP (Glama) ▷ #general (8 messages🔥):

Docker MCP Catalog, FastMCP Custom Transport, Image display via MCP, AWS Lambda & Inspector


MCP (Glama) ▷ #showcase (5 messages):

MCP user analytics, Block's MCP playbook, Attendee MCP server, Spaces MCP server integration, Text-to-GraphQL MCP server


LlamaIndex ▷ #blog (3 messages):

AI Agents in Production SF event, Multi-Agent Financial Analysis System, Model Context Protocol (MCP) servers


LlamaIndex ▷ #general (8 messages🔥):

Vertex AI async streaming, ReActAgent generation


DSPy ▷ #show-and-tell (1 message):

DSPy Optimization Patterns, DSPy Use Cases, DSPy Tooling


DSPy ▷ #general (9 messages🔥):

DSPy LM Usage Tracking, Optimize RAG Agents with DSPy, DSPy Tool Exception Handling


Cohere ▷ #🧵-general-thread (2 messages):

Cmd-R weights update, Open Weight Model Longevity


Cohere ▷ #👋-introduce-yourself (3 messages):

KGeN partnerships, Decentralized distribution protocol, Cohere Community Discord Server introductions


Nomic.ai (GPT4All) ▷ #general (3 messages):

New member introduction, PDF Question Answering


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (1 message):

Sp25 MOOC quiz archive, Quizzes section