Frozen AI News archive

Gemini 3.0 Flash Preview: 1/4 cost of Pro, but ~as smart, retakes Pareto Frontier

**Google** launched **Gemini 3 Flash**, a pro-grade reasoning model with flash latency, supporting tool calling and multimodal IO, available via multiple platforms including Google AI Studio and Vertex AI. It offers competitive pricing at $0.50 per 1M input tokens and $3.00 per 1M output tokens, with context windows up to 1M tokens. Benchmarks show **Gemini 3 Flash** rivals or outperforms larger models like **GPT-5.2** and **Gemini 3 Pro** in agentic, coding, and reasoning tasks, validated by ARC-AGI-2, SWE-bench, LMArena, and Arena benchmarks. Despite some tradeoffs like high token use and hallucination rates, it is cost-effective overall. Key figures include **Sundar Pichai**, **Jeff Dean**, and **Demis Hassabis** who publicly celebrated this achievement. The model's tool calling capabilities were demonstrated with 100 tools in a live demo.
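To make the quoted pricing concrete, here is a minimal sketch of a per-request cost estimate. It assumes flat per-token billing at the two rates above; tiered pricing, caching discounts, and how reasoning tokens are billed are not covered in this summary, so they are ignored here. The helper name `estimate_cost_usd` is hypothetical and not part of any Google SDK.

```python
# Prices quoted in the summary above (USD per 1M tokens).
INPUT_USD_PER_MTOK = 0.50
OUTPUT_USD_PER_MTOK = 3.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: flat, linear cost estimate from token counts.

    Assumption: no pricing tiers, caching discounts, or other adjustments.
    """
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 200k-token prompt with a 10k-token response.
    cost = estimate_cost_usd(200_000, 10_000)
    print(f"${cost:.2f}")  # prints $0.13
```

Because output tokens cost six times as much as input tokens at these rates, the high token use noted above can dominate the bill on long generations.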

Gemini is all you need.

AI News for 12/16/2025-12/17/2025. We checked 12 subreddits, 544 Twitters and 24 Discords (207 channels, and 8313 messages) for you. Estimated reading time saved (at 200wpm): 594 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

When we first started pushing the LLM Pareto frontier a year ago, it was picked up by Jeff Dean and Demis Hassabis, and it wasn't long before Gemini 2.5 conquered it; GPT-5 then claimed it 4 months later. Now Gemini 3.0 is claiming it back, again with Sundar and Jeff loudly trumpeting the accomplishment:

A performance comparison chart of Gemini AI models showing their benchmarks and positioning across different metrics.

Apart from Arenas, this is also validated in academic benchmarks:

A performance comparison chart of AI models across various benchmarks, highlighting Gemini 3 Flash's competitive performance against larger models like Gemini

and ARC AGI has its own chart showing efficiency:

A performance comparison chart of AI models across various benchmarks, highlighting Gemini 3 Flash's competitive positioning against other models like GPT-

Here are some specific breakdown highlights:

A detailed performance comparison table of AI models across various benchmarks, highlighting Gemini 3 Flash's competitive performance against larger models like Gem

Apart from the distillation, the focus here seems to be tool calling. Here is a demo showing 100 tools and more demos from Addy Osmani.


AI Twitter Recap

Gemini 3 Flash launch: frontier intelligence at flash latency (ecosystem, metrics, caveats)

Voice AI and embodied assistants

Training efficiency and MoE systems

Interactive world models, video, and 3D assets

Retrieval, evaluation, and multi‑vector search

Infra and ops for agents

Top tweets (by engagement)


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. 3D Model Generation from Single Image

2. Long-Context AI Model Innovations

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Gemini 3 Flash vs Pro Performance and Benchmarks

2. AI Model Comparisons and Realism Tests

3. AI User Experience and Critiques


AI Discord Recap

A summary of Summaries of Summaries by gpt-5.2

1. Gemini 3 Flash Rollout & Model Shootouts

2. Cost, Pricing Bugs, and the “LLM Tax” Reality

3. Tooling & Standards: MCP Everywhere, Plus a New Completions Spec

4. GPUs, Kernels, and Where the Compute Actually Comes From

5. Training & Data Workflows: From Unsloth CLI to OCR Data Moats


Discord: High level Discord summaries

LMArena Discord


Unsloth AI (Daniel Han) Discord


BASI Jailbreaking Discord


Cursor Community Discord


Perplexity AI Discord


OpenAI Discord


LM Studio Discord


OpenRouter Discord


GPU MODE Discord


Latent Space Discord


Nous Research AI Discord


HuggingFace Discord


Eleuther Discord


Yannick Kilcher Discord


DSPy Discord


Manus.im Discord Discord


Modular (Mojo 🔥) Discord


aider (Paul Gauthier) Discord


tinygrad (George Hotz) Discord


Moonshot AI (Kimi K-2) Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Windsurf Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MCP Contributors (Official) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.



Discord: Detailed by-Channel summaries and links

LMArena ▷ #general (1200 messages🔥🔥🔥):

Gemini 3 Flash, GPT-5.2, Hallucination benchmark, AMD vs Nvidia, Prompt filter lmarena.ai


LMArena ▷ #announcements (2 messages):

Text Arena Leaderboard, GPT-5.2-high, Gemini-3-flash, Vision Arena Leaderboard, WebDev Arena Leaderboard


Unsloth AI (Daniel Han) ▷ #general (736 messages🔥🔥🔥):

Unsloth CLI tool, Colab H100, GRPO memory issues, GGUF Model Update, Training on phones


Unsloth AI (Daniel Han) ▷ #off-topic (524 messages🔥🔥🔥):

Self-Promotion in Discord, Marketing Strategies for AI Services, Model Leaks and Branding, Logitech MX3S Mouse Review, Linux Distro Choice - Arch vs Ubuntu


Unsloth AI (Daniel Han) ▷ #help (60 messages🔥🔥):

Qwen2.5 VL 7B for OCR, Deepseek OCR vs Paddle OCR, Fine-tuning vs Continued Pre-training for OCR, Data Creation for Fine-tuning, Image Resolution and Qwen3 VL Coordinate System


Unsloth AI (Daniel Han) ▷ #showcase (15 messages🔥):

Model Training Dashboard, UX Improvements, funsloth Claude Skill, LLMs as Judges, Progressive Disclosure


Unsloth AI (Daniel Han) ▷ #research (9 messages🔥):

Drag-and-Drop LLMs, Mola, Meta's audiovisual perception paper


BASI Jailbreaking ▷ #general (656 messages🔥🔥🔥):

Jailbreak tax, duck.ai image generator, Deepseek prompts, Indirect Syscall, GPTs agent training


BASI Jailbreaking ▷ #jailbreaking (605 messages🔥🔥🔥):

jailbreak for grok or claude, DAN 6.0 prompt, memory and role-play movie/episode scripts, recreation of jailbreaks, Pliny's tokenbomb, jailbreak for Claude


BASI Jailbreaking ▷ #redteaming (9 messages🔥):

GeminiJack challenge, Redteaming new ChatGPT images, Gemini v3 safety prompt guidelines, Red teaming entry


Cursor Community ▷ #general (862 messages🔥🔥🔥):

Cursor Editor mode, Opus costs, AI-generated websites, Cursor memory leak, BugBot plan limit


Perplexity AI ▷ #general (693 messages🔥🔥🔥):

GPT-5 Pro, Claude Opus API, Max Plan, GPT 5.2 Pro, Extended Thinking Modes


OpenAI ▷ #ai-discussions (433 messages🔥🔥🔥):

Nano Banana Image Generation, GPT-5.2 Performance, Gemini-Flash-3-Image, AI Hallucinations


OpenAI ▷ #gpt-4-discussions (9 messages🔥):

GPT-image-1.5 model, GPT-5-mini costs, ChatGPT PRO high-res option


LM Studio ▷ #general (147 messages🔥🔥):

Model quality perceptions, Benchmarking, Model Recommendations, Quantization levels, LM Studio Plugins for Web Search


LM Studio ▷ #hardware-discussion (246 messages🔥🔥):

Pro 6000 price increase, Zotac 3090 availability and pricing, 4080 32GB vs 3090 Ti for AI, Obsidian setup and sync, AMD Ryzen AI Max+ 395 mini PC for AI


OpenRouter ▷ #announcements (1 messages):

Gemini 3 Flash, OpenRouter, Model Comparison


OpenRouter ▷ #general (128 messages🔥🔥):

Xiaomi mimo v2, Free models to test with tooling, Agent Architecture Routing, Gemini 3 Flash not working, Timeout Errors


OpenRouter ▷ #new-models (1 messages):

Readybot.io: OpenRouter - New Models


OpenRouter ▷ #discussion (165 messages🔥🔥):

Anthropic Compatible API, OpenCompletions RFC, CC Sonnet and Haiku calls, Claude Models' Self Confidence, LLM Minecraft Experiments


GPU MODE ▷ #general (28 messages🔥):

RTX PRO 5000 Blackwell specs, GPU programming career advice, Identity theft targeting ML devs, GPU programming model from the graphics perspective, TMA reduce operation


GPU MODE ▷ #cuda (12 messages🔥):

cuTile vs Triton, GEMM Flops on Blackwell, __pipeline_memcpy_async implementation, CPU differences for B200, DSMEM practical benefits


GPU MODE ▷ #announcements (1 messages):

NVIDIA, cuTile, TileIR, Mehdi Amini, Jared Roesch


GPU MODE ▷ #cool-links (4 messages):

NVIDIA psy-op, Fake elapsed timing, LLMs figuring it out


GPU MODE ▷ #beginner (1 messages):

marksaroufim: Yeah just help them out haha


GPU MODE ▷ #off-topic (2 messages):

Generative AI and Robotics, ROS 2


GPU MODE ▷ #rocm (22 messages🔥):

AMDGPU crashes, ROCm Runtime issues, HIPSPARSELt Availability, NPS Partitioning, RDNA3 Server Hangs


GPU MODE ▷ #self-promotion (6 messages):

Cloud GPUs, MathDx, Julia


GPU MODE ▷ #thunderkittens (1 messages):

kashimoo2_76983: <@1012256135761383465> did you folks write a decode kernel with mi300s or 355s?


GPU MODE ▷ #submissions (19 messages🔥):

NVIDIA leaderboard, histogram_v2 leaderboard, grayscale_v2 leaderboard


GPU MODE ▷ #hardware (30 messages🔥):

MI250 vs MI250X, MI250/MI250X Node, FP8 support, Mining Primes


GPU MODE ▷ #cutlass (1 messages):

drazi1983: We need to update document: we support 3.10 to 3.13 ( double checking 3.14 )


GPU MODE ▷ #nvidia-competition (51 messages🔥):

Cluster Bot Errors, Github Token Rate Limits, CUDA Graph Cheating, NVFP4 GEMM Help, TMEM Bandwidth


GPU MODE ▷ #robotics-vla (15 messages🔥):

Ego centric research, Robot data pretraining benefits, Hand pose estimation, Household data collection dream


GPU MODE ▷ #career-advice (27 messages🔥):

Entry Level Job Search, AI Infra Engineer Demand, HPC Entry Level Challenges, Upskilling Strategies, Community Involvement


Latent Space ▷ #ai-general-chat (68 messages🔥🔥):

Warp Agents, Claude Plugins Marketplace, ChatGPT Image Generation 1.5, OpenAI Fundraising with AWS, AI Agents Controlling Native Android Apps


Latent Space ▷ #private-agents (4 messages):

Google Labs, AI Agent, Gmail Integration


Latent Space ▷ #genmedia-creative-ai (28 messages🔥):

Microsoft TRELLIS 2, UltraFlux VAE, AI Renovation Videos, Voice AI Nuance, Hunyuan 3D 3.0


Nous Research AI ▷ #general (64 messages🔥🔥):

Nous Research tests creative model vs mistral, Fairness of comparing 70B to 24B models, GPT-5.2 robotic templates, Gemini 3 Flash release, LLM writing progress stagnates


Nous Research AI ▷ #ask-about-llms (3 messages):

Handwritten notes to markdown, Deepseek Chandra for OCR


Nous Research AI ▷ #research-papers (1 messages):

Drag-and-Drop LLMs paper


HuggingFace ▷ #general (57 messages🔥🔥):

TTS model benchmarking with lighteval, RLHF positive reward without human feedback, Stopping model training after a set time, Siamese Neural Network achievement, Filtering Spaces with errors


HuggingFace ▷ #i-made-this (2 messages):

FRACTAL-1-3B, Constraint-based protein structure prediction, Android voice assistant


HuggingFace ▷ #gradio-announcements (1 messages):

MCP Hackathon Winners, Gradio Community, AI Creativity


HuggingFace ▷ #agents-course (8 messages🔥):

Debugging Vector Database, AI Agent Study Group, AI/ML beginner courses


Eleuther ▷ #general (27 messages🔥):

Common Crawl Foundation, NSF SBIR proposal, Anubis (Proof of Work) captcha, Deepfake detection and vision–language models, GPT-2 interpretability


Eleuther ▷ #research (4 messages):

VAE viability, Conference paper strategies


Eleuther ▷ #scaling-laws (1 messages):

Saturation in heterogeneous difficulty, Power-law behavior, Internal regulation, Multi-timescale dynamics, Emergence


Eleuther ▷ #interpretability-general (14 messages🔥):

AI decision state and memory inspection, Nanda's view on Mechanical Interpretability, SAE practical value for big companies, Rakuten SAE probes for PII detection, Anthropic's selective gradient masking


Eleuther ▷ #multimodal-general (1 messages):

aeros93: https://arxiv.org/abs/2512.10685


Yannick Kilcher ▷ #general (19 messages🔥):

GPU rental experiences, Gen-AI use cases in admin/IT, In-context learning research


Yannick Kilcher ▷ #ml-news (10 messages🔥):

Noise Isolation, Mistral Small Creative, Debugging AMD GPU, Announcing XINT Code, Gemini 3 Flash


DSPy ▷ #general (26 messages🔥):

MIPROv2 vs GEPA, Benchmarking LLMs for medical tasks, Gemini-3-Flash released, AIMO3 with DSPy, Programs that contain several prompts or several LLM calls


Manus.im Discord ▷ #general (18 messages🔥):

France Meetup, Manus 1.6 Max Discount, Developer Availability, DNS Issue


Modular (Mojo 🔥) ▷ #general (10 messages🔥):

BlockseBlock sponsorship, GPU functions


Modular (Mojo 🔥) ▷ #mojo (3 messages):

GPU issue in graph library, API regression in Mojo, Build LLM in MAX from scratch


aider (Paul Gauthier) ▷ #general (2 messages):

AI System Design Principles, Deterministic vs Probabilistic AI, Model Observability and Replaceability


aider (Paul Gauthier) ▷ #questions-and-tips (4 messages):

MCP Servers in Aider, Token Usage with Qwen3-coder-30b, IDE Index MCP Server


tinygrad (George Hotz) ▷ #general (5 messages):

Bounty questions, Smart questions html, Device CPU


Moonshot AI (Kimi K-2) ▷ #general-chat (3 messages):

Kimi K2, DigitalOcean Article