Frozen AI News archive

OpenAI GPT Image-1.5 claims to beat Nano Banana Pro, #1 across all Arenas, but completely fails Vibe Checks

**OpenAI** released its new image model **GPT Image 1.5**, featuring precise image editing, better instruction following, improved text and markdown rendering, and up to 4× faster generation. Despite topping multiple leaderboards, including **LMArena (1277)**, **Design Arena (1344)**, and **AA Arena (1272)**, user feedback from Twitter, Reddit, and Discord communities is largely negative in comparison with **Nano Banana Pro** by **Gemini**. Xiaomi introduced **MiMo-V2-Flash**, a **309B MoE** model optimized for inference efficiency with a **256K context window**, achieving state-of-the-art scores on SWE-Bench. The model uses Hybrid Sliding Window Attention and multi-token prediction, offering significant speedups and efficiency improvements. The timing of OpenAI's launch, amid competition from Gemini and Nano Banana Pro, weighs on user sentiment and raises questions about how well Arena benchmarks reflect real user preferences.

A rare miss at a rough time.

AI News for 12/15/2025-12/16/2025. We checked 12 subreddits, 544 Twitters and 24 Discords (207 channels, and 10501 messages) for you. Estimated reading time saved (at 200wpm): 734 minutes. Our new website is now up with full metadata search and a beautiful vibe-coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

The headline details of OpenAI's new image model are good: precise image editing, executing creative ideas, better instruction following, much better text and markdown rendering, fixes for obvious bugs in the old gpt-image-1, and even voluntary disclosure of known regressions in the model. It also scores 1277 on LMArena, 1344 on Design Arena, and 1272 on AA Arena, all #1 spots.

[Figure: bar graph showing OpenAI's GPT-Image-1.5 model ranking #1 on the Image Arena leaderboard]

BUT: the compliments stop there. Almost universally, the vibe checks from Twitter, Reddit, and the various Discord communities are negative in comparison with Nano Banana Pro. The progress from GPT-Image-1 is clear enough, so this is not so much a knock on OpenAI overall as a rough showing for confidence that Arena benchmarking represents what serious users actually prefer.

The context and timing matter for those who care about the blow-by-blow of the capability race. If they had shipped this before NBP, or if there were no overhanging narrative of a "Code Red" in light of Gemini competition, Image-1.5 would've been a fine launch. Now the vibes are off.


AI Twitter Recap

Xiaomi’s MiMo‑V2‑Flash: 309B MoE built for speed, long context, and SWE‑Bench SOTA

Image generation shake‑up: OpenAI’s GPT Image 1.5 (“ChatGPT Images”) and FLUX.2 Max

Open‑source push from NVIDIA: Nemotron‑Cascade and broader availability of Nemotron 3

Benchmarks for factuality and science: FACTS and FrontierScience

Serving and agent infra: KV‑aware routing, P/D disaggregation, control planes

Multimodal/audio/3D: open releases and fast view synthesis

Top tweets (by engagement)


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. Meta SAM Audio Model Launch

2. OpenAI Internal Discussions on AI Openness

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. OpenAI GPT-Image-1.5 Release and Benchmarks

2. Claude Code Updates and Applications

3. AI in Personal and Social Contexts


AI Discord Recap

A summary of Summaries of Summaries by gpt-5

1. OpenAI Image 1.5 Launch & Model Matchups

2. OpenRouter’s New Models & Spec Push

3. Audio AI: Segment, Perceive, and Speak

4. Jailbreaks, RLHF, and Red-Team Gauntlets

5. Evals, Routing, and Agent Reality Checks


Discord: High level Discord summaries

BASI Jailbreaking Discord


LMArena Discord


Unsloth AI (Daniel Han) Discord


Cursor Community Discord


OpenAI Discord


Perplexity AI Discord


LM Studio Discord


OpenRouter Discord


HuggingFace Discord


GPU MODE Discord


Latent Space Discord


Nous Research AI Discord


Moonshot AI (Kimi K-2) Discord


Eleuther Discord


Manus.im Discord Discord


aider (Paul Gauthier) Discord


tinygrad (George Hotz) Discord


DSPy Discord


MLOps @Chipro Discord


MCP Contributors (Official) Discord


The Modular (Mojo 🔥) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Windsurf Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


Discord: Detailed by-Channel summaries and links

BASI Jailbreaking ▷ #general (1262 messages🔥🔥🔥):

GPT-5 mini, ChatGPT System Message, jailbreak prompt


BASI Jailbreaking ▷ #jailbreaking (56 messages🔥🔥):

Claude 4.5 Jailbreak, RLHF builds model character, Prompt Injection, Jailbreaking Drones, DeepSeek Jailbreak


BASI Jailbreaking ▷ #redteaming (27 messages🔥):

Jailbreaking Resources for Beginners, Attempting to break GitHub repo, Red Teaming Advice, GeminiJack Styled Challenge


LMArena ▷ #general (975 messages🔥🔥🔥):

GPT Image 1.5, Image editing, Nano Banana Pro, Gemini 3, Model Performance


LMArena ▷ #announcements (3 messages):

YouTube channel launch, December AI Generation Contest, Image Leaderboard Update, New image models


Unsloth AI (Daniel Han) ▷ #general (787 messages🔥🔥🔥):

GRPO vs DPO, GLM models in Chinese, Nemotron vs Qwen, Unsloth GPU requirements, llama.cpp for windows


Unsloth AI (Daniel Han) ▷ #off-topic (499 messages🔥🔥🔥):

H100 on Colab, RunPod restrictions, Grad spikes and reward issues, Gemma models, SAM Audio License


Unsloth AI (Daniel Han) ▷ #help (82 messages🔥🔥):

GPT4ALL, Text Translation, XFormers and Unsloth, Qwen3 fine-tuning, Vision Language OCR


Unsloth AI (Daniel Han) ▷ #research (2 messages):

AudioVisual Perception, Large Scale Multimodal Correspondence Learning


Cursor Community ▷ #general (1039 messages🔥🔥🔥):

Cursor API timeouts, GPTs Agents, OpenAI's sidebars, Text Expander, Cursor Billing Issues


OpenAI ▷ #annnouncements (4 messages):

Branched Chats, FrontierScience Eval, ChatGPT Images


OpenAI ▷ #ai-discussions (698 messages🔥🔥🔥):

Gemini vs GPT image generation, Nano Banana Pro for image generation, Sora 2 access and limitations, Midjourney vs Nano Banana Pro


OpenAI ▷ #gpt-4-discussions (87 messages🔥🔥):

GPT-5.2 Issues, GPTs guardrails and safety, Blame shifting, GPTs follow-up questions, Adult Mode


Perplexity AI ▷ #general (845 messages🔥🔥🔥):

GPT-5.2 Pro vs Claude 4.5 Opus, Perplexity Pro limitations, Microsoft's small models, Perplexity image generation


LM Studio ▷ #general (72 messages🔥🔥):

Slow download finalization, Vision models not showing images, Nvidia Nemo 3 on LM Studio, GGUF vs non-GGUF models, LM Studio as Ollama server


LM Studio ▷ #hardware-discussion (167 messages🔥🔥):

Graphics card seating, Pro 6000 price increase, Zotac 3090 deals, 4080 32GB vs 3090 Ti, Obsidian setup and sync


OpenRouter ▷ #announcements (3 messages):

Xiaomi MiMo-V2-Flash, Mistral Small Creative, Black Forest Lab's FLUX.2 [max]


OpenRouter ▷ #general (111 messages🔥🔥):

Gemini API Usage, Daily Limit Upgrade, Long-Term Roleplay Models, Payment Declined, Baidu Model Evaluation


OpenRouter ▷ #new-models (4 messages):


OpenRouter ▷ #discussion (93 messages🔥🔥):

OpenRouter Minecraft Server, OpenRouter Labubu, Claude Code models, Standardized Completions/Responses, Normalized schema


HuggingFace ▷ #general (112 messages🔥🔥):

FSDP Upcast Warning, Vibe CAD Research, Microsoft VibeVoice, Fine-tuning LLMs for Summarization, Kiln.tech


HuggingFace ▷ #i-made-this (4 messages):

Confession as diagnostic method for LLMs, Zenflow live, Qwen 360 Diffusion release, Cognitive-Proxy steering LLMs


HuggingFace ▷ #gradio-announcements (3 messages):

MCP 1st Birthday Hackathon Winners, Hackathon Participation Certificates, Track 2 Winners


HuggingFace ▷ #agents-course (25 messages🔥):

Smol course offering, Deep reinforcement learning course, Box2D dependency issue, LLM and Langchain package versions, Vector database troubleshooting


GPU MODE ▷ #general (15 messages🔥):

Paper Reading Groups, RTX PRO 5000 Blackwell Specs, GPU Programming Career Advice, Scam Bot Targeting ML Devs


GPU MODE ▷ #cuda (4 messages):

cuTile Advantages, cuTile vs Triton, cuTile GEMM Flops on Blackwell


GPU MODE ▷ #cool-links (9 messages🔥):

TMEM's Dedicated Arbitration Logic, NVIDIA psy-op, ldmatrix.x4


GPU MODE ▷ #beginner (4 messages):

Working Groups, Open Projects


GPU MODE ▷ #off-topic (2 messages):

Job Search, Discord Communities for Job seekers, Networking for AI Jobs


GPU MODE ▷ #rocm (34 messages🔥):

ROCm 7.1, FBGEMM library broken, NPS partitioning crashes, kernel module problems


GPU MODE ▷ #self-promotion (10 messages🔥):

CUDA Kernel Naming, HMMA vs HFMA2.MMA, Register Moves in PTXAS, Cloud GPU marketplace


GPU MODE ▷ #submissions (4 messages):

nvfp4_gemm leaderboard, NVIDIA performance


GPU MODE ▷ #hardware (2 messages):

MI250, MI250X, Server Compatibility


GPU MODE ▷ #cutlass (5 messages):

Cute DSL, CUTLASS, Python 3.10, MMA Tiling


GPU MODE ▷ #teenygrad (1 message):

LambdaLabs Grant, H100 Hours, SITP Textbook


GPU MODE ▷ #nvidia-competition (23 messages🔥):

NVFP4 GEMM, Kernel 2, cutlass.pipeline error, application did not respond


Latent Space ▷ #ai-general-chat (51 messages🔥):

Vibe CAD research from MIT DeCoDE Lab, Sakana's iconographic-language models, AntiGravity Performance Issues, OpenAI Router Rollback, New Warp Agents


Latent Space ▷ #private-agents (4 messages):

Google CC agent, Gmail AI productivity


Latent Space ▷ #genmedia-creative-ai (14 messages🔥):

WAN 2.6, Chatterbox Turbo, Meta SAM Audio


Nous Research AI ▷ #general (60 messages🔥🔥):

Local LLMs implementation, Non-language models for waveform analysis, Nvidia's dominance in GPU market, Meta's SAM Audio, Mistral creative model


Nous Research AI ▷ #research-papers (2 messages):

Byte Level LLMs


Moonshot AI (Kimi K-2) ▷ #announcements (1 message):

Kimi paid users, Kimi 30 minute chat


Moonshot AI (Kimi K-2) ▷ #general-chat (29 messages🔥):

Kimi Models Text-Only vs. Context, Kimi Non-Thinking Model, K2-Thinking Performance, Kimi Pricing and Availability, K2 Thinking Turbo


Eleuther ▷ #general (4 messages):

Synthema meta-language, Polyreflexeme theory, Algoverse AI research program, NSF SBIR proposal


Eleuther ▷ #research (11 messages🔥):

GAN, Research Collaboration, Paper Publishing


Eleuther ▷ #scaling-laws (1 message):

uwu1468548483828484: is this wrong or right


Eleuther ▷ #interpretability-general (7 messages):

Superweights Impact, Attention Interpretation, Anthropic Circuits, Causal Head Gating


Manus.im Discord ▷ #general (12 messages🔥):

Manus 1.6 Release, AI & Full-Stack Engineer


aider (Paul Gauthier) ▷ #general (10 messages🔥):

OpenAI GPT-5, Aider Active Innovation, Aider copy-paste mode without LLMs, Aider Vision/Plans, Aider and interleaved reasoning tool calling


aider (Paul Gauthier) ▷ #links (1 message):

Zenflow launch, Agent workflows


tinygrad (George Hotz) ▷ #general (2 messages):

AI pull requests policy, Understanding AI-generated code


DSPy ▷ #show-and-tell (1 message):

justanotheratom: https://www.elicited.blog/posts/dspy-strategy-and-program


MLOps @Chipro ▷ #events (1 message):

ggdupont: Anyone know about GenAI zurich conference? is it any good?


MCP Contributors (Official) ▷ #general-wg (1 message):

Missing Thread Response, Contributor Apology