Frozen AI News archive

gpt-image-1 - ChatGPT's imagegen model, confusingly NOT 4o, now available in API

**OpenAI** officially launched the **gpt-image-1** API for image generation and editing, supporting features like alpha channel transparency and a "low" content moderation policy. **OpenAI's** models **o3** and **o4-mini** are leading in benchmarks for style control, math, coding, and hard prompts, with **o3** ranking #1 in several categories. A new benchmark called **Vending-Bench** reveals performance variance in LLMs on extended tasks. **GPT-4.1** ranks in the top 5 for hard prompts and math. **Nvidia's** **Eagle 2.5-8B** matches **GPT-4o** and **Qwen2.5-VL-72B** in long-video understanding. AI supercomputer performance doubles every 9 months, with **xAI's Colossus** costing an estimated $7 billion and the US dominating 75% of global performance. The Virology Capabilities Test shows **OpenAI's o3** outperforms 94% of expert virologists. **Nvidia** also released the **Describe Anything Model (DAM)**, a multimodal LLM for detailed image and video captioning, now available on Hugging Face.


Autoregressive Imagegen is all you need.

AI News for 4/22/2025-4/23/2025. We checked 9 subreddits, 449 Twitters and 29 Discords (213 channels, and 6203 messages) for you. Estimated reading time saved (at 200wpm): 503 minutes. You can now tag @smol_ai for AINews discussions!

When Imagegen launched, it was specifically branded as a capability of GPT-4o. With the Ghibli wave, everyone rushed to create convoluted browser automations to "apify" a nonexistent imagegen API.

Now the official API is here (docs, cost), capable of new generations (using references) as well as partial/full image editing (using masks).

https://cdn.openai.com/API/docs/images/images-gallery/furniture-poster.png
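As a rough illustration of what calling the new endpoint might look like, here is a minimal Python sketch. It assumes the official `openai` Python package and an `OPENAI_API_KEY` environment variable. The parameter names (`background`, `output_format`, `moderation`, `mask`) and the base64 response field reflect our reading of the public docs, not anything confirmed in this issue, so check them against the API reference before relying on them. The helper names are hypothetical.

```python
import base64
from typing import Any


def build_generation_request(prompt: str, transparent: bool = False,
                             low_moderation: bool = False) -> dict[str, Any]:
    """Assemble keyword arguments for an image generation call (hypothetical helper)."""
    params: dict[str, Any] = {"model": "gpt-image-1", "prompt": prompt}
    if transparent:
        # Assumption: transparency is requested via `background` and needs
        # an output format that carries an alpha channel, such as PNG.
        params["background"] = "transparent"
        params["output_format"] = "png"
    if low_moderation:
        # Assumption: the "low" moderation policy is selected via `moderation`.
        params["moderation"] = "low"
    return params


def decode_image(b64_payload: str) -> bytes:
    """Turn the base64 payload from a response into raw image bytes."""
    return base64.b64decode(b64_payload)


def generate(prompt: str, out_path: str) -> None:
    """Generate one image and write it to out_path. Makes a network call."""
    from openai import OpenAI  # imported lazily so the helpers work without the SDK

    client = OpenAI()
    result = client.images.generate(
        **build_generation_request(prompt, transparent=True, low_moderation=True)
    )
    with open(out_path, "wb") as fh:
        fh.write(decode_image(result.data[0].b64_json))


def edit_with_mask(prompt: str, image_path: str, mask_path: str, out_path: str) -> None:
    """Edit the masked region of an image. Makes a network call."""
    from openai import OpenAI

    client = OpenAI()
    with open(image_path, "rb") as image, open(mask_path, "rb") as mask:
        result = client.images.edit(
            model="gpt-image-1", image=image, mask=mask, prompt=prompt
        )
    with open(out_path, "wb") as fh:
        fh.write(decode_image(result.data[0].b64_json))
```

Only `generate` and `edit_with_mask` touch the network; the two helpers are plain functions, so the request shape can be inspected before any credits are spent.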

The API supports alpha channel transparency and, in a first for OpenAI, a "low" content moderation policy, as well as (as Kevin Weil notes):


AI Twitter Recap

Language Models and Performance

New Models and Releases

Research and Papers

AI Agents and Tooling

ML Engineering and Deployment

Other

Humor


AI Reddit Recap

/r/LocalLlama Recap

1. New Vision-Language Model and Benchmark Releases (Meta PLM, SkyReels-V2)

2. DeepSeek Model Architecture Educational Series

3. Portable LLM Utilities and User Experiences

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Anthropic Claude AI Analysis and Workplace Autonomy Predictions

2. OpenAI o3/o4-mini Performance and Benchmarks

3. Recent Text-to-Video Model Launches and Community Reviews


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.5 Pro Exp

Theme 1: Model Mania - New Releases and API Rollouts

Theme 2: Platform Power-Ups and Integration Innovations

Theme 3: Under the Hood - Kernels, Quantization & Attention

Theme 4: Benchmark Brouhahas and Performance Puzzles

Theme 5: User Friction - Bugs, Limits, and Login Lockouts


PART 1: High level Discord summaries

Perplexity AI Discord


LMArena Discord


Manus.im Discord Discord


OpenRouter (Alex Atallah) Discord


Unsloth AI (Daniel Han) Discord


LM Studio Discord


aider (Paul Gauthier) Discord


Eleuther Discord


GPU MODE Discord


Cursor Community Discord


OpenAI Discord


Notebook LM Discord


HuggingFace Discord


Yannick Kilcher Discord


Modular (Mojo 🔥) Discord


Latent Space Discord


LlamaIndex Discord


MCP (Glama) Discord


tinygrad (George Hotz) Discord


DSPy Discord


Torchtune Discord


Nous Research AI Discord


LLM Agents (Berkeley MOOC) Discord


Cohere Discord


Gorilla LLM (Berkeley Function Calling) Discord


MLOps @Chipro Discord


The Codeium (Windsurf) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Nomic.ai (GPT4All) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Perplexity AI ▷ #announcements (1 message):

iOS Voice Assistant, Multi-app Actions, Mobile App Update


Perplexity AI ▷ #general (1104 messages🔥🔥🔥):

Perplexity AI Terms of Service, James Webb telescope, Perplexity comet release date, R1o4-mini vs grok 3 vs gemini 2.5 pro vs claude, Raycast a web app


Perplexity AI ▷ #sharing (3 messages):

Perplexity AI, comprehensive 10000


Perplexity AI ▷ #pplx-api (11 messages🔥):

API key revoke, Requests per API, Office hours


LMArena ▷ #general (1097 messages🔥🔥🔥):

Llama games LM arena, emoji, GPT-4.1, small cheap models, arc prize


Manus.im Discord ▷ #general (590 messages🔥🔥🔥):

Manus Pricing, DeepSeek vs Manus, Genspark, Credits, OpenAI vs Manus


OpenRouter (Alex Atallah) ▷ #announcements (8 messages🔥):

Sonnet 3.7 capacity issues, Clerk authentication delays, OpenRouter PDF support, PDF processing engines, Gemini API PDF Input


OpenRouter (Alex Atallah) ▷ #general (259 messages🔥🔥):

Tool Calling Limitations, Google Gemma Quantization, Gemini Search Grounding, Account Creation Issues, Free Model Function Calling


Unsloth AI (Daniel Han) ▷ #general (172 messages🔥🔥):

Scout use cases with 128GB unified memory, Unsloth support for GLM-4 9B/32B models, Unsloth Dynamic v2.0 quants release, Torch 2.7 release and its changes, Evaluation benchmarks: MMLU, HumanEval, Aider Polyglot


Unsloth AI (Daniel Han) ▷ #off-topic (6 messages):

Token processing speed, PyTorch benchmarks


Unsloth AI (Daniel Han) ▷ #help (53 messages🔥):

GRPO Models, Classification task using numbered labels, Llama-4 finetuning updates, Embedding model recommendations, Fine-tuning Llama for Vtuber


Unsloth AI (Daniel Han) ▷ #research (27 messages🔥):

Reasoning Models, LLM Novelty, Training Data Limitations, Sampling Token Sequences


LM Studio ▷ #general (170 messages🔥🔥):

SillyTavern as front end for LM Studio, Pinokio install, 5090 loading issues on LM Studio, Cuda 12, Bitnet CPP


LM Studio ▷ #hardware-discussion (36 messages🔥):

RTX 5060 Ti 16GB, Macbook for LLM, 5090 finetuning, DDR3 for AI, Smol models


aider (Paul Gauthier) ▷ #general (171 messages🔥🔥):

Gemini, Ollama, Aider Benchmarks, Cursor IDE, Deepseek R2


aider (Paul Gauthier) ▷ #questions-and-tips (30 messages🔥):

Aider exclude yes/no responses, Load context readonly, Good model combinations, Aider leaderboard, Gemma 27b image


Eleuther ▷ #general (12 messages🔥):

RL Agents, Continuous Signs, Multi-Agent RL, Communication Protocols


Eleuther ▷ #research (80 messages🔥🔥):

Linear representation hypothesis debunked, Biologically inspired architecture for sequential modeling, Native Sparse Attention analysis, Overfitting models to single datapoints, AI-generated research papers


Eleuther ▷ #interpretability-general (75 messages🔥🔥):

Multihead Latent Attention (MLA), DeepSeek, RWKV architecture, Residual Stream Subspaces


GPU MODE ▷ #general (1 message):

PyTorch, flashStream, Kernels, Leaderboard


GPU MODE ▷ #triton (11 messages🔥):

Triton FP4 support, FP16 to FP4 conversion, TileLang for FP4, FP4 vs INT4 benchmarks, Pyright and Triton issues


GPU MODE ▷ #cuda (1 message):

RightNow AI, CUDA kernels, browser coding, bottleneck analysis


GPU MODE ▷ #torch (1 message):

marksaroufim: would love some feedback on https://github.com/pytorch/pytorch/issues/152032


GPU MODE ▷ #cool-links (1 message):

MLA kernel, Compute bound inference


GPU MODE ▷ #beginner (3 messages):

ncu, import-source, app-range, collective op


GPU MODE ▷ #torchao (27 messages🔥):

torch.compile static cache, Qwen2.5-3B performance, fp16 performance, vLLM's compression kernel


GPU MODE ▷ #irl-meetup (2 messages):

PyTorch ATX Meetup, Triton, Austin, Red Hat, Intel


GPU MODE ▷ #tilelang (3 messages):

TileLang, CUDA, Triton


GPU MODE ▷ #gpu模式 (1 message):

Tensor Parallelism, Static Split-K


GPU MODE ▷ #submissions (32 messages🔥):

A100 Grayscale, AMD MI300 FP8, AMD MI300 Identity, L4 Grayscale, H100 Grayscale


GPU MODE ▷ #status (11 messages🔥):

AMD, Code Server Access, Leaderboard, Profiling


GPU MODE ▷ #amd-competition (63 messages🔥🔥):

HIP vs Inline, Registration Confirmation Delay, AMD Employee Leaderboard Visibility, Submission File Limitations, Numpy Error


GPU MODE ▷ #cutlass (3 messages):

Modal.com credits, CUDA fp6 type, CU_TENSOR_MAP_DATA_TYPE_16U6_ALIGN16B format


Cursor Community ▷ #general (149 messages🔥🔥):

o4-mini errors, keybindings breaking, Gemini and Claude combos, Cursor slowdowns, Windsurf vs Cursor


OpenAI ▷ #annnouncements (1 message):

GPT Image 1, Image Generation API


OpenAI ▷ #ai-discussions (97 messages🔥🔥):

Gemini 2.5 Pro vs Gemini 2.5 Flash, AI replacing jobs, Sora ETA, o3 struggling with high school geometry, ChatGPT app vs webapp


OpenAI ▷ #gpt-4-discussions (7 messages):

AI Model Mistakes, Plus Plan Chats and Memories, GPT Image 1


OpenAI ▷ #prompt-engineering (7 messages):

AI fiction writing assistant, AI for tax assistance, AI Python coding assistant, Defining interesting story prompts


OpenAI ▷ #api-discussions (7 messages):

AI Story Prompts, Defining 'Interesting' Prompts, Realistic Stories Across Genres


Notebook LM ▷ #use-cases (14 messages🔥):

NLM for Exam Prep with Anki, NLM and Client Test Results, NLM German Prompts, NotebookLM Data Training, NotebookLM Long Overviews


Notebook LM ▷ #general (75 messages🔥🔥):

NotebookLM math support, Gemini 2.5 Pro, Audio Overview language support, PDF handling in NotebookLM, Grounding and search in AI models


HuggingFace ▷ #general (67 messages🔥🔥):

Agents vs Humans, Hugging Face Spaces Issue, Llama 3 Chat Template, Fine-tuning Llama for VTuber, Continuous Pretraining Datasets


HuggingFace ▷ #today-im-learning (1 message):

Pomodoro Technique, Time management


HuggingFace ▷ #cool-finds (3 messages):

Model Size Disclosure, YouTube Channel for Fine-tuning Tutorials


HuggingFace ▷ #i-made-this (5 messages):

AI-Powered Document Q&A Project, LLM Fundamentals for Cybersecurity, Resume Matching App


HuggingFace ▷ #NLP (1 message):

Embedding Models for Short Contexts, QA Embedding Pairs, Context Length Optimization


HuggingFace ▷ #agents-course (8 messages🔥):

Arxiv paper deadline extension, Loops and Branches Notebook Bug, Agents course joining, HuggingFace course credit issue


Yannick Kilcher ▷ #general (72 messages🔥🔥):

Brain processing locality, Saturday Paper Discussions recordings, Anthropic's recent paper, Hebbian theory vs Brain Physics, mental model vs world model


Yannick Kilcher ▷ #paper-discussion (3 messages):

Muon, Adam replacement, reverse Distillation


Yannick Kilcher ▷ #ml-news (3 messages):

Links to YouTube videos


Modular (Mojo 🔥) ▷ #general (8 messages🔥):

Zed Project Diagnostics, Modular Meetup, MAX/MOJO License


Modular (Mojo 🔥) ▷ #mojo (43 messages🔥):

Mojo Training Pipelines, Mechanical Migrator Tool, Pythonic Mojo Design Tradeoffs, Zero-Cost Abstraction in Mojo, Enums in Mojo


Latent Space ▷ #ai-general-chat (39 messages🔥):

Brainy RTX 4090 AI supercomputer, OAI image gen in API, Tinybox competitor, Scout.new cooking


Latent Space ▷ #ai-announcements (1 message):

swyxio: new lightning pod https://youtu.be/aDiEQngFsFU


LlamaIndex ▷ #blog (2 messages):

LlamaIndex Milvus full-text search, Agentic Document Workflow


LlamaIndex ▷ #general (35 messages🔥):

LlamaParse getText() issue, Document hash computation, Passing userID to MCP tools, MLflow autolog with Llamaindex and FastAPI, Workflows Checkpoints usage and alteration


LlamaIndex ▷ #ai-discussion (3 messages):

Instruction-Finetuning LLMs, TRL for Finetuning, Memory Constraints for LLMs


MCP (Glama) ▷ #general (26 messages🔥):

MCP Interview, README Translation Automation, AWSLab Cost Analysis MCP Server Issue, MCP Inspector Timeout Error, Cursor MCP Tool Error


MCP (Glama) ▷ #showcase (4 messages):

MCP Server, Klavis AI Eval Platform, Browser Extension for MCP, Siloed AI Drag and Drop


tinygrad (George Hotz) ▷ #general (21 messages🔥):

arithmetic shift right op, UPat matching a CONST, multiple patterns matching, instruction ordering and register assignment, closures reconsideration


tinygrad (George Hotz) ▷ #learn-tinygrad (6 messages):

Arange Optimization, Indexed Operations for STs, UOps and Buffers relationship


DSPy ▷ #show-and-tell (1 message):

dbreunig: https://www.dbreunig.com/2025/04/18/the-wisdom-of-artificial-crowds.html


DSPy ▷ #general (22 messages🔥):

DSPy 3.0, Synthetic Flywheel, Prompt Optimization, Databricks event SFO


Torchtune ▷ #dev (8 messages🔥):

RoPE implementation, Collective scheduling, Tune cp workflow, Library design


Torchtune ▷ #rl (1 message):

Future Meeting


Nous Research AI ▷ #general (8 messages🔥):

Tool Integrations with SaaS Platforms, New Model Release, SuperNova Models by Arcee-AI


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (3 messages):

Resource Submission Form, Team Name


LLM Agents (Berkeley MOOC) ▷ #mooc-readings-discussion (2 messages):

MOOC readings, LLMAgents-learning.org


Cohere ▷ #「💡」projects (1 message):

Hugging Face Inference API, Flask Website Integration, Model Deployment


Cohere ▷ #「🤝」introductions (2 messages):

Hugging Face, Flask, Model uploading


Gorilla LLM (Berkeley Function Calling) ▷ #leaderboard (3 messages):

Debugging Handler Errors, Code Modification Suggestions


MLOps @Chipro ▷ #events (1 message):

Legislative AI/Tech Webinar, BillTrack50, AI4Legislation competition