Frozen AI News archive

OpenAI o3, o4-mini, and Codex CLI

**OpenAI** launched the **o3** and **o4-mini** models, emphasizing improvements in **reinforcement-learning scaling** and overall efficiency that make **o4-mini** cheaper and better across the metrics OpenAI has prioritized. The models show stronger **vision** and **tool use** capabilities, though API access for these features is pending. The release includes **Codex CLI**, an open-source coding agent that works with these models to turn natural language into working code. The models are available to **ChatGPT Plus, Pro, and Team users**; **o3** is notably more expensive than **Gemini 2.5 Pro**. Performance benchmarks highlight the intelligence gains from scaling inference, with comparisons against models like **Sonnet** and **Gemini**. The launch has been well received despite some less flattering evaluation results.

AI News for 4/15/2025-4/16/2025. We checked 9 subreddits, 449 Twitters and 29 Discords (211 channels, and 9942 messages) for you. Estimated reading time saved (at 200wpm): 782 minutes. You can now tag @smol_ai for AINews discussions!

As hinted on Monday, OpenAI launched the awkwardly named o3 and o4-mini in a classic livestream, together with a blogpost and a system card:

https://www.youtube.com/watch?v=sq8GBPUb3rk

The general message is improvements in both RL scaling and overall efficiency, making o4-mini cheaper yet better across the metrics that OAI has prioritized, vs the previous generation, with much better vision and much better tool use, though these are not yet available in the API.

Dan Shipper has a good qualitative review.

The system card shows slightly less flattering evals, but overall the launch has been very well received.

The "one more thing" was Codex CLI, which one-upped Claude Code (our coverage here) by being fully open source:

https://www.youtube.com/watch?v=FUq9qRwrDrI&t=6s


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

New Model Releases and Updates (o3, o4-mini, GPT-4.1, Gemini 2.5 Pro, Seedream 3.0)

Agentic Web Scraping with FIRE-1 and OpenAI's CodexCLI

Agent Implementations and Tool Use

Video Generation and Multimodality (Veo 2, Kling AI, Liquid)

Interpretability and Steering Research

Tools and Frameworks for LLM Development

Humor/Memes


AI Reddit Recap

/r/LocalLlama Recap

1. Recent OpenAI and Third-Party Model Releases

2. Large-Scale Model Training and Benchmarks

3. Community Projects and Hardware Setups

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. OpenAI o3 and o4-mini Model Launch and Discussion

2. OpenAI o3/o4 vs Gemini Benchmarks and Comparisons

3. HiDream & ComfyUI Model Updates and Tools


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.0 Flash Exp

Theme 1: OpenAI's New Models: O3, O4-Mini, and Codex CLI

Theme 2: Emerging Hardware and Performance Challenges

Theme 3: Gemini 2.5 Pro and Related API Discussions

Theme 4: DeepSeek Models and Latent Attention

Theme 5: Community and Ethical Discussions


PART 1: High level Discord summaries

LMArena Discord


Manus.im Discord Discord


aider (Paul Gauthier) Discord


OpenRouter (Alex Atallah) Discord


OpenAI Discord


Cursor Community Discord


Unsloth AI (Daniel Han) Discord


Eleuther Discord


GPU MODE Discord


Latent Space Discord


Yannick Kilcher Discord


HuggingFace Discord


MCP (Glama) Discord


Nous Research AI Discord


LM Studio Discord


Notebook LM Discord


Modular (Mojo 🔥) Discord


Nomic.ai (GPT4All) Discord


Torchtune Discord


LlamaIndex Discord


Cohere Discord


LLM Agents (Berkeley MOOC) Discord


tinygrad (George Hotz) Discord


DSPy Discord


Codeium (Windsurf) Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

LMArena ▷ #general (1266 messages🔥🔥🔥):

OpenAI's o3 and o4 mini models, Codex CLI experiment, GeoGuessr performance, Qwen 3 model, Tool Use


Manus.im Discord ▷ #showcase (2 messages):

Data Processing Website, Data Quality Discussions


Manus.im Discord ▷ #general (889 messages🔥🔥🔥):

Manus credits, Kling image-gen, Community building, Copilot, AI Ethics


aider (Paul Gauthier) ▷ #general (664 messages🔥🔥🔥):

Aider Coding Jokes, Breaking ToS Discussion, Context Compression Techniques, Gemini 2.5 Pro Limitations, OpenAI o3 and o4 Mini Release


aider (Paul Gauthier) ▷ #questions-and-tips (27 messages🔥):

Gemini structured output, Aider token handling, Aider Color Customization, Context-caching for Gemini 2.5 Pro, Aider Interruptions and File Additions


aider (Paul Gauthier) ▷ #links (6 messages):

Building Agents, Stack evolution, Reasonable take on new models, OpenAI Codex


OpenRouter (Alex Atallah) ▷ #announcements (10 messages🔥):

OpenAI o3, OpenAI o4-mini, Activity chart filtering, Chatroom SVG previews, Terms of service update


OpenRouter (Alex Atallah) ▷ #general (560 messages🔥🔥🔥):

Gemini 2.5 Pro rate limits, Privacy Policy concerns at OpenRouter, OpenAI O3 and O4-mini models, Deepseek models R3 and R4 releases, O4 Mini problems with image recognition


OpenAI ▷ #annnouncements (3 messages):

ChatGPT image library, OpenAI Livestream, OpenAI o3 and o4-mini


OpenAI ▷ #ai-discussions (386 messages🔥🔥):

LLM Hallucinations, Testing O3 vs. O4 Models, Data Integrity in API Calls, AI's Role in Content Creation, Gemini 2.5 Pro vs. OpenAI Models


OpenAI ▷ #gpt-4-discussions (24 messages🔥):

GPT-4.1-batch availability, GPT can make hands, Models accessing URLs, Deleting pictures from the library, GPT-4 retirement and custom GPTs


OpenAI ▷ #prompt-engineering (2 messages):

Image Prompts, VS bag, Kohls


OpenAI ▷ #api-discussions (2 messages):

Image Prompt, VS Bag, Kohls


Cursor Community ▷ #general (374 messages🔥🔥):

Token Calculation in Realtime, Gemini File Reading, Cursor Agent Mode Issues, GPT 4.1 vs Claude 3.7 vs Gemini, MongoDB vs Firebase/Supabase


Unsloth AI (Daniel Han) ▷ #general (243 messages🔥🔥):

Qwen2.5-VL-7B and Qwen2.5-VL-32B, step size in practice, tax evasion processes, SFT dataset with chain of thought, training dataset to rizz up the huzz


Unsloth AI (Daniel Han) ▷ #off-topic (38 messages🔥):

Gemini 2.5 Pro API, Thinking Content, Cursor's Implementation, EmailJS API Abuse, Phishing Website Takedown


Unsloth AI (Daniel Han) ▷ #help (32 messages🔥):

Tool/Function Calling, Fine-tuning Llama 3.1 8B, Fine-tuned models & Pipecat, Fine-tune Qwen2.5-VL on Video Data, Quantised deepseeks outside of llama cpp


Unsloth AI (Daniel Han) ▷ #research (6 messages):

DeepSeek-V3 Multihead Latent Attention, LLM performance penalties, Memory bandwidth bottleneck


Eleuther ▷ #general (264 messages🔥🔥):

Prompt Design Discussion, Symbolic Recursion in ChatGPT, AI-Generated Spam, KYC as Human Authentication


Eleuther ▷ #research (29 messages🔥):

Retinal OCT imaging, Cross-domain applicability, Multimodal data approaches


GPU MODE ▷ #general (1 messages):

erkinalp: https://x.com/PrimeIntellect/status/1912266266137764307


GPU MODE ▷ #triton (5 messages):

PMPP book vs Triton tutorial, matmul on RTX 5090, fp16 matrices, autotune


GPU MODE ▷ #cuda (17 messages🔥):

NVIDIA Nsight Compute Tutorials, CUDA Memory Usage, Dynamic Indexing in CUDA Kernels, GPU Profiling Talk from NVIDIA


GPU MODE ▷ #torch (3 messages):

torch.compile, AOTInductor, C++, Torchscript jit


GPU MODE ▷ #announcements (1 messages):

torch.compile, PyTorch, Richard Zou


GPU MODE ▷ #cool-links (1 messages):

Machine Learning, College lectures


GPU MODE ▷ #jobs (1 messages):

PyTorch, OSS, GPU, Systems Engineering, Code Optimization


GPU MODE ▷ #beginner (8 messages🔥):

GPU Mode Lecture Series, CUDA variable registers, NVCC inlining device functions, PTX vs SASS compilation


GPU MODE ▷ #torchao (7 messages):

PARQ in torchao, PTQ weight only quant for BMM in torchao, Precision for meta-parameters z and s, AWQ uses integer zeros


GPU MODE ▷ #off-topic (2 messages):

Tuberculosis sanatorium, Novelty plate


GPU MODE ▷ #rocm (4 messages):

AMD Cloud Profiling, Cloud Vendor Tier List


GPU MODE ▷ #liger-kernel (5 messages):

Liger Kernel Meetings, Liger Kernel and FSDP2 compatibility, TP+FSDP2+DDP


GPU MODE ▷ #metal (1 messages):

candle, metal, kernels


GPU MODE ▷ #self-promotion (4 messages):

GPU Access, MCP, Job Opportunity


GPU MODE ▷ #general (25 messages🔥):

Pytorch HIP/ROCm compilation issues, MI300 benchmarking errors, Popcorn CLI registration and submission issues


GPU MODE ▷ #submissions (21 messages🔥):

Grayscale Leaderboard Updates, Matmul Performance on T4, AMD FP8 MM Leaderboard Domination, Conv2d Performance on A100, AMD Identity Leaderboard Results


GPU MODE ▷ #status (7 messages):

CLI Tool Release, Discord Oauth2 Issues, FP8-mm Task


GPU MODE ▷ #feature-requests-and-bugs (16 messages🔥):

File containing backslash causes submission error, Service Unavailable error, CLI new release


GPU MODE ▷ #hardware (2 messages):

Zero to ASIC Course, Silicon Chip Design


GPU MODE ▷ #amd-competition (87 messages🔥🔥):

AMD Developer Challenge, FP8 GEMM details, Tolerances too tight, Submission file types


Latent Space ▷ #ai-general-chat (93 messages🔥🔥):

Kling 2, BM25 for code, Grok adds canvas, GPT 4.1 Review, O3 and O4 mini launch


Latent Space ▷ #llm-paper-club-west (80 messages🔥🔥):

Zoom Outage, RWKV-block v7, PicoCreator QKV Transformers, Memory capability and freezing state


Yannick Kilcher ▷ #general (145 messages🔥🔥):

Authorship and AI, AI patents, Noise vs Uncertainty, o3 and o4-mini, Codex Security


Yannick Kilcher ▷ #paper-discussion (12 messages🔥):

Ultra-Scale Playbook, CUDA Memory Usage, DeepSeek Maths, Multimodal Series


Yannick Kilcher ▷ #ml-news (10 messages🔥):

CVE Depreciation, DHS Funding, OpenAI Windsurf, Cyber Vulnerability Database


HuggingFace ▷ #general (46 messages🔥):

Modal free credit, Image Generation Models, Hugging Face inference endpoint issues, AMD GPUs vs NVIDIA, Agents course deadline


HuggingFace ▷ #today-im-learning (2 messages):

Cool Project, Image Analysis


HuggingFace ▷ #cool-finds (7 messages):

Grok 3 benchmarks, xAI datacenter, Nuclear power vs fossil fuels, Portable microreactors, China energy production


HuggingFace ▷ #i-made-this (12 messages🔥):

LogGPT Safari Extension, Local LLM Platform, Speech-to-Speech AI, Wildlife Animal Classifier, CodeFIM Dataset


HuggingFace ▷ #computer-vision (1 messages):

Stage Channel Location, Event Notification


HuggingFace ▷ #NLP (1 messages):

LLM Chat Templates, Python Glue


HuggingFace ▷ #smol-course (3 messages):

Certification Date, Intro Docs


HuggingFace ▷ #agents-course (62 messages🔥🔥):

Use Case Assignments, Deadlines Moved, Final Certification, Proposed Assignments


MCP (Glama) ▷ #general (111 messages🔥🔥):

Claude Desktop tool execution failures with large responses, MCP draw.io server availability, MCP vs AI Agents with tools, Creating an MCP server with Wolfram Language, Docker container security credentials for MCP


MCP (Glama) ▷ #showcase (22 messages🔥):

MCP Bidirectional Communication, BlazeMCP, Orchestrator Agent for MCP


Nous Research AI ▷ #general (97 messages🔥🔥):

Altman vs Musk, OpenAI social network, LLMs running social networks, AI subscriptions deal, o4-mini token count


Nous Research AI ▷ #research-papers (4 messages):

LLM performance, Life-threatening prompts


Nous Research AI ▷ #interesting-links (2 messages):

LLaMaFactory guide, Qwen 1.8 Finetuning


Nous Research AI ▷ #research-papers (4 messages):

Life-threatening prompts effect on LLMs, LLMs as human simulators


LM Studio ▷ #general (33 messages🔥):

LM Studio multi-LLM use, Gemma 3 Language Translation, NVMe SSD Speed for LM Studio, BitNet greetings


LM Studio ▷ #hardware-discussion (71 messages🔥🔥):

GPU inference, Apple M4 Max chip, Dual GPU Support, Nvidia card heating issues, PCIE SSD adapter


Notebook LM ▷ #use-cases (10 messages🔥):

Notebook LM with Microsoft Documentation, Google Docs vs OneNote, German-language podcasts generation problems


Notebook LM ▷ #general (82 messages🔥🔥):

Podcast Language Support, LaTeX Support, Bulk Upload, Mindmap Generation


Modular (Mojo 🔥) ▷ #general (10 messages🔥):

Mojo, Arch Linux, GPU support, Conda, Community Meeting


Modular (Mojo 🔥) ▷ #mojo (58 messages🔥🔥):

Kernel Calls in Mojo, Mojo Compiler Performance, HVM/Bend opinions, Performance Regression Testing


Nomic.ai (GPT4All) ▷ #general (45 messages🔥):

GPT4All offline use, LM Studio as alternative, Ingesting books into models, GGUF version compatibility, GPT4All development status


Torchtune ▷ #general (1 messages):

Office Hours


Torchtune ▷ #dev (41 messages🔥):

Validation Set PR, KV Cache Management, Config Revolution, tokenizer path annoyance, tune command name collision


LlamaIndex ▷ #blog (3 messages):

Jerry at AI User Conference, AI Solutions for Investment Professionals in NY, LlamaIndex support for o3 and o4-mini


LlamaIndex ▷ #general (26 messages🔥):

Pinecone multiple namespaces, LlamaIndex Agents with MCP Servers, LLM.txt, Base64 PDF support, Google A2A implementation with LlamaIndex


Cohere ▷ #「💬」general (4 messages):

Command A token loops, FP8 arguments, vllm Community Collaboration


Cohere ▷ #「🔌」api-discussions (1 messages):

Embed-v4.0 supports 128K tokens, API support embedding more than 1 image per request, Late Chunk strategy


Cohere ▷ #「🤝」introductions (2 messages):

Open Source Chat Interface, AI Tooling, Cohere Model Understanding, Fintech Founder


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (4 messages):

MOOC labs release, MOOC coursework deadlines, Berkeley students vs. MOOC students, MOOC labs ETA


LLM Agents (Berkeley MOOC) ▷ #mooc-lecture-discussion (2 messages):

Verifiable Outputs, Lean Auto-Formalizer, Formal Verification of Programs, Automated Proof Generation


tinygrad (George Hotz) ▷ #learn-tinygrad (3 messages):

MNIST Tutorial error, diskcache_clear() fix, OperationalError


DSPy ▷ #papers (2 messages):

HuggingFace Paper


Codeium (Windsurf) ▷ #announcements (2 messages):

o4-mini, Windsurf, Free Access, New Channel, Changelog




{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AINews, please share with a friend! Thanks in advance!

{% endif %}