Frozen AI News archive

o3-mini launches, OpenAI on "wrong side of history"

**OpenAI** released **o3-mini**, a new reasoning model available to free and paid users, with a "high" reasoning effort option that outperforms the earlier **o1** model on STEM tasks and safety benchmarks at a 93% lower per-token cost. **Sam Altman** acknowledged a shift in open source strategy and credited **DeepSeek R1** for influencing his assumptions. **MistralAI** launched **Mistral Small 3 (24B)**, an open-weight model with competitive performance and low API costs. **DeepSeek R1** is supported by **Text-generation-inference v3.1.0** and available via **ai-gradio** and **Replicate**. The news highlights advancements in reasoning, cost-efficiency, and safety in AI models.

Canonical issue URL

AI News for 1/30/2025-1/31/2025. We checked 7 subreddits, 433 Twitters and 34 Discords (225 channels, and 9062 messages) for you. Estimated reading time saved (at 200wpm): 843 minutes. You can now tag @smol_ai for AINews discussions!

As planned even before the DeepSeek R1 drama, OpenAI released o3-mini, with the "high" reasoning effort option handily outperforming o1-full (notably on OOD benchmarks like Dan Hendrycks' new HLE and Text-to-SQL, though Cursor disagrees):

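The "high" setting above is one of three per-request effort levels. A minimal sketch of selecting it, assuming the OpenAI Python SDK's `reasoning_effort` parameter for o-series chat completions (treat the parameter name and accepted values as assumptions if your SDK version differs):

```python
# Sketch only: builds an o3-mini chat-completion payload with the "high"
# reasoning effort setting. Assumes the OpenAI Python SDK accepts a
# `reasoning_effort` parameter ("low" / "medium" / "high") for o-series models.

def build_request(prompt: str, effort: str = "high") -> dict:
    """Return a chat-completion payload; higher effort trades latency for reasoning depth."""
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"unknown reasoning effort: {effort}")
    return {
        "model": "o3-mini",
        "reasoning_effort": effort,
        "messages": [{"role": "user", "content": prompt}],
    }

params = build_request("How many primes are there below 100?")
# To actually send it (requires OPENAI_API_KEY):
#   from openai import OpenAI
#   resp = OpenAI().chat.completions.create(**params)
print(params["model"], params["reasoning_effort"])  # → o3-mini high
```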

OpenAI's response to R1 was twofold: first, a 63% cut in o1-mini and o3-mini prices; second, Sam Altman acknowledging in today's Reddit AMA that they will be showing "a much more helpful and detailed version" of thinking tokens, directly crediting DeepSeek R1 for "updating" his assumptions.


Perhaps more significantly, Sama also acknowledged being "on the wrong side of history" with OpenAI's open source strategy (not materially existent beyond Whisper).

You can learn more in today's Latent Space pod with OpenAI.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

Model Releases and Performance

Hardware, Infrastructure, and Scaling

Reasoning and Reinforcement Learning

Tools, Frameworks, and Applications

Industry and Company News


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. OpenAI's O3-Mini High: Versatile But Not Without Critics

Theme 2. OpenAI's $40Bn Ambition Amid DeepSeek's Challenge

Theme 3. DeepSeek vs. OpenAI: A Growing Rivalry

Theme 4. AI Self-Improvement: Google's Ambitious Push

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT

Theme 1. US Secrecy Blocking AI Progress: Dr. Manning's Insights

Theme 2. Debate Over DeepSeek's Open-Source Model and Chinese Origins

Theme 3. Qwen Chatbot Launch Challenges Existing Models

Theme 4. Surge in GPU Prices Triggered by DeepSeek Hosting Rush

Theme 5. Mistral Models Advancement and Evaluation Results


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.0 Flash Thinking (gemini-2.0-flash-thinking-exp)

Theme 1. OpenAI's o3-mini Model: Reasoning Prowess and User Access

Theme 2. DeepSeek R1: Performance, Leaks, and Hardware Demands

Theme 3. Aider and Cursor Embrace New Models for Code Generation

Theme 4. Local LLM Ecosystem: LM Studio, GPT4All, and Hardware Battles

Theme 5. Critique Fine-Tuning and Chain of Thought Innovations


PART 1: High level Discord summaries

Codeium (Windsurf) Discord


Unsloth AI (Daniel Han) Discord


aider (Paul Gauthier) Discord


Perplexity AI Discord


LM Studio Discord


Cursor IDE Discord


OpenRouter (Alex Atallah) Discord


Interconnects (Nathan Lambert) Discord


OpenAI Discord


Latent Space Discord


Yannick Kilcher Discord


Nous Research AI Discord


Stackblitz (Bolt.new) Discord


MCP (Glama) Discord


Stability.ai (Stable Diffusion) Discord


Eleuther Discord


GPU MODE Discord


Nomic.ai (GPT4All) Discord


Notebook LM Discord


Torchtune Discord


Modular (Mojo 🔥) Discord


LlamaIndex Discord


tinygrad (George Hotz) Discord


Axolotl AI Discord


LLM Agents (Berkeley MOOC) Discord


OpenInterpreter Discord


Cohere Discord


LAION Discord


DSPy Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The HuggingFace Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Codeium (Windsurf) ▷ #announcements (2 messages):

Cascade Updates, New Models, Web and Docs Search, User Milestones

Links mentioned:


Codeium (Windsurf) ▷ #discussion (329 messages🔥🔥):

DeepSeek R1 issues, Cascade tool call errors, Windsurf usage, Model performance comparison, OpenAI and data regulations

Links mentioned:


Codeium (Windsurf) ▷ #windsurf (815 messages🔥🔥🔥):

DeepSeek Issues, Model Performance, Tool Calling Errors, O3 Mini Model Discussion, User Experience Feedback

Links mentioned:


Unsloth AI (Daniel Han) ▷ #general (999 messages🔥🔥🔥):

Unsloth AI, DeepSeek models, Fine-tuning techniques, Model quantization, Chatbot performance

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (8 messages🔥):

Front End Imperfections, Model Sensitivity, Output Detection Systems


Unsloth AI (Daniel Han) ▷ #help (319 messages🔥🔥):

DeepSeek R1 Dynamic Quantization, Finetuning LLMs, Using OpenWebUI, Learning Rate Adjustments, Multiple GPU Support

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (7 messages):

Qwen2.5-0.5B-instruct, Quadro P2000 GPU, Old Hardware Usability


Unsloth AI (Daniel Han) ▷ #research (6 messages):

vLLM Integration, Batch Throughput Investigation, Model Loading Concerns, XGB Usage in Unsloth and vLLM, Offloading Issues with vLLM


aider (Paul Gauthier) ▷ #announcements (1 message):

Aider v0.73.0, o3-mini support, Reasoning effort argument, OpenRouter R1 free support

Link mentioned: Release history: Release notes and stats on aider writing its own code.


aider (Paul Gauthier) ▷ #general (979 messages🔥🔥🔥):

O3 Mini Performance, Tool Use in Aider, Rust Programming, OpenAI and Pricing, Linters in Aider

Links mentioned:


aider (Paul Gauthier) ▷ #questions-and-tips (72 messages🔥🔥):

Aider Configuration, DeepSeek Issues, API Key Handling, Model Performance, File Management

Links mentioned:


Perplexity AI ▷ #general (705 messages🔥🔥🔥):

Perplexity AI models, O3 Mini release, O1 and R1 performance, DeepSeek model comparisons, User experiences with AI platforms

Links mentioned:


Perplexity AI ▷ #sharing (5 messages):

AI Prescription Bill, TB Outbreak Kansas, Nadella's AI Predictions, Asteroid Life Seeds, Harvard Dataset

Link mentioned: YouTube: no description found


Perplexity AI ▷ #pplx-api (2 messages):

Sonar Reasoning, Plane Crash Information


LM Studio ▷ #announcements (1 message):

LM Studio 0.3.9 release, Idle TTL feature, Separate reasoning_content, Auto-update for runtimes, Nested folders support

Links mentioned:


LM Studio ▷ #general (362 messages🔥🔥):

LM Studio performance and model usage, AI models for C# development, OpenAI o3-mini release, DeepSeek model performance, Download speed issues in LM Studio

Links mentioned:


LM Studio ▷ #hardware-discussion (158 messages🔥🔥):

Qwen models, LM Studio performance, Using multiple GPUs, Vulkan support for GPUs, Context length in LLMs

Links mentioned:


Cursor IDE ▷ #general (520 messages🔥🔥🔥):

DeepSeek R1 and Sonnet 3.6 Integration, O3 Mini Performance, MCP Tool Usage, Claude Model Updates, User Experience and Feedback

Links mentioned:


OpenRouter (Alex Atallah) ▷ #announcements (1 message):

o3-mini model release, Reasoning capabilities, BYOK program updates

Link mentioned: o3 Mini - API, Providers, Stats: OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and coding. The model features three adjustable reasoning effort l...


OpenRouter (Alex Atallah) ▷ #general (445 messages🔥🔥🔥):

OpenRouter API Usage, Model Comparisons, O3-Mini Access, Claude 3.5 and AGI Discussions, Developer Insights and Suggestions

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (263 messages🔥🔥):

OpenAI's o3-mini launch, Performance comparisons with previous models, Real-world physics prompts, Pricing and access for developers, Model usability concerns

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-questions (9 messages🔥):

Model Checkpoints, K2 Chat Release

Link mentioned: LLM360/K2-Chat · Hugging Face: no description found


Interconnects (Nathan Lambert) ▷ #ml-drama (1 message):

xeophon.: https://x.com/OpenAI/status/1885413866961580526


Interconnects (Nathan Lambert) ▷ #random (63 messages🔥🔥):

DeepSeek Performance, o3-mini and R1 Comparison, Nvidia Digit Acquisition, Copy Editing in SemiAnalysis, Popularity and Critique in Media

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (12 messages🔥):

Mistral's Model Release, DeepSeek Reinforcement Learning, Janus Model Responses, Bengali Ghosthunters, AI Models Discussion

Links mentioned:


Interconnects (Nathan Lambert) ▷ #rl (2 messages):

SFT Support, Open Source Projects, Funding Issues


Interconnects (Nathan Lambert) ▷ #reads (2 messages):

DeepSeek's Role in AI, Stargate Project Funding, AI Substack Community

Link mentioned: DeepSeek, unstacked: Jasmine Sun surveys reactions to the new AI on the block


Interconnects (Nathan Lambert) ▷ #policy (1 message):

xeophon.: https://x.com/deliprao/status/1885114737525928380?s=61


OpenAI ▷ #annnouncements (1 message):

OpenAI o3-mini, Reddit AMA, Future of AI, Sam Altman, Kevin Weil

Link mentioned: Reddit - Dive into anything: no description found


OpenAI ▷ #ai-discussions (319 messages🔥🔥):

O3 Mini Limit Confusion, Model Performance Comparisons, AI Detector Effectiveness, DeepSeek Discussion, CoT and Reasoning Models

Link mentioned: It doesn't matter if DeepSeek copied OpenAI — the damage has already been done in the AI arms race: Your move, Sam Altman


OpenAI ▷ #gpt-4-discussions (16 messages🔥):

File Upload Limitations in O1, Release of O3 Mini, ChatGPT Support Number Issues


OpenAI ▷ #prompt-engineering (2 messages):

Vision Model Limitations, User Discussions, Training Data Insights


OpenAI ▷ #api-discussions (2 messages):

4o User Feedback, Model Limitations, Training Data Insights


Latent Space ▷ #ai-general-chat (61 messages🔥🔥):

O3 Mini Updates, Performance Comparison with Sonnet, DeepSeek Impact, Market Trends in AI Models

Links mentioned:


Latent Space ▷ #ai-in-action-club (269 messages🔥🔥):

Discord Screenshare Issues, Open Source AI Tools, AI Tutoring Projects, Techno Music References, DeepSeek API

Links mentioned:


Yannick Kilcher ▷ #general (242 messages🔥🔥):

o3-mini release, AI model performance comparisons, LLM training and architectures, GPU configurations for AI models, OpenAI's competitive landscape

Links mentioned:


Yannick Kilcher ▷ #paper-discussion (37 messages🔥):

Next Paper Review, Tülu 3 405B Benchmark Results, FP4 Training Framework, Collaboration Opportunities with DeepSeek, User Experiences with AI Chat Assistants

Links mentioned:


Yannick Kilcher ▷ #agents (6 messages):

Model Performance Comparison, Self-Awareness in AI, Grid Pattern Transformation, Qwen 2.5VL Features, Parameter Optimization


Yannick Kilcher ▷ #ml-news (11 messages🔥):

DeepSeek R1 Replication, Y Combinator's Funding Focus, OpenAI O3 Mini Features, AI Research Democratization

Links mentioned:


Nous Research AI ▷ #general (210 messages🔥🔥):

Psyche Project and Decentralized Training, Crypto and Its Relation to Nous, Performance Comparison of AI Models, Community Sentiment on AI and Crypto, o3-mini vs. Sonnet Performance

Links mentioned:


Nous Research AI ▷ #ask-about-llms (1 message):

Autoregressive Generation on CLIP Embeddings, Multimodal Inputs, Stable Diffusion Generation


Nous Research AI ▷ #research-papers (4 messages):

Weekend plans, Reading materials

Link mentioned: Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch: Training of large language models (LLMs) is typically distributed across a large number of accelerators to reduce training time. Since internal states and parameter gradients need to be exchanged at e...


Nous Research AI ▷ #interesting-links (1 message):

DeepSeek Hiring Strategy, Long-Term Success in AI Recruitment, Creativity vs. Experience

Link mentioned: Why DeepSeek's Founder Liang Wenfeng Prefers Inexperienced Hires - Bu…: no description found


Nous Research AI ▷ #reasoning-tasks (2 messages):

Long term planning, Team involvement in planning


Stackblitz (Bolt.new) ▷ #prompting (13 messages🔥):

Supabase Issues, Troubleshooting Group Suggestions, HEIC File Support, Project Deletion Concerns


Stackblitz (Bolt.new) ▷ #discussions (142 messages🔥🔥):

Token Management, Issues with Web Containers, User Authentication Problems, Subscription Management, CORS Configuration Challenges

Link mentioned: Spongebob Squarepants Begging GIF - Spongebob Squarepants Begging Pretty Please - Discover & Share GIFs: Click to view the GIF


MCP (Glama) ▷ #general (112 messages🔥🔥):

MCP Server Setup, Transport Protocols in MCP, Remote vs Local MCP Servers, MCP Server Authentication, MCP CLI Tools

Links mentioned:


MCP (Glama) ▷ #showcase (9 messages🔥):

Authentication for Toolbase, YouTube Demo Feedback, Journaling MCP Server, Audio Playback Adjustment

Link mentioned: GitHub - mtct/journaling_mcp: MCP Server for journaling: MCP Server for journaling. Contribute to mtct/journaling_mcp development by creating an account on GitHub.


Stability.ai (Stable Diffusion) ▷ #general-chat (121 messages🔥🔥):

50 Series GPU Availability, Performance Comparison of GPUs, Running AI on Mobile Devices, AI Tools and Platforms, Stable Diffusion UI Changes

Links mentioned:


Eleuther ▷ #general (28 messages🔥):

Pythia language model, Inductive biases in AI, Pretraining hyperparameters, Logging and monitoring tools, Non-token CoT concept

Link mentioned: Overleaf, Online LaTeX Editor: An online LaTeX editor that’s easy to use. No installation, real-time collaboration, version control, hundreds of LaTeX templates, and more.


Eleuther ▷ #research (32 messages🔥):

Critique Fine-Tuning, Training Metrics for LLMs, Generalization vs Memorization, Random Order Autoregressive Models, Inefficiencies in Neural Networks

Link mentioned: Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate: Supervised Fine-Tuning (SFT) is commonly used to train language models to imitate annotated responses for given instructions. In this paper, we challenge this paradigm and propose Critique Fine-Tuning...


Eleuther ▷ #interpretability-general (3 messages):

Superhuman reasoning models, Backtracking vector discovery, Sparse autoencoders in reasoning, Propositional attitudes in AI, Mechanistic understanding vs. propositional attitudes

Links mentioned:


Eleuther ▷ #lm-thunderdome (29 messages🔥):

gsm8k evaluations, lm-eval harness settings, vllm integration and KV Cache, RWKV model configurations, performance metrics comparisons

Links mentioned:


GPU MODE ▷ #general (23 messages🔥):

Deep Seek's Model Performance, GPU Server Recommendations, Running LLM Benchmarks, Discussion on PTX in Open Source, Deep Learning GPU Considerations

Link mentioned: The Best GPUs for Deep Learning in 2023 — An In-depth Analysis: Here, I provide an in-depth analysis of GPUs for deep learning/machine learning and explain what is the best GPU for your use-case and budget.


GPU MODE ▷ #triton (2 messages):

Triton tensor indexing, Efficient column extraction, Mask and reduction technique


GPU MODE ▷ #cuda (10 messages🔥):

RTX 5090 FP4 Performance, NVIDIA FP8 Specification, CUDA and Python Integration, Flux Implementation Benchmarking, 100 Days of CUDA Resource Request


GPU MODE ▷ #torch (2 messages):

Torch Logs Debugging, CUDA Duplicate GPU Error


GPU MODE ▷ #cool-links (6 messages):

Riffusion platform, Audio generation artists, DeepSeek narrative, YouTube video on AI music, New research paper by Arthur Douillard et al.

Links mentioned:


GPU MODE ▷ #off-topic (8 messages🔥):

Salmon Patty Dish, Novelty Plate Discussion, Wework Critique, CEO Value Perception


GPU MODE ▷ #irl-meetup (1 message):

NVIDIA GTC Discount


GPU MODE ▷ #liger-kernel (2 messages):

LigerDPOTrainer, Support for Liger-kernel losses

Link mentioned: [Liger] liger DPO support by kashif · Pull Request #2568 · huggingface/trl: What does this PR do?Add support for Liger-kernel losses for the DPO KernelNeeds: linkedin/Liger-Kernel#521


GPU MODE ▷ #reasoning-gym (30 messages🔥):

Proposed New Datasets for Reasoning Gym, GitHub Contributions to Reasoning Gym, Ideas for Game and Algorithm Development, Performance and Validation in Game Design, Dependency Management in Projects

Links mentioned:


Nomic.ai (GPT4All) ▷ #announcements (1 message):

GPT4All v3.8.0 Release, DeepSeek-R1-Distill Support, Chat Templating Overhaul, Code Interpreter Fixes, Local Server Fixes


Nomic.ai (GPT4All) ▷ #general (61 messages🔥🔥):

DeepSeek integration, GitHub issues and updates, Voice processing inquiries, Model quantization differences, Custom functionality in Jinja templates

Links mentioned:


Notebook LM Discord ▷ #announcements (1 message):

NotebookLM Usability Study, Participant Incentives, Remote Chat Sessions, User Feedback, Product Enhancement

Link mentioned: Participate in an upcoming Google UXR study!: Hello,I’m contacting you with a short questionnaire to verify your eligibility for an upcoming usability study with Google. This study is an opportunity to provide feedback on something that's cur...


Notebook LM Discord ▷ #use-cases (9 messages🔥):

Lake Lanao's endemic cyprinids, Podcast length limitation, NotebookLM YouTube content


Notebook LM Discord ▷ #general (47 messages🔥):

Gemini 2.0 Flash Issues, AI Narration Improvements, Notebook Sharing Difficulties, Google Workspace and NotebookLM Plus, Defined Terms in Documents

Links mentioned:


Torchtune ▷ #general (19 messages🔥):

Distributed GRPO, Memory Management in GRPO, Profiler for Memory Usage, VLLM Inference, Using bf16 for Training

Link mentioned: GitHub - RedTachyon/torchtune: PyTorch native post-training library: PyTorch native post-training library. Contribute to RedTachyon/torchtune development by creating an account on GitHub.


Torchtune ▷ #dev (27 messages🔥):

Gradient Accumulation Issues, TRL vs. Torchtune Config Differences, DPO Training Anomalies, Multinode Support in Torchtune, Loss Calculation Normalization

Links mentioned:


Modular (Mojo 🔥) ▷ #general (7 messages):

HPC resources and programming languages, DeepSeek's impact on AI compute demands, Mojo integration with VS Code, Modular cfg file issues, Clarifying tech details in blog series

Link mentioned: Modular: Democratizing Compute, Part 1: DeepSeek’s Impact on AI: Part 1 of an article that explores the future of hardware acceleration for AI beyond CUDA, framed in the context of the release of DeepSeek


Modular (Mojo 🔥) ▷ #mojo (19 messages🔥):

Backwards Compatibility in Libraries, Mojo 1.0 Benchmarking Delays, Swift Complexity Concerns, Mojo's Development Stability

Links mentioned:


Modular (Mojo 🔥) ▷ #max (3 messages):

Running DeepSeek with MAX, Using Ollama gguf files


LlamaIndex ▷ #blog (3 messages):

Meetup with Arize AI and Groq, LlamaReport beta release, o3-mini support


LlamaIndex ▷ #general (9 messages🔥):

OpenAI O1 Model Support, LlamaReport Usage, LLM Integration Issues

Link mentioned: Streaming support for o1 (o1-2024-12-17) (resulting in 400 "Unsupported value"): Hello, it appears that streaming support was added for o1-preview and o1-mini (see announcement OpenAI o1 streaming now available + API access for tiers 1–5). I confirm that both work for me. Howe...


tinygrad (George Hotz) ▷ #general (10 messages🔥):

Physical server for LLM, Tinygrad PR discussions, Kernel and buffers adjustments, PR title typos

Link mentioned: Tweet from the tiny corp (@tinygrad): one kernel, its buffers, and its launch dims, in tinygrad


Axolotl AI ▷ #general (10 messages🔥):

Axolotl AI support for bf16, fp8 training concerns, 8bit lora capabilities


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (9 messages🔥):

Certificate Release Updates, Quiz 1 Availability, Syllabus Section Confusion

Link mentioned: Quizzes Archive - LLM Agents MOOC: NOTE: The correct answers are in the black boxes (black text on black background). Highlight the box with your cursor to reveal the correct answer (or copy the text into a new browser if it’s hard to ...


OpenInterpreter ▷ #general (6 messages):

AI tool explanations, Farm Friend Application, iOS Shortcuts Patreon, NVIDIA NIM and DeepSeek


Cohere ▷ #discussions (1 message):

the_lonesome_slipper: Thank you!


Cohere ▷ #api-discussions (1 message):

Cohere Embed API v2.0, HTTP 422 Error, Preprocessing for Embeddings, Cross-language Polarization Research, Embed Multilingual Model


LAION ▷ #research (2 messages):

User Mentions, Thanks and Acknowledgment


DSPy ▷ #general (1 message):

http_client parameter, dspy.LM configuration






{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}