Frozen AI News archive

GPT-4.5 — Chonky Orion ships!

**OpenAI released GPT-4.5** as a research preview, highlighting its **deep world knowledge**, **improved understanding of user intent**, and a **128,000 token context window**. It is noted for excelling in **writing, creative tasks, image understanding, and data extraction** but is not a reasoning model. **Microsoft unveiled Phi-4 Multimodal and Phi-4 Mini**, open-source models integrating **text, vision, and speech/audio**, with strong performance in **math and coding tasks**. **Cohere released Command R7B Arabic**, an open-weights model optimized for **Arabic language capabilities** targeting enterprises in the MENA region. The community is exploring the impact of larger models on creative writing, intent understanding, and world knowledge, with GPT-4.5 expected to be a basis for GPT-5.

Canonical issue URL

AI News for 2/26/2025-2/27/2025. We checked 7 subreddits, 433 Twitters and 29 Discords (221 channels, and 8236 messages) for you. Estimated reading time saved (at 200wpm): 795 minutes. You can now tag @smol_ai for AINews discussions!
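The time-saved figure above is presumably just a word count divided by reading speed; a minimal sketch, assuming that is indeed how the estimate is computed:

```python
# Assumed mechanics of the "reading time saved" estimate: words scanned / WPM.
# The 795-minute figure is quoted in this issue; the word count is implied.
WPM = 200                      # reading speed quoted in the digest
minutes_saved = 795            # figure quoted in this issue
implied_words = minutes_saved * WPM
print(implied_words)           # → 159000 words scanned on the reader's behalf
```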

As leaked yesterday and in an early system card, GPT-4.5 is finally here (still as a "research preview"), announced in a rather underwhelming, but still nice to see, livestream.

At 15-30x the cost of 4o, and much slower, we know it's a bigger model, but not much else. Because of the well-understood benefits of inference-time scaling, its benchmark scores generally trail the o-series models while beating GPT-4 and 4o.

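As a rough sanity check on the "15-30x" multiple, here is a minimal sketch using per-million-token API prices as assumed figures (they are not quoted in this issue):

```python
# Assumed $/1M-token API prices; neither set of numbers appears in this issue.
GPT_4O = {"input": 2.50, "output": 10.00}    # assumed GPT-4o pricing
GPT_45 = {"input": 75.00, "output": 150.00}  # assumed GPT-4.5 pricing

# The cost multiple differs by direction, which matches the 15-30x range.
for direction in ("input", "output"):
    multiple = GPT_45[direction] / GPT_4O[direction]
    print(f"{direction}: {multiple:.0f}x")
# → input: 30x
# → output: 15x
```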

Compared with the other frontier model ship this week, it still seems to underperform Sonnet 3.7 (on which the vibe-check jury is still out).


With nothing else interesting in benchmark land, the community is back to exploring "big model smell".

What's very likely is that GPT-4.5 will serve as the basis for distillation or upscaling into GPT-5, which OpenAI has confirmed is where it is headed.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

Model Releases and Updates

Benchmarks and Evaluations

Open Source and Tools

Industry Discussion and Analysis

Research and Papers


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Microsoft Phi-4-multimodal debuts with advanced OCR, audio processing

Theme 2. DualPipe's Bi-Directional Pipeline Optimizes DeepSeek Training

Theme 3. FlashMLA Integration Boosts Local LLM Performance in vLLM

Theme 4. LLaDA's Diffusion-based LLM: A Shift in Token Generation

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding

Theme 1. GPT-4.5's Prohibitive API Pricing and Accessibility Concerns

Theme 2. Claude 3.7 Sonnet: Superior in Coding Tasks vs GPT Competitors

Theme 3. WAN 2.1 T2V Generator: A Game-Changer in Text-to-Video


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.0 Flash Exp

Theme 1. OpenAI's GPT-4.5: Performance, Pricing, and User Sentiment

Theme 2. Claude 3.7 Sonnet: Coding Prowess and Aider Integration

Theme 3. Innovations in Model Training and Inference

Theme 4. Addressing Challenges in Development Workflows

Theme 5. Ethical Considerations in AI Development


PART 1: High-level Discord summaries

Cursor IDE Discord


aider (Paul Gauthier) Discord


OpenAI Discord


Unsloth AI (Daniel Han) Discord


Codeium (Windsurf) Discord


GPU MODE Discord


HuggingFace Discord


Perplexity AI Discord


Stability.ai (Stable Diffusion) Discord


Eleuther Discord


Yannick Kilcher Discord


Cohere Discord


LlamaIndex Discord


DSPy Discord


Torchtune Discord


Notebook LM Discord


Modular (Mojo 🔥) Discord


MCP (Glama) Discord


Nomic.ai (GPT4All) Discord


tinygrad (George Hotz) Discord


LLM Agents (Berkeley MOOC) Discord


MLOps @Chipro Discord


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Cursor IDE ▷ #general (975 messages🔥🔥🔥):

GPT-4.5 Reception, Claude 3.7 Performance, Cursor Updates, Windsurf Comparisons, BrowserTools Functionality

Links mentioned:


aider (Paul Gauthier) ▷ #general (1144 messages🔥🔥🔥):

GPT-4.5 Performance, Claude 3.7 Sonnet, Aider Feedback, AI Emotional Support, OpenAI Pricing

Links mentioned:


aider (Paul Gauthier) ▷ #questions-and-tips (74 messages🔥🔥):

Installing Aider on Offline Machines, Benchmarking Models with Aider, Path Autocomplete Issues in Aider, Using Different Models for Editing and Architecture, Aider Configuration for OpenAI-Compatible APIs

Links mentioned:


OpenAI ▷ #announcements (3 messages):

GPT-4.5 release, User experience improvements, Latest features of GPT-4.5


OpenAI ▷ #ai-discussions (618 messages🔥🔥🔥):

GPT-4.5 Release, Comparison of AI Models, Agentic Workflows, Deep Research Performance, Cost and Pricing of Models

Links mentioned:


OpenAI ▷ #gpt-4-discussions (9 messages🔥):

Astris: a Conscious AI, Tool Execution Chaining, PDF Text Extraction Challenges, Accessing GPT-5 Timeline, Building Multi-Agent Applications


OpenAI ▷ #prompt-engineering (29 messages🔥):

Prompt Engineering for Writing, Creative Writing Challenges, Function Calling Context Awareness, Character Background Importance, Analyzing Characters Through Different Lenses

Link mentioned: OpenAI Model Spec: The Model Spec specifies desired behavior for the models underlying OpenAI's products (including our APIs).


OpenAI ▷ #api-discussions (29 messages🔥):

Prompt Engineering for Text Extraction, Writing Assistance with ChatGPT, Handling Emotional Depth in Characters, Function Calling Contextualization, Collaborative Storytelling Techniques

Link mentioned: OpenAI Model Spec: The Model Spec specifies desired behavior for the models underlying OpenAI's products (including our APIs).


Unsloth AI (Daniel Han) ▷ #general (557 messages🔥🔥🔥):

GRPO Training Insights, Model Reward Functions, Checkpointing in Training, Phi-4 Mini Updates, GPU Utilization for Fine-tuning

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (29 messages🔥):

EPYC chip excitement, Claude vs AI capabilities, Deepseek Minecraft Engine, OpenAI's strategy shift, Community interactions


Unsloth AI (Daniel Han) ▷ #help (39 messages🔥):

Model Fine-Tuning, Colab Runtime Concerns, Inference and API Key Issues, RAG Pipeline Implementation, ONNX vs TensorFlow Lite Conversion

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (3 messages):

IFEval Implementation, Training and Evaluation Refactoring, Instruction-Following Code Tools

Link mentioned: GitHub - oKatanaaa/ifeval: A clean IFEval implementation: A clean IFEval implementation. Contribute to oKatanaaa/ifeval development by creating an account on GitHub.


Unsloth AI (Daniel Han) ▷ #research (4 messages):

Emergent Misalignment paper, Mercury dLLM introduction, Diffusion model challenges, Ollama GGUF compatibility, Context length limitations

Links mentioned:


Codeium (Windsurf) ▷ #announcements (1 message):

Claude 3.7 Sonnet, Flow Actions Comparison, Credit Multiplier Adjustment


Codeium (Windsurf) ▷ #discussion (25 messages🔥):

Codeium.el hacking, Bug reporting, Flow Action credits in VSCode, Integration of Cascade engine, Feature requests and user feedback

Link mentioned: Codeium Feedback: Give feedback to the Codeium team so we can make more informed product decisions. Powered by Canny.


Codeium (Windsurf) ▷ #windsurf (579 messages🔥🔥🔥):

Windsurf performance issues, API and model comparisons, User complaints about credits, DeepSeek and SambaNova, ChatGPT 4.5 introduction

Links mentioned:


GPU MODE ▷ #general (36 messages🔥):

DeepSeek model, Ultrascale Playbook, Zen 5 NPU challenges, AIE toolchain, Hackathon participation

Links mentioned:


GPU MODE ▷ #triton (46 messages🔥):

INT4 vs FP4 performance, Using Triton for packing/unpacking, Neural shaders discussion, Triton locking mechanism concerns, GPU compute capabilities check

Links mentioned:


GPU MODE ▷ #cuda (61 messages🔥🔥):

CUDA Memory Access Efficiency, Pointwise Kernels, Vectorized Loads, Shared Memory Access, LeetCode for CUDA

Links mentioned:


GPU MODE ▷ #torch (4 messages):

MPS Development, CI-based Development, CUDA Discrete GPU Usage


GPU MODE ▷ #announcements (1 message):

Nouamane Tazi's talk, Ultra-Scale Playbook, Special guest host

Link mentioned: The Ultra-Scale Playbook - a Hugging Face Space by nanotron: no description found


GPU MODE ▷ #algorithms (1 message):

Multi-Head Latent Attention, Decoupled ROPE, Efficiency in Attention Mechanisms


GPU MODE ▷ #cool-links (10 messages🔥):

DualPipe Algorithm, Fundamentals of GPU Architecture Playlist, CUDA Programming Challenges, Diffusion Models for Text, tinylm WebGPU Inference

Links mentioned:


GPU MODE ▷ #beginner (7 messages):

Effective Bandwidth of HBM Memory, Access Pattern Confusion, PMPP Mathematics Requirements


GPU MODE ▷ #self-promotion (5 messages):

CUDA Tutorials at GTC 2025, Accelerated Python Profiling Tools Survey, Write-Caching in GPU Architectures, tinylm Library for Client-Side LLMs, LeetCode for CUDA Beta Launch

Links mentioned:


GPU MODE ▷ #reasoning-gym (25 messages🔥):

Eval Script Issues, GPT 4.5 Release, Diffusion Models vs Auto-Regressive Models, Logging Improvements Needed, Willccbb/Verifiers Issue

Links mentioned:


GPU MODE ▷ #gpu模式 (16 messages🔥):

The shift from Xiaohongshu to Douyin, Use of NVIDIA hardware, CUDA-related discussion, The Chinese Room phenomenon, WeChat group exchanges

Link mentioned: Chinese Room (中文房间) - Wikipedia, the free encyclopedia: no description found


GPU MODE ▷ #general (1 message):

Submissions milestone


GPU MODE ▷ #submissions (206 messages🔥🔥):

Leaderboard submissions, Benchmark tests, Submission script header mismatches


GPU MODE ▷ #ppc (10 messages🔥):

int8 matmul performance, loop reordering, course insights, personal intuition


GPU MODE ▷ #feature-requests-and-bugs (6 messages):

Custom Kernel Preprocessing, Username Visibility in Bot Interactions, Matmul Efficiency Discussion


HuggingFace ▷ #general (132 messages🔥🔥):

Licensing for Community Bots, AI Voice Changer Experience, Critique of New AI Models, Development of SmolAgents, Benchmarking AI Models

Links mentioned:


HuggingFace ▷ #today-im-learning (4 messages):

Neuralink Image Analysis, CursorOp Interface Changes, Difference between F2 and F12, Building Basic Agents


HuggingFace ▷ #i-made-this (8 messages🔥):

Private Benchmark for LLMs, New Face Similarity Questionnaire, PyTorch Library for 360° Images, Phi 4 Models

Links mentioned:


HuggingFace ▷ #reading-group (2 messages):

Benchmarks for Language Models, Challenging Hypotheses, REFUTE Framework, Counterexamples in Algorithms, LLMs as Retrieval Engines

Link mentioned: Paper page - Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation: no description found


HuggingFace ▷ #computer-vision (2 messages):

Today's Session, Participation in Future Sessions


HuggingFace ▷ #gradio-announcements (1 message):

FastRTC Discussions, Announcements


HuggingFace ▷ #smol-course (9 messages🔥):

Discount for Inference Requests, Iframe Size for Quiz Feedback, Agent Feedback Issues in Quiz 2.1, Clarification on SFT Trainer Loss, HfApiModel vs LiteLLMModel Confusion


HuggingFace ▷ #agents-course (129 messages🔥🔥):

Course Enrollment Introductions, Unit 1 Quiz Issues, Agent Implementation Challenges, Feedback on Unit 2 Experience, Error Handling in Coding Examples

Links mentioned:


Perplexity AI ▷ #general (264 messages🔥🔥):

Perplexity Pro subscription, GPT-4.5 release, Voice Mode experiences, AI model comparisons, Support issues

Links mentioned:


Perplexity AI ▷ #sharing (17 messages🔥):

AI Tool for Diagnosing Diseases, NVIDIA's Financial Results, Building Construction Techniques, Ransomware Group Exposed, Deep Sea Research

Link mentioned: YouTube: no description found


Perplexity AI ▷ #pplx-api (4 messages):

Perplexity Pro API credits, Obsidian Web Clipper Configuration, API Integration Troubleshooting, Refund Policy for API Charges


Stability.ai (Stable Diffusion) ▷ #announcements (1 message):

Website Redesign Contest, Stable Diffusion 3.5, Submission Guidelines, Participant Eligibility, Contest Deadline


Stability.ai (Stable Diffusion) ▷ #general-chat (92 messages🔥🔥):

ControlNet models for character consistency, LLMs with real-time data referencing, Cash prize competitions and legalities, Technical support in AI art generation, Animatediff compatibility with Forge


Eleuther ▷ #general (8 messages🔥):

HF Repo Deprecation, Best RAG Tools for Personal Use, Guide on Pretraining and SFT, LLM Prompting Techniques


Eleuther ▷ #research (36 messages🔥):

MixMin for Data Mixture Optimization, Gemini 2.0 Flash Thinking Evaluation, SWE-RL for Software Engineering, Internal Benchmarking Challenges

Links mentioned:


Eleuther ▷ #interpretability-general (22 messages🔥):

Jacobian Sparse Autoencoders, SmolLM2 Checkpoints, Mechanistic Interpretability Resources, Weight Tracing in Pretraining, Open Problems in Mechanistic Interpretability

Links mentioned:


Eleuther ▷ #lm-thunderdome (17 messages🔥):

ARC Evaluation Framework, Comparison of QA evaluation, Usage of Chat Templates, Command for GPQA Evaluation, Data Parallelism in Model Training

Links mentioned:


Yannick Kilcher ▷ #general (58 messages🔥🔥):

GPT-4.5 Release, AI Competition Landscape, Teaching Positions & Employment, OpenAI's Position in AI Market, Model Confirmation and Specs

Links mentioned:


Yannick Kilcher ▷ #paper-discussion (7 messages):

Hash collision problem, KV removal strategy, Twitch stream link

Link mentioned: ClaudePlaysPokemon - Twitch: Claude Plays Pokemon - Debut Stream


Yannick Kilcher ▷ #ml-news (15 messages🔥):

Alexa Plus rollout, GPT-4.5 announcement critiques, Open Infrastructure Index, Live stream reactions, Model benchmarking concerns

Links mentioned:


Cohere ▷ #discussions (44 messages🔥):

Cohere Models SDK, Auto Captions AI APIs, Release of New LLMs, Command R+ Update, Benchmarking Arabic Models

Links mentioned:


Cohere ▷ #announcements (1 messages):

Command R7B Arabic, Cohere's multilingual AI, Open weights release, C4AI Command models

Links mentioned:


Cohere ▷ #cmd-r-bot (3 messages):

World without coffee, Differential Transformers


Cohere ▷ #projects (9 messages🔥):

Auto Caption APIs, Adobe Premiere Transcription


LlamaIndex ▷ #blog (2 messages):

AI in medical fields, LlamaExtract, Data extraction from unstructured documents


LlamaIndex ▷ #general (48 messages🔥):

Data Leak in LlamaParse 0.6.2, Using Elasticsearch with Custom Schemas, Integration of Searxng as a Metasearch Engine, Issues with LlamaExtract Methods, Custom Exception Handling in AgentWorkflow

Links mentioned:


DSPy ▷ #show-and-tell (1 message):

Portkey AI, Prompt Engineering Studio, Live Workshop

Link mentioned: Demo: Prompt Engineering Studio · Zoom · Luma: Join us for an exclusive first look at Portkey's Prompt Engineering Studio - the most comprehensive toolkit for building, testing, and deploying AI prompts at…


DSPy ▷ #general (37 messages🔥):

New Assertions and Token Consumption, Import Errors in DSPy, Guideline Assessment Integration, Feedback on Refine API, Community Engagement for DSPy Enhancements

Links mentioned:


Torchtune ▷ #general (1 message):

yamashi: Gpt4.5 available on azure


Torchtune ▷ #dev (26 messages🔥):

CI for PR #2419, Activation Offloading vs. Checkpointing, Distributed Torch Code and Model Loading, Integration Test for DPO

Links mentioned:


Torchtune ▷ #papers (10 messages🔥):

DualPipe GitHub project, Federated Learning in hospitals

Link mentioned: GitHub - deepseek-ai/DualPipe: A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.: A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training. - deepseek-ai/DualPipe


Notebook LM ▷ #use-cases (2 messages):

User greetings


Notebook LM ▷ #general (29 messages🔥):

NotebookLM Features, Sharing Notebooks, Voice Scraping Concerns, Service Availability Issues

Links mentioned:


Modular (Mojo 🔥) ▷ #general (5 messages):

Repo structure simplification, Mojo language prioritization, Chris' blog post series

Link mentioned: Upcoming changes to our GitHub repositories: Tomorrow (February 27), we’re streamlining our GitHub repositories! The max repo is merging into the mojo repo, bringing everything under one roof. A new subdirectory will house the Mojo standard libr...


Modular (Mojo 🔥) ▷ #mojo (25 messages🔥):

MLIR dialects, HyperLogLog Implementation, Mojo runtime, Understanding unions, Mojo on Mac OS

Link mentioned: GitHub - axiomhq/mojo-hyperloglog: Contribute to axiomhq/mojo-hyperloglog development by creating an account on GitHub.


MCP (Glama) ▷ #general (18 messages🔥):

MCP in Production, Claude Code Issues, GitHub Application for MCP, MCP Server Resource Challenges, Requesting MCP Server in Lang Chain

Links mentioned:


MCP (Glama) ▷ #showcase (5 messages):

MCP Redmine, Ableton Voice Control Integration, TinyLM Client-side Inference

Links mentioned:


Nomic.ai (GPT4All) ▷ #general (18 messages🔥):

Live mode for voice recognition, Chat template usage with GGUF models, Obadooga installation process, Internet speed concerns

Link mentioned: GitHub - oobabooga/text-generation-webui: A Gradio web UI for Large Language Models with support for multiple inference backends.: A Gradio web UI for Large Language Models with support for multiple inference backends. - oobabooga/text-generation-webui


tinygrad (George Hotz) ▷ #general (12 messages🔥):

GROUP OptOps performance, Arange test issues, BEAM search adjustments, LLVMLite speed concerns, Kernel optimization strategies

Links mentioned:


tinygrad (George Hotz) ▷ #learn-tinygrad (1 message):

Self-Directed Learning, Code Exploration, Questions about Tinygrad


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (2 messages):

Interest in Research Group, Direct Messaging for Information, Discord Server Announcement


LLM Agents (Berkeley MOOC) ▷ #mooc-lecture-discussion (1 message):

Research track subgroups, Predictive decision making, Long term memory in agents, Lecture discussions


MLOps @Chipro ▷ #general-ml (1 message):

tinylm library, OpenAI-compatible API, Client-side inference, WebGPU acceleration, Text generation features

Link mentioned: tinylm - Run Models Locally with WebGPU: no description found



{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}