Frozen AI News archive

Reasoning Models are Near-Superhuman Coders (OpenAI IOI, Nvidia Kernels)

**o3** achieved a **gold medal at the 2024 IOI** and ranks in the **99.8th percentile on Codeforces**, outperforming most humans, with reinforcement learning (RL) proving superior to hand-designed inductive-bias approaches. **Nvidia** used **DeepSeek-R1** to autonomously generate GPU kernels that surpass some expert-engineered kernels, showcasing simple yet effective AI-driven optimization. **OpenAI** updated the **o1 and o3-mini** models to support file and image uploads in ChatGPT and released **Deep Research**, a powerful research assistant based on the **o3 model with RL** for deep chain-of-thought reasoning. **Ollama** introduced **OpenThinker models** fine-tuned from **Qwen2.5**, outperforming some DeepSeek-R1 distillation models. **ElevenLabs** has grown into a $3.3 billion company specializing in AI voice synthesis without open-sourcing its technology. Research highlights include **Sakana AI Labs' TAID knowledge-distillation method** receiving a Spotlight at **ICLR 2025**, and **Apple's work on scaling laws for mixture-of-experts (MoE) models**. The importance of open-source AI for scientific discovery was also emphasized.
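The Nvidia item's workflow, as described in its blog post, is essentially generate-and-verify inference-time scaling: sample many candidate kernels and keep only those that check out numerically against a trusted reference. A toy sketch of that loop is below; the candidate "kernels" and their scale factors are hypothetical stand-ins for LLM output, not the actual DeepSeek-R1 pipeline:

```python
def candidate_kernels():
    # Stand-in for an LLM proposing kernel variants; in the Nvidia
    # workflow each candidate would be generated GPU code.
    for scale in (1.9, 2.1, 2.0, 0.5):
        yield scale, (lambda s: (lambda xs: [s * x for x in xs]))(scale)

def is_correct(kernel, reference, tests, tol=1e-9):
    # Verify numerical correctness against a reference implementation,
    # mirroring the correctness check used on KernelBench.
    return all(
        abs(a - b) <= tol
        for xs in tests
        for a, b in zip(kernel(xs), reference(xs))
    )

def search():
    # Inference-time scaling: spend more compute sampling candidates,
    # return the first one that passes verification.
    reference = lambda xs: [2.0 * x for x in xs]
    tests = [[0.0, 1.0, -3.5], [2.5, 4.0]]
    for scale, cand in candidate_kernels():
        if is_correct(cand, reference, tests):
            return scale
    return None

print(search())  # → 2.0
```

The "extremely simple" part is that no learning is involved at verification time: more samples plus a cheap correctness oracle is the whole trick.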

Canonical issue URL

AI News for 2/12/2025-2/13/2025. We checked 7 subreddits, 433 Twitters and 29 Discords (211 channels, and 5290 messages) for you. Estimated reading time saved (at 200wpm): 554 minutes. You can now tag @smol_ai for AINews discussions!

This is a rollup of two distinct news items that nevertheless share the same theme:

In the Nvidia case, the solution was also extremely simple, causing much consternation.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

AI Tools and Resources

AI Research Advances

AI Infrastructure and Efficiency

AI Security and Safety

AI Governance and Policy

Memes/Humor


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Google's FNet: Potential for Improved LLM Efficiency via Fourier Transforms

Theme 2. DIY High-Performance Servers for 70B LLMs: Strategies and Cost

Theme 3. Gemini 2.0's Dominance in OCR Benchmarking and Context Handling

Theme 4. Innovative Architectural Insights from DeepSeek: Expert Mixtures and Token Predictions

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT

Theme 1. OpenAI merges o3 into unified GPT-5

Theme 2. Anthropic and OpenAI enhance reasoning models


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.0 Flash Exp

Theme 1. Reasoning LLMs - Trends in New Releases

Theme 2. Tiny-yet-Mighty LLMs and Tooling Improvements

Theme 3. Perplexity Finance Dashboard and Analysis of AI Models

Theme 4. Challenges and Creative Solutions

Theme 5. Data, Copyrights, and Declarations


PART 1: High-level Discord summaries

Unsloth AI (Daniel Han) Discord


HuggingFace Discord


OpenAI Discord


Cursor IDE Discord


Nous Research AI Discord


OpenRouter (Alex Atallah) Discord


Perplexity AI Discord


Codeium (Windsurf) Discord


LM Studio Discord


GPU MODE Discord


Interconnects (Nathan Lambert) Discord


Eleuther Discord


Stability.ai (Stable Diffusion) Discord


Latent Space Discord


Yannick Kilcher Discord


LlamaIndex Discord


Notebook LM Discord


Modular (Mojo 🔥) Discord


MCP (Glama) Discord


tinygrad (George Hotz) Discord


LLM Agents (Berkeley MOOC) Discord


Torchtune Discord


DSPy Discord


Nomic.ai (GPT4All) Discord


Cohere Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Unsloth AI (Daniel Han) ▷ #general (956 messages🔥🔥🔥):

GRPO updates in Unsloth, VRAM requirements for models, Dynamic quantization with DeepSeek, Merging models and its implications, Fine-tuning strategies for LLMs

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (3 messages):

Unsloth reintroduction, Wendel's AI shoutouts, Deepseek's release

Links mentioned:


Unsloth AI (Daniel Han) ▷ #help (108 messages🔥🔥):

Llama 3.2 Issues, GRPO Training Challenges, Structured Data Models, Using Unsloth with Local Models, Model Configuration and Installation

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (2 messages):

Rombo-LLM-V3.0-Qwen-32b, DeepSeek-R1 Performance, Llama 3.1B Fine-tuning, Resources for Training Reasoning Models

Links mentioned:


Unsloth AI (Daniel Han) ▷ #research (6 messages):

Transformer performance on tabular data, Fine-tuning Mistral model, Inference instructions for LoRA checkpoint, Reasoning agents development, Lavender method for VLMs

Links mentioned:


HuggingFace ▷ #general (51 messages🔥):

Agent Templates Issues, Embedding Models and Performance, Deep Reinforcement Learning Course, LLama Spam Behavior, ViT Projection Dimension

Links mentioned:


HuggingFace ▷ #today-im-learning (10 messages🔥):

Overlapping Communication in Tensor Parallelism, Agents Course, Fuzzy Clustering, Use of Special Tokens, Importance of Repetition


HuggingFace ▷ #i-made-this (14 messages🔥):

QR Code DOOM, AI Model Training Assistant, New LLM Releases, Joker Joke Generator, Deep Researcher System

Links mentioned:


HuggingFace ▷ #reading-group (10 messages🔥):

Bingbin Liu Presentation, Technical Difficulties, Session Recording

Link mentioned: Join our Cloud HD Video Meeting: Zoom is the leader in modern enterprise video communications, with an easy, reliable cloud platform for video and audio conferencing, chat, and webinars across mobile, desktop, and room systems. Zoom ...


HuggingFace ▷ #computer-vision (1 messages):

Canny edge filters, Sobel filters, ControlNet with diffusion model


HuggingFace ▷ #NLP (8 messages🔥):

Pretrained Model Behavior, Tokenization of Tool Messages, Fine-tuning with LoRA, End Token Generation Issues, Training Techniques


HuggingFace ▷ #smol-course (18 messages🔥):

Agent Course Support, Endpoint Changes, Model Name Updates, Study Group Inquiries, Testing New Tools


HuggingFace ▷ #agents-course (717 messages🔥🔥🔥):

Hugging Face Agents Course, Discord verification issues, Learning groups and collaboration, Model access and deployment, Course completion and certificates

Links mentioned:


HuggingFace ▷ #open-r1 (5 messages):

DeepSeek AI-HPC, Granite 3.2 MoE, GPT-3.5 Data Distillation

Links mentioned:


OpenAI ▷ #annnouncements (3 messages):

Deep Research Access, File & Image Uploads in ChatGPT, Model Spec Update


OpenAI ▷ #ai-discussions (347 messages🔥🔥):

OpenAI future ownership, AI model capabilities, Fictional violence in AI, Current AI tools and platforms, Comparison of AI models

Links mentioned:


OpenAI ▷ #gpt-4-discussions (12 messages🔥):

Custom GPT Models, Hiring Experts, Limits of Free Plan


OpenAI ▷ #prompt-engineering (16 messages🔥):

Function Calling Issues, Prompt Sharing, Using CoT and ToT, Error Interpretation, Prompt Engineering Discussions


OpenAI ▷ #api-discussions (16 messages🔥):

Function calling issues, Prompt sharing practices, Using CoT and ToT, ChatGPT versus playground, Interpreting prompts


Cursor IDE ▷ #general (392 messages🔥🔥):

Cursor IDE Features, OpenAI o3-mini vs. Claude, Anthropic's Hybrid AI Model, Tool Calling and Coding, MCP Server Utilization

Links mentioned:


Nous Research AI ▷ #announcements (2 messages):

DeepHermes-3 Preview, Long chain of thought reasoning, LLM advancements, Hugging Face Model Links

Links mentioned:


Nous Research AI ▷ #general (268 messages🔥🔥):

DeepHermes-3 model preview, Reasoning capabilities, Mobile performance limitations, Comparisons with other models, Accessibility of hardware for running models

Links mentioned:


Nous Research AI ▷ #ask-about-llms (2 messages):

SFT on Llama-3B-Instruct, Loss of Base Model Performance, Domain-Specific Challenges


Nous Research AI ▷ #research-papers (3 messages):

Nvidia blog post on GPU kernels, LLM report papers, State of the art methods in LLM

Link mentioned: Tweet from Anne Ouyang (@anneouyang): New blog post from Nvidia: LLM-generated GPU kernels showing speedups over FlexAttention and achieving 100% numerical correctness on 🌽KernelBench Level 1


Nous Research AI ▷ #interesting-links (1 messages):

US AI Safety Declaration, International AI Cooperation, Concerns about Authoritarian Regimes

Link mentioned: US and UK refuse to sign AI safety declaration at summit: US stance is “180-degree turnaround” from Biden administration.


OpenRouter (Alex Atallah) ▷ #announcements (11 messages🔥):

Groq DeepSeek R1 70B launch, New sorting preferences in OpenRouter, Update to usage field in API, Token count comparisons, Discussion on model ranking consistency

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (257 messages🔥🔥):

OpenAI's o3-mini functionality, Issues with Deepseek R1, Self-moderated OpenAI endpoints, Google's rate limit errors, Usage of AI models for YouTube content creation

Links mentioned:


OpenRouter (Alex Atallah) ▷ #beta-feedback (1 messages):

Feature feedback


Perplexity AI ▷ #general (230 messages🔥🔥):

Perplexity Finance Dashboard, AI Model Performance, Customer Support Issues, Referring Links and Discounts, Usage of AI in Technology

Links mentioned:


Perplexity AI ▷ #sharing (21 messages🔥):

EU AI Investment, Llama Model, DND Campaign Features, AI's Performance on Integer Queries, OpenAI Bid Situation


Perplexity AI ▷ #pplx-api (11 messages🔥):

API 500 Error, Beta Testing for Sonar API on Cerebras


Codeium (Windsurf) ▷ #announcements (2 messages):

AI Engineering Summit Tickets, Windsurf Wave 3 Features, Model Context Protocol, Customizable App Icons, Turbo Mode

Links mentioned:


Codeium (Windsurf) ▷ #discussion (13 messages🔥):

Codeium Release 1.36.1, Troubleshooting Issues, Internship Opportunities, Upcoming Announcements


Codeium (Windsurf) ▷ #windsurf (244 messages🔥🔥):

AI-generated decision concerns, Windsurf chat issues, MCP server functionality, Cascade performance problems, Feature requests and suggestions

Links mentioned:


LM Studio ▷ #general (115 messages🔥🔥):

Qwen-2.5 VL Model Performance Issues, Model Uploading and Compatibility, Using Templates in LM Studio, GPU Usage and Specs, LM Studio Errors and Troubleshooting

Links mentioned:


LM Studio ▷ #hardware-discussion (120 messages🔥🔥):

LLM Inference Performance, GPU Comparisons, AI Hardware Developments, Intel vs AMD CPUs, Gaming vs Inference

Links mentioned:


GPU MODE ▷ #general (1 messages):

shindeirou: does anybody know at what toolkit version nvjet was introduced to cublas?


GPU MODE ▷ #triton (8 messages🔥):

PyTorch Profiler Tracing, Fused MM Activation with Triton, Triton GEMM Performance, Autotuning in Triton


GPU MODE ▷ #cuda (18 messages🔥):

Blackwell GPU Tensor Memory, CUTLASS CUTE and SGEMM, NCCL Issues with Blackwell, Tensor Memory Programmer Management, Accessing GB200 GPU Resources

Link mentioned: Tweet from Lambda (@LambdaAPI): All we know is we're good for our NVIDIA HGX B200s 🙂


GPU MODE ▷ #torch (7 messages):

SymPy for Backward Pass, Torch Compile for Optimization, Fast Hadamard Transform in Quantized Attention, Gradient Formula Simplification, Issues with gradgradcheck()

Links mentioned:


GPU MODE ▷ #cool-links (1 messages):

iron_bound: https://arxiv.org/abs/2502.07202


GPU MODE ▷ #jobs (8 messages🔥):

D-Matrix hiring efforts, Kernel programming talk, Architecture discussion, Performance projections, Programming model development

Links mentioned:


GPU MODE ▷ #beginner (30 messages🔥):

CUDA Code Structure, Error Handling Best Practices, Memory Cleanup in CUDA, Kernel Launch Indexing, CUDA Development in C vs C++

Links mentioned:


GPU MODE ▷ #pmpp-book (8 messages🔥):

CUDA memory model confusion, Errors in table for tiled matrix multiplication, Clarification on tile sizes, Typos in printed materials

Link mentioned: CUDA memory model: why acquire fence is not needed to prevent load-load reordering?: I am reading the book "Programming Massively Parallel Processors" and noticed the below code snippets to achieve "domino-style" scan: if (threadIdx.x == 0) {…


GPU MODE ▷ #torchao (2 messages):

Dynamic Quantization, Issue Resolution


GPU MODE ▷ #sequence-parallel (1 messages):

shindeirou: sorry dude never saw that message. It was excalidraw + PP


GPU MODE ▷ #off-topic (6 messages):

Inference-time scaling, AI and compute efficiency, Documentation of hardware, Conspiracy theories in AI, Personal coding documentation

Links mentioned:


GPU MODE ▷ #liger-kernel (3 messages):

FSDP, Liger Kernel, User Defined Kernels


GPU MODE ▷ #self-promotion (7 messages):

CUDA Kernel Optimizations, Performance Comparisons with PyTorch, cuBLAS vs. CUDA, Matrix Multiplication Techniques

Links mentioned:


GPU MODE ▷ #🍿 (70 messages🔥🔥):

DeepSeek-R1 and Inference-Time Scaling, KernelBench Benchmark Performance, GPU Kernel Optimization Challenges, Project Popcorn and Open Collaboration, Using LLMs for Systems Programming

Link mentioned: Automating GPU Kernel Generation with DeepSeek-R1 and Inference Time Scaling | NVIDIA Technical Blog: As AI models extend their capabilities to solve more sophisticated challenges, a new scaling law known as test-time scaling or inference-time scaling is emerging. Also known as AI reasoning ...


GPU MODE ▷ #reasoning-gym (47 messages🔥):

Graph Coloring Problems, Reasoning-Gym Dataset Evaluations, Futoshiki Puzzle Dataset, Game of Life Outputs, Standardization of Reporting Scores

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (133 messages🔥🔥):

GRPO vs PPO in Tulu models, Anthropic's upcoming Claude model, DeepHermes-3 Preview release, EnigmaEval reasoning challenges, Jailbreaking challenges results

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-questions (5 messages):

notebookLM performance, GPT-5 model interface


Interconnects (Nathan Lambert) ▷ #ml-drama (2 messages):

DH3 Evaluation Metrics, ImitationLearn Company Legitimacy

Link mentioned: Tweet from kalomaze (@kalomaze): dh3 notes1. they only show these two specific evals for the "reasoning on"; the "reasoning off" chart is the only one showing all metrics2. they don't compare to the official 8b di...


Interconnects (Nathan Lambert) ▷ #random (36 messages🔥):

Censored Model Discussions, OpenThinker-32B Model Release, Reasoning Token Scaling, Chatbot Prompt Guidelines, Community Commentary on RL

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (4 messages):

DeepSeek Announcement, OpenAI's Roadmap Update, OLMo GitHub Issue, AI Security Reviewers, Rust's Future Value Proposition

Links mentioned:


Interconnects (Nathan Lambert) ▷ #reads (20 messages🔥):

Dwarkesh, Noam Shazeer, Jeff Dean, Podcast Interviews, Science History

Link mentioned: noam shazeer - Google Search: no description found


Interconnects (Nathan Lambert) ▷ #posts (5 messages):

Aged beautifully, SnailBot


Eleuther ▷ #general (14 messages🔥):

Dataset perplexity evaluation, Post-training datasets, Online courses in AI and tech, Collaboration opportunities at Eleuther AI


Eleuther ▷ #research (126 messages🔥🔥):

PPO-Clip with Alternative Models, Memory Mechanisms in Models, Forgetting Transformer, Evaluation of Long Context Models, Temporal Causality in Attention

Links mentioned:


Eleuther ▷ #interpretability-general (2 messages):

Citing Delphi, Citing Sparsify, BibTeX entries for papers, GitHub citations


Stability.ai (Stable Diffusion) ▷ #general-chat (92 messages🔥🔥):

Stable Diffusion Saving Issues, Switching to Linux with ComfyUI, Model Recommendations and Performance, AI Character Design Consistency, Upwork Account Borrowing

Links mentioned:


Latent Space ▷ #ai-general-chat (81 messages🔥🔥):

AI Agents, OpenAI Roadmap Updates, Apple Product Announcements, DeepHermes 3 Model, Meta's Automated Compliance Hardening Tool

Links mentioned:


Latent Space ▷ #ai-announcements (1 messages):

swyxio: new pod drop! https://x.com/latentspacepod/status/1890101440615453025


Yannick Kilcher ▷ #general (50 messages🔥):

Accessing Research Papers, TinyStories Pretraining, Pretraining Foundation Models, Intermediary Logits in RL, Architecture and Optimization Challenges

Links mentioned:


Yannick Kilcher ▷ #paper-discussion (6 messages):

Forward citation of language models, Monte Carlo Tree Diffusion, Challenges in balanced language datasets, Paper discussion scheduling

Link mentioned: Monte Carlo Tree Diffusion for System 2 Planning: Diffusion models have recently emerged as a powerful tool for planning. However, unlike Monte Carlo Tree Search (MCTS)-whose performance naturally improves with additional test-time computation (TTC),...


Yannick Kilcher ▷ #agents (5 messages):

Smolagents by Hugging Face, New AI Model without Tokens, Novel Language Model Architecture

Links mentioned:


Yannick Kilcher ▷ #ml-news (20 messages🔥):

Elon Musk's Grok 3, Thomson Reuters AI Copyright Case, OpenAI Roadmap Update, Literature Review Tool, Pre-trained Language Models Release Practices

Links mentioned:


LlamaIndex ▷ #blog (3 messages):

LlamaIndex Open Source Engineer Position, Nomic AI Embedding Model, Google Cloud Integrations


LlamaIndex ▷ #general (65 messages🔥🔥):

Query Engine Tool with Metadata, Exhaustive RAG Search Techniques, Vector Database Preferences, LlamaIndex Configuration for Unicode, AI Agents and Workflow Implementation

Links mentioned:


LlamaIndex ▷ #ai-discussion (1 messages):

pier1337: What are some good reasons to finetune a model?


Notebook LM ▷ #use-cases (10 messages🔥):

NotebookLM's podcasting capabilities, Monetizing content with AI, AI-generated podcast hosts, Comparing AI sources, Translating content into audio

Links mentioned:


Notebook LM ▷ #general (49 messages🔥):

NotebookLM features, Daily usage limits, Language support, Sharing ideas in community, Audio generation issues

Links mentioned:


Modular (Mojo 🔥) ▷ #general (1 messages):

eggsquad: new Modular job postings 👀


Modular (Mojo 🔥) ▷ #mojo (40 messages🔥):

Mojo sum types, ECS vs Component Architecture, Mojo Function Wrapping, Memory Management in Mojo, New Release v25.1

Link mentioned: no title found: no description found


Modular (Mojo 🔥) ▷ #max (11 messages🔥):

MAX CUDA usage, NVIDIA backend bugs, Mojo API tensor types, Forum for help

Link mentioned: Modular: Build the future of AI with us and learn about MAX, Mojo, and Magic.


MCP (Glama) ▷ #general (50 messages🔥):

MCP Development, OpenAI Models in MCP, Usage Limits for Claude Desktop, Glama Gateway Comparison, Open WebUI Features

Links mentioned:


tinygrad (George Hotz) ▷ #general (16 messages🔥):

Windows CI Issues, DeepSeek-R1 Model Experiment, Graph Rewrite Challenges, AI and Code Submission Etiquette

Links mentioned:


tinygrad (George Hotz) ▷ #learn-tinygrad (2 messages):

tinygrad vs PyTorch, Performance & Cost Efficiency, Understanding Hardware


LLM Agents (Berkeley MOOC) ▷ #hackathon-announcements (1 messages):

LLM Agents MOOC Hackathon, Participation statistics, Winning teams, Top represented countries, Top represented companies

Link mentioned: Tweet from Dawn Song (@dawnsongtweets): 🎉 Excited to announce the winning teams of LLM Agents MOOC Hackathon! We’re thrilled by the amazing participation and enthusiasm from the global AI community:🌍 ~3,000 participants from 127 countries...


LLM Agents (Berkeley MOOC) ▷ #mooc-announcements (1 messages):

Spring 2025 MOOC, Advanced LLM Agents, Live Sessions, AI for Mathematics

Link mentioned: Tweet from Dawn Song (@dawnsongtweets): Really excited to announce our Advanced LLM Agents MOOC (Spring 2025)!Building on the success of our LLM Agents MOOC from Fall 2024 (15K+ registered learners, ~9K Discord members, 200K+ lecture views ...


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (10 messages🔥):

Hackathon Participation, Certificate Resending Requests, Prompt Evaluation, Ninja Certification, Certificate Declaration Form


LLM Agents (Berkeley MOOC) ▷ #mooc-lecture-discussion (2 messages):

Updates on Release Details, Guidance for Newcomers in AI/ML


LLM Agents (Berkeley MOOC) ▷ #mooc-readings-discussion (1 messages):

tarande57: we'll release details soon! thank you for your patience!


Torchtune ▷ #general (9 messages🔥):

Distributed Inference with Torchtune, Torchtune Docker Image, Using vLLM for Model Loading

Links mentioned:


Torchtune ▷ #dev (5 messages):

Checkpointing Branch, Recipe State Functionality, Documentation Improvement, Team Collaboration


DSPy ▷ #general (10 messages🔥):

Inference-Time Scaling, LangChain vs DSPy, DSPy 2.6 Changes

Link mentioned: Automating GPU Kernel Generation with DeepSeek-R1 and Inference Time Scaling | NVIDIA Technical Blog: As AI models extend their capabilities to solve more sophisticated challenges, a new scaling law known as test-time scaling or inference-time scaling is emerging. Also known as AI reasoning ...


Nomic.ai (GPT4All) ▷ #general (10 messages🔥):

GPT4All v3.9.0 and Deepseek R1 Integration, LocalDocs Functionality and Limitations, NOIMC v2 Release and Implementation, Nomic's Multilingual MoE Text Embeddings, Turning English Prompts into Code

Link mentioned: nomic-ai/nomic-embed-text-v2-moe · Hugging Face: no description found


Cohere ▷ #api-discussions (2 messages):

Rerank 3.5 behavior, Cohere with Salesforce BYOLLM




{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}