Frozen AI News archive

not much happened today

**Smolagents** library by **Hugging Face** continues trending. **ChatGPT-4o** latest version `chatgpt-4o-latest-20250129` released. **DeepSeek R1 671B** sets a speed record at **198 t/s**, making it the fastest reasoning model; specific prompt settings are recommended. **Perplexity Deep Research** outperforms models like **Gemini Thinking**, **o3-mini**, and **DeepSeek-R1** on the **Humanity's Last Exam** benchmark with a **21.1%** score, and reaches **93.9%** accuracy on **SimpleQA**. **ChatGPT-4o** ranks #1 on the Arena leaderboard in multiple categories except math. **OpenAI's o3 model** powers the Deep Research tool for ChatGPT Pro users. **Gemini 2 Flash** and **Qwen 2.5** models support the LLMGrading verifier. **Qwen 2.5** models added to the PocketPal app. **MLX** shows small LLMs like Qwen 0.5B generating tokens at high speed on M4 Max and iPhone 16 Pro. **Gemini Flash 2.0** leads a new AI agent leaderboard. **DeepSeek R1** is the most liked model on Hugging Face, with over 10 million downloads.

Canonical issue URL

AI News for 2/13/2025-2/14/2025. We checked 7 subreddits, 433 Twitters and 29 Discords (212 channels, and 4956 messages) for you. Estimated reading time saved (at 200wpm): 545 minutes. You can now tag @smol_ai for AINews discussions!

There's a new ChatGPT-4o version in town: chatgpt-4o-latest-20250129

In the meantime, Hugging Face's smolagents library continues to trend, so you can check out this brief discussion.

https://www.youtube.com/watch?v=QytYcjTkkQU


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

AI Models, Benchmarks, and Performance

Open Source AI and Community

AI Applications and Use Cases

AI Research and Techniques

AI Industry and Business

Humor and Miscellaneous


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. DeepSeek's Influence: Open-Source and Deployment Insights

Theme 2. Evaluating Mac Studio for Local LLM Deployment

Theme 3. Backdoor Vulnerabilities in AI Models: BadSeek as a Case Study

Theme 4. Scaling AI with DeepSeek R-1: Live Streaming Insights

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding

Theme 1. Perplexity Launches Free Deep Research

Theme 2. MCP (Model Context Protocol) Explained and Impact


AI Discord Recap

A summary of Summaries of Summaries by o1-preview-2024-09-12

Theme 1. New AI Model Releases and Innovations

Theme 2. User Frustrations and Usability Woes with AI Tools

Theme 3. Challenges in AI Model Fine-Tuning and Performance

Theme 4. AI Hardware and Infrastructure Developments

Theme 5. Ethical and Security Concerns in AI


PART 1: High level Discord summaries

Unsloth AI (Daniel Han) Discord


Codeium (Windsurf) Discord


Perplexity AI Discord


HuggingFace Discord


Cursor IDE Discord


LM Studio Discord


Nous Research AI Discord


Eleuther Discord


GPU MODE Discord


OpenRouter (Alex Atallah) Discord


OpenAI Discord


Stability.ai (Stable Diffusion) Discord


Interconnects (Nathan Lambert) Discord


Notebook LM Discord


Latent Space Discord


LlamaIndex Discord


MCP (Glama) Discord


Yannick Kilcher Discord


tinygrad (George Hotz) Discord


DSPy Discord


Modular (Mojo 🔥) Discord


Nomic.ai (GPT4All) Discord


Torchtune Discord


LLM Agents (Berkeley MOOC) Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Unsloth AI (Daniel Han) ▷ #general (621 messages🔥🔥🔥):

LoRA Fine-Tuning, Model Training and RAG, PDF Data Extraction, AI Hardware Support, Model Evaluation and Performance

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (9 messages🔥):

Wendell's Unsloth shoutouts, RAG Implementation, DeepSeek's AI release

Link mentioned: Embrace the Coming AI Revolution with Safe Local AI!: Deepseek's release has shaken up the AI world, and we're on the precipice of the AI Industrial Revolution! Wendell gives you the low down on how to take that...


Unsloth AI (Daniel Han) ▷ #help (244 messages🔥🔥):

DeepSeek R1 Performance, Training with LoRA and RAG, GRPO Reward Function Issues, Model Compatibility with TPU, HPC Cluster Training Errors

Links mentioned:


Unsloth AI (Daniel Han) ▷ #research (9 messages🔥):

RLHF Reward Modeling, Tülu 3 GRPO, Multi GPU Support in Unsloth, OLMoE Improvements, New Optimizer

Links mentioned:


Codeium (Windsurf) ▷ #announcements (2 messages):

AI Engineering Summit Tickets, Windsurf Wave 3 Features, Model Context Protocol, Customizable App Icons, Turbo Mode Enhancements

Links mentioned:


Codeium (Windsurf) ▷ #discussion (31 messages🔥):

Announcement speculation, Codeium extension behaviors, Windsurf frustrations, Feature requests for the extension, User support for Codeium


Codeium (Windsurf) ▷ #windsurf (622 messages🔥🔥🔥):

Cascade Base issues, MCP Server Configuration, Windsurf Performance, User Experience with Windsurf, Codeium Support Feedback

Links mentioned:


Perplexity AI ▷ #announcements (1 message):

Perplexity Deep Research, Deep Research features, Free queries, App availability, Research capabilities



Perplexity AI ▷ #general (601 messages🔥🔥🔥):

Perplexity Deep Research, AI Model Performance, Feedback on Subscription Plans, User Experience with Models, Issues with Deep Research Search

Links mentioned:


Perplexity AI ▷ #sharing (18 messages🔥):

Daily Omega-3 Dose, Inflation Trends, Musk's Bid on OpenAI, ChatGPT Energy Consumption, N8N JavaScript Usage



Perplexity AI ▷ #pplx-api (5 messages):

Sonar API Beta Testing, Aider and DeepSeek V3 Integration, Cheap Coding Workflow, Business Use Case for Perplexity API, Deep Research API Feature


HuggingFace ▷ #general (37 messages🔥):

Embedding Models, Vision Transformer Dimension, Open Deep Research Demo, Speech to Text Using Deepgram, User Interface Concerns

Link mentioned: Hmmm Thinking GIF - Hmmm Thinking Batman - Discover & Share GIFs: Click to view the GIF


HuggingFace ▷ #today-im-learning (5 messages):

Neuralink Updates, Chat Templates and Transformers, QT Material and Layouts, Agent's Unit 1, Dataset-Tools Development


HuggingFace ▷ #i-made-this (7 messages):

Jokes Generator API, SciNewsBot for BlueSky, Browser Engines and WASM

Links mentioned:


HuggingFace ▷ #reading-group (10 messages🔥):

Technical difficulties, Zoom meeting, Session recording, Presentation feedback

Link mentioned: Join our Cloud HD Video Meeting: Zoom is the leader in modern enterprise video communications, with an easy, reliable cloud platform for video and audio conferencing, chat, and webinars across mobile, desktop, and room systems. Zoom ...


HuggingFace ▷ #computer-vision (1 message):

Canny Edge Detection, Sobel Filters, Machine Learning in Preprocessing, ControlNet with Diffusion Models


HuggingFace ▷ #NLP (10 messages🔥):

Qwen Model Performance, Fine-tuning Issues, End Token Generation, Quality of Training Data, Chat Templates Knowledge

Links mentioned:


HuggingFace ▷ #smol-course (2 messages):

HF_TOKEN definition, Model changes


HuggingFace ▷ #agents-course (284 messages🔥🔥):

Course Introduction, Certificate Issues, Collaborative Learning, Agent Development, LLM Exploration

Links mentioned:


HuggingFace ▷ #open-r1 (8 messages🔥):

DeepSeek V3, Granite 3.2 MoE, ESFT Paper Review, Community Call Discussions

Links mentioned:


Cursor IDE ▷ #general (333 messages🔥🔥):

Cursor IDE usability, AI model performance, MCP server usage, Subscription issues, Tool integration struggles

Links mentioned:


LM Studio ▷ #general (154 messages🔥🔥):

Error Handling in LM Studio, Performance Comparison of Models, Headless Mode in LM Studio, Speculative Decoding Support, Model Architecture Changes

Links mentioned:


LM Studio ▷ #hardware-discussion (172 messages🔥🔥):

AMD ROCm promotion, NVIDIA RTX 3500 Ada, 2023 AI hardware market, Stability issues with new hardware, VRAM performance with multiple GPUs

Links mentioned:


Nous Research AI ▷ #announcements (2 messages):

DeepHermes-3 Preview, Long Chain of Thought Reasoning, LLM Model Improvements, Community Feedback on Reasoning Models

Links mentioned:


Nous Research AI ▷ #general (217 messages🔥🔥):

DeepHermes-3 Preview, Deepfake Technology Discussions, Training and Fine-tuning Models, Model Performance Comparisons, Technical Issues with Models

Links mentioned:


Nous Research AI ▷ #ask-about-llms (13 messages🔥):

SFT on Llama-3B-Instruct, Fine-tuning local AI, Training costs of language models, 1.5-Pints technical report

Link mentioned: 1.5-Pints Technical Report: Pretraining in Days, Not Months – Your Language Model Thrives on Quality Data: no description found


Nous Research AI ▷ #research-papers (3 messages):

LLM report papers, Ultra-sparse memory networks, Kimik and Synthlab papers, Inference speed in LLMs

Link mentioned: Ultra-Sparse Memory Network: It is widely acknowledged that the performance of Transformer models is logarithmically related to their number of parameters and computational complexity. While approaches like Mixture of Experts...


Nous Research AI ▷ #research-papers (3 messages):

LLM report papers, Ultra-sparse memory network, Mixture of Experts, Scaling laws

Link mentioned: Ultra-Sparse Memory Network: It is widely acknowledged that the performance of Transformer models is logarithmically related to their number of parameters and computational complexity. While approaches like Mixture of Experts...


Eleuther ▷ #general (11 messages🔥):

EleutherAI Research Contributions, Machine Learning and CS Projects, Identifying People in an Image

Link mentioned: Tweet from ⚠️ Igor Brigadir 🇺🇦 (@IgorBrigadir): Everyone in this chart: Top Left: @fchollet @raphaelmilliere @GaryMarcus @tyrell_turing @ylecun @rohinmshah Top Right: @sama @soniajoseph_ @ID_AA_Carmack @tszzl @demishassabis @michael_nielsen @sea_snel...


Eleuther ▷ #research (208 messages🔥🔥):

Attention Mechanisms, Scaling Laws in LLMs, Hybrid Architectures in Transformers, Forgetting Transformer, Long-Context Performance

Links mentioned:


Eleuther ▷ #interpretability-general (3 messages):

OpenAI Deep Research, ML/AI Literature Reviews, Research Grounding Issues


GPU MODE ▷ #general (8 messages🔥):

Profiling talk recording, Zoom session feedback, YouTube stream


GPU MODE ▷ #triton (11 messages🔥):

Fused MM Activation Implementation, GEMM Performance Insights, Kernel Caching Strategies, Triton Conference 2025, CUDA Thread Inquiry


GPU MODE ▷ #cuda (19 messages🔥):

Tensor Memory Management, GPU Access Issues, Torch Distributed Training Errors, CUDA Technology Relevance

Links mentioned:


GPU MODE ▷ #torch (2 messages):

Fast Hadamard Transform, SageAttention2, Hugging Face Transformers ONNX issue

Links mentioned:


GPU MODE ▷ #announcements (1 message):

NVIDIA Profiling Tools, Magnus Strengert Talk

Link mentioned: Join our Cloud HD Video Meeting: Zoom is the leader in modern enterprise video communications, with an easy, reliable cloud platform for video and audio conferencing, chat, and webinars across mobile, desktop, and room systems. Zoom ...


GPU MODE ▷ #cool-links (1 message):

Roofline Model, Hierarchical Analysis


GPU MODE ▷ #jobs (1 message):

Oumi AI, Open-source models, ML performance engineers hiring, Collaborative AI development

Links mentioned:


GPU MODE ▷ #beginner (2 messages):

Fine-tuning Transformer Models, Colab GPU Issues, Alternatives to Colab for Training, Modal Platform Discussion


GPU MODE ▷ #torchao (1 message):

mubappe.: Yes resolved, thanks


GPU MODE ▷ #off-topic (8 messages🔥):

Llama 3.3 License Issues, Llama Model Availability, Documentation and Code Sharing

Links mentioned:


GPU MODE ▷ #liger-kernel (1 message):

User Defined Kernels, FSDP Usage


GPU MODE ▷ #self-promotion (19 messages🔥):

CUDA Kernel Optimizations, Low Bit Training Presentation, FP8 Training, Cohere AI YouTube Webinar, Polish LLM Training Pipeline

Links mentioned:


GPU MODE ▷ #avx (1 message):

alint5215: it beats openblas now.


GPU MODE ▷ #🍿 (5 messages):

Inference-time scaling, DeepSeek-R1 model, CuDNN frontend for flex attention, NVIDIA's performance benchmarks

Links mentioned:


GPU MODE ▷ #reasoning-gym (114 messages🔥🔥):

Futoshiki dataset updates, Eval architecture discussions, Whitespace in answers, Scoring methods, Evaluation process improvements

Links mentioned:


OpenRouter (Alex Atallah) ▷ #announcements (13 messages🔥):

API usage field update, Tokenization across models, Provider outages, OpenAI model availability, Model suffixes and functionalities

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (163 messages🔥🔥):

DeepSeek R1 Performance, Error Issues with API Keys, Self-Moderated OpenAI Endpoints, New Model Introductions, Rate Limiting Concerns

Links mentioned:


OpenAI ▷ #ai-discussions (122 messages🔥🔥):

Perplexity Deep Research, AI Model Opinions, Use of Wolfram Alpha, ChatGPT User Experience, AI News Sources


OpenAI ▷ #gpt-4-discussions (10 messages🔥):

Free Plan Limits, GPT Store Publishing, Privacy Policy Requirement


OpenAI ▷ #prompt-engineering (10 messages🔥):

Using ChatGPT vs Playground, Interpretation of prompts, JSON vs plain text formats, Legislative writing with AI, Importance of human oversight


OpenAI ▷ #api-discussions (10 messages🔥):

Using ChatGPT vs Playground, Interpreting prompts, Prompt formats, Legislative writing prompts, AI model confidence


Stability.ai (Stable Diffusion) ▷ #general-chat (135 messages🔥🔥):

Stable Diffusion Models, LoRA Training Tips, Audio Device Recognition, Controlled Image Generation, Community Engagement

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (53 messages🔥):

Open Weight Definition, DeepHermes-3 Preview, EnigmaEval Launch, AI Security Institute, xAI Data Center Plans

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-questions (8 messages🔥):

NotebookLM performance, GPT-5 model interface, reasoning models training


Interconnects (Nathan Lambert) ▷ #ml-drama (2 messages):

DH3 Evaluation Metrics, Distill Release Comparison, Company Legitimacy Discussions

Link mentioned: Tweet from kalomaze (@kalomaze): dh3 notes 1. they only show these two specific evals for the "reasoning on"; the "reasoning off" chart is the only one showing all metrics 2. they don't compare to the official 8b di...


Interconnects (Nathan Lambert) ▷ #random (41 messages🔥):

Boomer prompts and O-series models, DeepSeek-R1 deployment, Academic writing evolution, David Perell's writing advice, Tülu 3 presentation at DLCT

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (4 messages):

Alignment discussions, OpenAI O1 Pro mode inquiries

Links mentioned:


Interconnects (Nathan Lambert) ▷ #rl (7 messages):

Reasoning model ideas, Fun experiments in AI, GRPO and KL=0, Training metrics and entropy


Interconnects (Nathan Lambert) ▷ #reads (9 messages🔥):

Zvi's writing style, Long-form content, Historical commentary, AI perspective on commentary


Notebook LM ▷ #use-cases (11 messages🔥):

Notebook LM for Study, Gen Z Social Media Slang Customization, Quality of Responses in Notebook LM, Use Cases for Fantasy Novel Writing, Podcast Functionality Queries


Notebook LM ▷ #general (75 messages🔥🔥):

Notebook LM Language Support, Notebook LM PDF Upload Issues, Notebook LM Subscription Changes, Notebook LM Document Sharing, Gemini Model Functionality

Links mentioned:


Latent Space ▷ #ai-general-chat (58 messages🔥🔥):

Latent Reasoning in LLMs, Veo 2 Video Generation Model, New Apple Products, DeepHermes 3 LLM, Beekeeping Feasibility Report

Links mentioned:


Latent Space ▷ #ai-announcements (1 message):

swyxio: new pod drop! https://x.com/latentspacepod/status/1890101440615453025


LlamaIndex ▷ #blog (2 messages):

LlamaIndex Google Cloud Integration, LlamaParse Features


LlamaIndex ▷ #general (49 messages🔥):

AgentWorkflow for RAG, Using uv for virtual environments, LlamaIndex updates, Outdated packages management, Python functions in workflows

Links mentioned:


LlamaIndex ▷ #ai-discussion (3 messages):

Model Finetuning, AI Community Growth, Quantum Education Initiatives

Links mentioned:


MCP (Glama) ▷ #general (44 messages🔥):

OpenRouter vs Glama, Issues with OpenWebUI, Using 0.0.0.0 in Networking, Instructions for Setup, Community Discussions on MCP Server Roles

Links mentioned:


MCP (Glama) ▷ #showcase (4 messages):

Zonos TTS MCP, Intonation Control for Claude, Use of SSML Tags, Markdown vs SSML, Text-to-Speech Models

Link mentioned: GitHub - PhialsBasement/Zonos-TTS-MCP: MCP server that allows Claude to have a voice.: MCP server that allows Claude to have a voice. Contribute to PhialsBasement/Zonos-TTS-MCP development by creating an account on GitHub.


Yannick Kilcher ▷ #general (38 messages🔥):

Evaluating RAG Systems, Tinystories Pretraining, Generative Models and RL, Pretraining on Consumer Hardware, Logits in Model Pipelines

Link mentioned: Minimum Width for Universal Approximation: The universal approximation property of width-bounded networks has been studied as a dual of classical universal approximation results on depth-bounded networks. However, the critical width...


Yannick Kilcher ▷ #paper-discussion (2 messages):

Weekly Crunch Time, Future Meeting Plans


Yannick Kilcher ▷ #agents (2 messages):

New AI Model without tokens, Latent space reasoning model, 3.5 billion parameter model

Links mentioned:


Yannick Kilcher ▷ #ml-news (2 messages):

Elon Musk, Open PTLMs, Model Registry Challenges

Link mentioned: Towards Semantic Versioning of Open Pre-trained Language Model Releases on Hugging Face: The proliferation of open Pre-trained Language Models (PTLMs) on model registry platforms like Hugging Face (HF) presents both opportunities and challenges for companies building products around them....


tinygrad (George Hotz) ▷ #general (15 messages🔥):

PR Submission Guidelines, Kernel and OptOps Understanding, VIZ on WSL Issues


DSPy ▷ #general (11 messages🔥):

DSPy vs LangChain, DSPy 2.6 Change Log, Removal of DSPy Assertions, Multi-label Classification with DSPy, DSPy Code Golf

Link mentioned: Tweet from Omar Khattab (@lateinteraction): Sometimes I look for an excuse to spend some 5 minutes on some neat DSPy golf. Someone was asking: How do I use DSPy for extraction of structured data from HTML? Hmm, but that's a one-liner. What if...


Modular (Mojo 🔥) ▷ #general (1 message):

Valentine's Day, MAX and Mojo


Modular (Mojo 🔥) ▷ #mojo (7 messages):

Memory Error in Function Call, Release of v25.1, Dog/Cat Example Confusion, Larecs GitHub Repository, Safe Mutable Aliasing Document


Nomic.ai (GPT4All) ▷ #general (8 messages🔥):

Token banning in configuration, DeepSeek model recommendations, Fine-tuning LLMs, TradingView access



Torchtune ▷ #dev (3 messages):

RFC for Dataloader Transform, Online DPO/GRPO Data Generation, Prompt to Preference Function


Torchtune ▷ #papers (2 messages):

Distillation Scaling Laws, Quantization-Aware Training, QuEST Method, Sparse Representations in LLMs

Links mentioned:


Cohere ▷ #discussions (4 messages):

Cohere Command R+


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (2 messages):

Quiz 3 release


LLM Agents (Berkeley MOOC) ▷ #mooc-lecture-discussion (1 message):

AI/ML Guidance, Model Training Techniques




{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}