Frozen AI News archive

Mistral Small 3 24B and Tulu 3 405B

**Mistral AI** released **Mistral Small 3**, a **24B parameter** model optimized for low-latency local inference with **81% accuracy on MMLU**, competing with **Llama 3.3 70B**, **Qwen-2.5 32B**, and **GPT4o-mini**. **AI2** released **Tülu 3 405B**, a large finetune of **Llama 3** using Reinforcement Learning from Verifiable Rewards (RLVR), competitive with **DeepSeek v3**. **Sakana AI** launched **TinySwallow-1.5B**, a Japanese language model using **TAID** for on-device use. **Alibaba_Qwen** released **Qwen 2.5 Max**, trained on **20 trillion tokens**, with performance comparable to **DeepSeek V3**, **Claude 3.5 Sonnet**, and **Gemini 1.5 Pro**, and updated API pricing. These releases highlight advances in open models, efficient inference, and reinforcement learning techniques.


AI News for 1/29/2025-1/30/2025. We checked 7 subreddits, 433 Twitters and 34 Discords (225 channels, and 7312 messages) for you. Estimated reading time saved (at 200wpm): 744 minutes. You can now tag @smol_ai for AINews discussions!

In a weird twist of fate, the VC-backed Mistral ($1.4b raised to date) and the nonprofit AI2 released a small Apache 2 model and a large model today, but not in the order their funding would lead you to expect.

First, Mistral Small 3, released via their trademark magnet link, but thankfully also a blog post:

image.png

A very nice 2025 update to Mistral's offering optimized for local inference - though one notices that the x axis of their efficiency chart is changing more quickly than the y axis. Internet sleuths have already diffed the architectural differences from Mistral Small 2 (basically scaling up dimensionality but reducing layers and heads for latency):

image.png

Their passage on use cases gives useful insight into why they felt this was worth releasing:

image.png

Next, AI2 released Tülu 3 405B, their large finetune of Llama 3 that uses their Reinforcement Learning from Verifiable Rewards (RLVR) recipe (from the Tulu 3 paper) to make it competitive with DeepSeek v3 in some dimensions:

image.png

Unfortunately there don't seem to be any hosted APIs at launch, so it is hard to try out this beeg model.
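For readers new to the "verifiable rewards" idea behind Tülu 3, here is a minimal sketch of what such a reward could look like for math-style prompts. It is not AI2's implementation: the function names, the last-number answer extraction, and the `bonus` value are all assumptions made for illustration. The only part taken from the recipe is the core idea that the policy is rewarded only when its answer can be checked against a known ground truth.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Hypothetical helper: treat the last number in the text as the answer."""
    # Drop thousands separators so "1,234" is read as one number.
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return numbers[-1] if numbers else None


def verifiable_reward(completion: str, ground_truth: str, bonus: float = 1.0) -> float:
    """Return `bonus` if the extracted answer matches the ground truth, else 0.0.

    The binary, checkable signal stands in for a learned reward model.
    The value of `bonus` is an illustrative assumption.
    """
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0
    return bonus if float(answer) == float(ground_truth) else 0.0


if __name__ == "__main__":
    print(verifiable_reward("Adding them up, the answer is 42.", "42"))
    print(verifiable_reward("I believe it is 41.", "42"))
```

In an RL loop, a score like this would be computed per sampled completion and passed to the policy-gradient update. A real setup would need a far more robust answer parser than the last-number heuristic used here.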


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Gemini 2.0 Flash

Model Releases and Updates

Tools, Benchmarks, and Evaluations

AI Infrastructure and Compute


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Mistral Small 3 Released: Competitive with Larger Models

Theme 2. Nvidia Reduces FP8 Training on RTX 40/50 GPUs

Theme 3. DeepSeek R1 Performance: Effective on Local Rigs

Theme 4. Mark Zuckerberg on Llama 4 Progress and Strategy

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT

Theme 1. DeepSeek-R1's Impact: Technical and Competitive Analysis

Theme 2. Copilot's AI Model Integration and User Feedback

Theme 3. ChatGPT's Latest Updates: User Experience and Technical Changes


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.0 Flash Exp (gemini-2.0-flash-exp)

1. DeepSeek's Rise: Speed, Leaks, and OpenAI Rivalry

2. Small Models Make Big Waves: Mistral and Tülu

3. RAG and Tools: LM Studio and Agent Workflow

4. Hardware and Performance: GPUs and Optimization

5. Funding, Ethics, and Community Buzz


PART 1: High level Discord summaries

Unsloth AI (Daniel Han) Discord


Perplexity AI Discord


Codeium (Windsurf) Discord


OpenAI Discord


LM Studio Discord


aider (Paul Gauthier) Discord


Cursor IDE Discord


Nous Research AI Discord


Yannick Kilcher Discord


Interconnects (Nathan Lambert) Discord


Eleuther Discord


OpenRouter (Alex Atallah) Discord


Stackblitz (Bolt.new) Discord


Stability.ai (Stable Diffusion) Discord


GPU MODE Discord


Nomic.ai (GPT4All) Discord


MCP (Glama) Discord


Notebook LM Discord


Modular (Mojo 🔥) Discord


Latent Space Discord


LLM Agents (Berkeley MOOC) Discord


LlamaIndex Discord


tinygrad (George Hotz) Discord


Cohere Discord


DSPy Discord


Axolotl AI Discord


OpenInterpreter Discord


Torchtune Discord


LAION Discord


MLOps @Chipro Discord


Gorilla LLM (Berkeley Function Calling) Discord


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The HuggingFace Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Unsloth AI (Daniel Han) ▷ #general (1053 messages🔥🔥🔥):

DeepSeek R1 performance, Mistral Small 24B, Fine-tuning strategies, Quantization, Unsloth capabilities

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (16 messages🔥):

Text-to-Image Servers, Model Training Issues, Frontend Imperfections, Sensitive Topic Adjustments


Unsloth AI (Daniel Han) ▷ #help (202 messages🔥🔥):

DeepSeek R1 Models, Fine-tuning Challenges, Quantization Techniques, System Requirements for Models, Inference and Performance

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (2 messages):

Online DPO, Memory consumption in AI, Unsloth project


Unsloth AI (Daniel Han) ▷ #research (13 messages🔥):

Fine-tuning MusicGen, RL-based training frameworks, Unsloth and vllm comparison, Neural magic in vllm, Collaboration between vllm and Unsloth


Perplexity AI ▷ #general (993 messages🔥🔥🔥):

Perplexity Pro features, Model performance comparison, Issues with O1 and R1, DeepSeek functionality, AI usage for academic support

Links mentioned:


Perplexity AI ▷ #sharing (12 messages🔥):

DeepSeek and OpenAI, Alibaba New Model, Doomsday Clock Update, Nike Snakeskin Red Shoes, Near-Earth Asteroid Discovery



Perplexity AI ▷ #pplx-api (4 messages):

Sonar-Reasoning Model Performance, Response Quality Issues, Repeated Answers Concern


Codeium (Windsurf) ▷ #announcements (1 messages):

Cascade Models Update, DeepSeek-R1 and DeepSeek-V3, Input Lag Reductions, Web Search Capabilities, Changelog Insights

Links mentioned:


Codeium (Windsurf) ▷ #discussion (65 messages🔥🔥):

Codeium Issues, DeepSeek vs Sonnet, Windsurf Feature Requests, Cascade Performance, Android Virtual Device

Links mentioned:


Codeium (Windsurf) ▷ #windsurf (707 messages🔥🔥🔥):

DeepSeek R1 Implementation, Windsurf Performance and Issues, Comparison of AI Models, Pricing and Credits, User Experiences with Windsurf and DeepSeek

Links mentioned:


OpenAI ▷ #ai-discussions (474 messages🔥🔥🔥):

DeepSeek vs. OpenAI Models, AI Detectors and Education Solutions, Creative AI Model Performance, Open Source AI Developments, AI Context Windows and Usability

Links mentioned:


OpenAI ▷ #gpt-4-discussions (68 messages🔥🔥):

Next Generation AI Instructions, Memory Function in GPT Models, API and Custom GPT Limitations, OpenAI's Model Release Intentions, Fine-tuning Ollama Models


OpenAI ▷ #prompt-engineering (25 messages🔥):

AI Problem-Solving Limitations, Issues with Visual Recognition, Prompt Construction Tools, Understanding of Math Puzzles


OpenAI ▷ #api-discussions (25 messages🔥):

Challenges with AI Problem Solving, AI Response Length and Quality, Vision Model Limitations, OneClickPrompts Extension, Algebra Discussion on Social Media


LM Studio ▷ #announcements (1 messages):

LM Studio 0.3.9 features, Idle TTL functionality, Reasoning content in API responses, Auto-update for LM runtimes, Support for nested folders in Hugging Face

Links mentioned:


LM Studio ▷ #general (308 messages🔥🔥):

DeepSeek Models, LM Studio Features, RAG in LM Studio, Model Performance and Reasoning, API and UI Discussion

Links mentioned:


LM Studio ▷ #hardware-discussion (203 messages🔥🔥):

DeepSeek Model Performance, Jetson Nano Discussion, Model Selection for Coding, Hardware Configuration for AI, Temperature Settings for Coding

Links mentioned:


aider (Paul Gauthier) ▷ #general (430 messages🔥🔥🔥):

DeepSeek R1 performance, O1 Pro usage, Aider integration with local models, O3 Mini release, Quantization effects on models

Links mentioned:


aider (Paul Gauthier) ▷ #questions-and-tips (45 messages🔥):

Aider context inclusion, Azure AI deployment issues, Model configuration challenges, Using Aider in different modes, File creation prompts in Aider

Link mentioned: Advanced model settings: Configuring advanced settings for LLMs.


aider (Paul Gauthier) ▷ #links (11 messages🔥):

DeepSeek database leak, Aider Read-Only Stubs, Aider Awesome GitHub Repository, Pull Request Improvements, Bash One-Liners

Links mentioned:


Cursor IDE ▷ #general (456 messages🔥🔥🔥):

DeepSeek R1, MCP Support, Token Usage in Chat and Composer, Local Models, Security Risks in AI Models

Links mentioned:


Nous Research AI ▷ #general (295 messages🔥🔥):

Nous x Solana Event, New Model Releases, Psyche and Distributed Learning, Mistral Small Model Announcement, Community Insights on AI Agents

Links mentioned:


Nous Research AI ▷ #ask-about-llms (1 messages):

Autoregressive generation on CLIP embeddings, Multimodal inputs, Stable Diffusion generation


Nous Research AI ▷ #interesting-links (2 messages):

China's AI Models, AI Race, Top-tier Models

Link mentioned: Tweet from Deedy (@deedydas): China's only good AI model is not DeepSeek.There are TEN top tier models all trained from scratch (equal to or better than Europe / Mistral's biggest model).The US has only 5 labs—OpenAI, Anth...


Yannick Kilcher ▷ #general (170 messages🔥🔥):

Reinforcement Learning vs. Deep Learning, DeepSeek developments, Learning strategies in LLMs, Pretraining and fine-tuning frameworks, Educational analogies for LLM training

Links mentioned:


Yannick Kilcher ▷ #paper-discussion (39 messages🔥):

OpenAI Allegations, AI Technology Concerns, Dario Amodei's Blog Post, AI Safety Funding, Daily Paper Discussion

Links mentioned:


Yannick Kilcher ▷ #agents (7 messages):

PydanticAI, LlamaIndex, LangChain, Model Performance, Future Agent Frameworks


Yannick Kilcher ▷ #ml-news (64 messages🔥🔥):

DeepSeek IP Controversy, EU AI Strategy Reactions, Mistral Small 3 Launch, Tülu 3 405B Release, Multi-Language Training Challenges

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (65 messages🔥🔥):

Tülu 3 405B Launch, Mistral Small 3 Announcement, DeepSeek Database Exposure, OpenAI Presentation in Washington, SoftBank Investment Talks with OpenAI

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-drama (28 messages🔥):

Meta's Legal Challenges, V3 Licensing Issues, Concerns about Model Deployment, Impact of Licenses on AI Development


Interconnects (Nathan Lambert) ▷ #random (26 messages🔥):

DeepSeek R1 Launch, Speculations on Model Performance, Quantization in GPT-4, Updates on Tulu 3 Paper, Emerging Reasoning Models

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (9 messages🔥):

Teortaxes commentary, Deepseek R1 training leak, AME(R1)CA version, Mistral Small 3 architecture, Data visualization passion

Links mentioned:


Interconnects (Nathan Lambert) ▷ #rl (11 messages🔥):

Tulu3 data preparation, verl vs GRPOTrainer, open-instruct implementation, HF GRPO limitations, LoRA support in open-instruct


Interconnects (Nathan Lambert) ▷ #reads (65 messages🔥🔥):

DeepSeek Math Paper, Mixture-of-Experts (MoE), Multi Token Prediction (MTP), DeepSeek v3 Architecture, Inferences and Experts Balancing

Links mentioned:


Interconnects (Nathan Lambert) ▷ #posts (2 messages):

Science Phrasing, Opinion on OAI, Metaphors in AI Discourse


Interconnects (Nathan Lambert) ▷ #policy (13 messages🔥):

Training Techniques in AI Models, Concerns about Data Sources, Deepseek Speculations, OpenAI Output Usage, Tulu Dataset in Training


Eleuther ▷ #general (31 messages🔥):

OpenAI's training ethics, RL methods and tool usage, Pythia language model sampling, Concerns over model performance, Tool dependency in LLMs

Link mentioned: Overleaf, Online LaTeX Editor: An online LaTeX editor that’s easy to use. No installation, real-time collaboration, version control, hundreds of LaTeX templates, and more.


Eleuther ▷ #research (178 messages🔥🔥):

Hyperfitting in LLMs, Critique Fine-Tuning, Backdoor Detection, Sampling Neural Networks, Generalization vs Memorization

Links mentioned:


Eleuther ▷ #gpt-neox-dev (2 messages):

DeepSpeed training issues, Intermediate dimension adjustments for gated MLPs, Llama2 config parameters

Links mentioned:


OpenRouter (Alex Atallah) ▷ #announcements (1 messages):

DeepSeek R1 Distill Qwen 32B, DeepSeek R1 Distill Qwen 14B

Links mentioned:


OpenRouter (Alex Atallah) ▷ #app-showcase (6 messages):

Subconscious AI's capabilities, Beamlit's platform features, Discord engagement

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (180 messages🔥🔥):

OpenRouter Pricing Concerns, DeepSeek R1 Model Limitations, Google AI Studio Rate Limits, Provider Issues and Downtimes, New Model Announcements

Links mentioned:


Stackblitz (Bolt.new) ▷ #announcements (1 messages):

Bolt binary asset generation, Token savings, External assets utilization

Link mentioned: Tweet from bolt.new (@boltdotnew): More tokens savings landed!Bolt's agent leverages external assets now instead of allowing the LLM to create new ones from scratch.This saves hundreds of thousands of tokens—and is orders of magnit...


Stackblitz (Bolt.new) ▷ #prompting (10 messages🔥):

Trailing Zeroes Issue with Bolt, File Update Laziness, Employer Signup Form Update, Community Use Cases for System Prompt, Supabase Signup Error Troubleshooting


Stackblitz (Bolt.new) ▷ #discussions (170 messages🔥🔥):

Supabase integration issues, Token usage concerns, Forked project challenges, CORS issue with Supabase functions, SEO meta data handling in React

Links mentioned:


Stability.ai (Stable Diffusion) ▷ #general-chat (178 messages🔥🔥):

ComfyUI Performance and Features, Hardware Discussions for AI Workloads, Reactor Tool for Face Swapping, Stable Diffusion Lora Training, Availability of New GPUs

Links mentioned:


GPU MODE ▷ #general (1 messages):

Decompression Time, Loading Weights from Disk


GPU MODE ▷ #triton (1 messages):

Triton Tensor Indexing, Using tl.gather, InterpreterError


GPU MODE ▷ #cuda (18 messages🔥):

Blackwell architecture features, sm_X features compatibility, Performance comparisons: RTX 5090 vs RTX 4090, PTX ISA documentation, Tensor Operations discussion

Link mentioned: cutlass/media/docs/blackwell_functionality.md at main · NVIDIA/cutlass: CUDA Templates for Linear Algebra Subroutines. Contribute to NVIDIA/cutlass development by creating an account on GitHub.


GPU MODE ▷ #torch (2 messages):

PyTorch 2.6 release, FP16 support on X86, Deprecating Conda, Manylinux 2.28 build platform

Link mentioned: PyTorch 2.6 Release Blog: We are excited to announce the release of PyTorch® 2.6 (release notes)! This release features multiple improvements for PT2: torch.compile can now be used with Python 3.13; new performance-related kno...


GPU MODE ▷ #jobs (1 messages):

GPU Kernel Engineers, GPU Compiler Engineers, Next Gen ML Compiler, Job Openings

Link mentioned: Notion – The all-in-one workspace for your notes, tasks, wikis, and databases.: A new tool that blends your everyday work apps into one. It's the all-in-one workspace for you and your team


GPU MODE ▷ #beginner (6 messages):

C++ versions, CUDA compatibility


GPU MODE ▷ #off-topic (7 messages):

RTX 5090 Availability, Homemade Meal Creations, Novelty Plates


GPU MODE ▷ #irl-meetup (1 messages):

In-Person Events, Discord Channel Updates


GPU MODE ▷ #llmdotc (4 messages):

ROOK blog post, Progress updates, Modding projects for WoW

Link mentioned: ROOK: Reasoning Over Organized Knowledge | LAION: <p>The field of artificial intelligence has long used strategic reasoning tasks as benchmarks for measuring and advancing AI capabilities. Chess, with its in...


GPU MODE ▷ #bitnet (1 messages):

Implementing new kernel languages, Low-precision kernels, Future benefits of learning


GPU MODE ▷ #self-promotion (15 messages🔥):

Mistral AIx Game Jam results, Parental Control Game, Voice Command Features, Flash Attention Implementation, Llama3-8B R1 Model Improvements

Links mentioned:


GPU MODE ▷ #thunderkittens (3 messages):

CUDA versions and TK kernels, Support for Nvidia P100 GPU


GPU MODE ▷ #arc-agi-2 (87 messages🔥🔥):

Reasoning Gym Datasets, Game of Life Challenges, Collaborative Problem-Solving, Codenames Game Mechanics, Murder Mystery Environment

Links mentioned:


Nomic.ai (GPT4All) ▷ #general (74 messages🔥🔥):

DeepSeek models, Running models with GPT4All, Integrating Ollama with GPT4All, Local document management, AI education tools

Links mentioned:


MCP (Glama) ▷ #general (47 messages🔥):

MCP Server Integration, Self-Hosted Web Clients, Cursor MCP Support, Environment Variables for MCP, Function Calling Issues

Link mentioned: env invocation (GNU Coreutils 9.6): no description found


MCP (Glama) ▷ #showcase (12 messages🔥):

Hataraku SDK Proposal, TypeScript CLI Development, Collaborative Development, User Testing Feedback

Link mentioned: hataraku/docs/sdk-proposal.md at main · turlockmike/hataraku: An autonomous coding agent and SDK for building AI-powered development tools - turlockmike/hataraku


Notebook LM Discord ▷ #announcements (1 messages):

NotebookLM Usability Study, User Experience Feedback

Link mentioned: Participate in an upcoming Google UXR study!: Hello,I’m contacting you with a short questionnaire to verify your eligibility for an upcoming usability study with Google. This study is an opportunity to provide feedback on something that's cur...


Notebook LM Discord ▷ #use-cases (5 messages):

Using AI for learning, NotebookLM Audio Overview, DeepSeek R1, Transcription for understanding, Explaining concepts in different terms

Links mentioned:


Notebook LM Discord ▷ #general (52 messages🔥):

NotebookLM Features and Performance, Audio Generation Feedback, Gemini Updates, User Experience Issues, Podcast Insights

Links mentioned:


Modular (Mojo 🔥) ▷ #announcements (1 messages):

Branch Changes, Pull Requests


Modular (Mojo 🔥) ▷ #mojo (48 messages🔥):

NeoVim LSP integration, Mojo 1.0 discussions, Backwards compatibility concerns, Reflection in Mojo, Benchmarking Mojo performance

Link mentioned: Mojo🔥: a deep dive on ownership with Chris Lattner: Learn everything you need to know about ownership in Mojo, a deep dive with Modular CEO Chris LattnerIf you have any questions make sure to join our friendly...


Latent Space ▷ #ai-general-chat (38 messages🔥):

Mistral Small 3, DeepSeek Database Leak, Riffusion's New Model, OpenAI API Latency Monitoring, ElevenLabs Series C Funding

Links mentioned:


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (8 messages🔥):

Tracks information, Sign up responses, Quiz 1 release, LLM Agents Quiz Repo, Certificate updates

Link mentioned: Quizzes Archive - LLM Agents MOOC: NOTE: The correct answers are in the black boxes (black text on black background). Highlight the box with your cursor to reveal the correct answer (or copy the text into a new browser if it’s hard to ...


LLM Agents (Berkeley MOOC) ▷ #mooc-lecture-discussion (5 messages):

Lecture Uploads, Berkeley Policies on Accessibility, Lecture Access via Website


LlamaIndex ▷ #blog (2 messages):

AI Agent Workshop, LlamaIndex on BlueSky

Link mentioned: LlamaIndex (@llamaindex.bsky.social): The framework for connecting LLMs to your data.


LlamaIndex ▷ #general (10 messages🔥):

LlamaIndex support for o1, O1 streaming issues, OpenAI model capabilities

Link mentioned: Streaming support for o1 (o1-2024-12-17) (resulting in 400 "Unsupported value"): Hello, it appears that streaming support was added for o1-preview and o1-mini (see announcement OpenAI o1 streaming now available + API access for tiers 1–5). I confirm that both work for me. Howe...


tinygrad (George Hotz) ▷ #general (11 messages🔥):

NVIDIA GPUs and Hypervisors, Interconnecting Tiny Boxes, VRAM Sharing Techniques, Performance of Tiny Boxes, Physical Server Choices for LLMs


tinygrad (George Hotz) ▷ #learn-tinygrad (1 messages):

Sample Code for Blocked/Fused Programs, Tensor Operations


Cohere ▷ #discussions (3 messages):

AI Emotional Response, Humanizing AI, Perception of AI Models


Cohere ▷ #api-discussions (1 messages):

Support Tickets, Discord Channel Communication


Cohere ▷ #cmd-r-bot (8 messages🔥):

command-r7b, Command R model, distillation frameworks


DSPy ▷ #general (6 messages):

Adding proxy to dspy.LM adapter, Supported LLMs in DSPy, Setting litellm client with http_client, Documentation references, LiteLLM model support

Links mentioned:


Axolotl AI ▷ #general (6 messages):

Axolotl for KTO, New Mistral model, User tasks and feature requests, Winter semester calendar, Mistral AI open source commitment

Link mentioned: mistralai/Mistral-Small-24B-Base-2501 · Hugging Face: no description found


OpenInterpreter ▷ #general (3 messages):

Farm Friend, Cliche Reviews


Torchtune ▷ #dev (2 messages):

DCP Checkpointing, Config Settings


LAION ▷ #general (2 messages):

img2vid tools, ltxv


MLOps @Chipro ▷ #events (1 messages):

MLOps Workshop, Feature Store on Databricks, Q&A Session, Data Engineering, Geospatial Analytics

Link mentioned: MLOps Workshop: Building a Feature Store on Databricks: Join our 1-hr webinar with Featureform's founder to learn how to empower your data by using Featureform and Databricks!


Gorilla LLM (Berkeley Function Calling) ▷ #discussion (1 messages):

glitchglitchglitch: what do we need to do to make the bfcl data hf datasets compliant?




{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}