Frozen AI News archive

not much happened today

**rStar-Math** surpasses **OpenAI's o1-preview** in math reasoning with **90.0% accuracy** using a **7B LLM** and **MCTS** with a **Process Reward Model**. **Alibaba** launches **Qwen Chat** featuring **Qwen2.5-Plus** and **Qwen2.5-Coder-32B-Instruct** models enhancing vision-language and reasoning. **Microsoft** releases **Phi-4**, trained on **40% synthetic data** with improved pretraining. **Cohere** introduces **North**, a secure AI workspace integrating **LLMs**, **RAG**, and automation for private deployments. **LangChain** showcases a company research agent with multi-step workflows and open-source datasets. **Transformers.js** demos were released for text embeddings and image segmentation in JavaScript. Research highlights include **Meta-CoT** (Meta Chain-of-Thought) for enhanced chain-of-thought reasoning, **DeepSeek V3** with recursive self-improvement, and collaborative AI development platforms. Industry partnerships include **Rakuten** with **LangChain**, **North** with **RBC** supporting 90,000 employees, and **Agent Laboratory** collaborating with **AMD** and **Johns Hopkins**. Technical discussions emphasize **CUDA** and **Triton** for AI efficiency, along with **Andrew Ng**'s evolving AI-assisted coding stack.

AI News for 1/8/2025-1/9/2025. We checked 7 subreddits, 433 Twitters and 32 Discords (219 channels, and 2928 messages) for you. Estimated reading time saved (at 200wpm): 312 minutes. You can now tag @smol_ai for AINews discussions!

Congrats to all seven billionaire cofounders of Anthropic.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

All recaps done by Claude 3.5 Sonnet, best of 4 runs.

AI Models & Benchmarks

AI Tools & Platforms

AI Research & Studies

AI Industry Partnerships

Technical Discussions & Development

Memes & Humor

AI Community & Events


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Groq's Handling of Models: Insights and Comparisons

Theme 2. Phi-4 Performance: Benchmark vs Real-World Tasks

Theme 3. NVIDIA Project DIGITS Memory Bandwidth Speculation

Theme 4. TransPixar: Transparency-Preserving Generative Models

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT

Theme 1. Salesforce's AI Strategy: Ending Software Engineer Hires by 2025

Theme 2. ChatGPT Losing It: Recognizing Anthropic-Type Mistakes

Theme 3. Conspiracy Claims: OpenAI's Erasure of Former Employee Data


AI Discord Recap

A summary of Summaries of Summaries by o1-2024-12-17

Theme 1. Model Showdowns & Surprises

Theme 2. Coding Tools & HPC Upgrades

Theme 3. Cutting-Edge Prompting & Decoding

Theme 4. HPC & GPU Revelations

Theme 5. Big Hackathons & Corporate Shifts


PART 1: High level Discord summaries

Stability.ai (Stable Diffusion) Discord


Unsloth AI (Daniel Han) Discord


Codeium (Windsurf) Discord


Cursor IDE Discord


Stackblitz (Bolt.new) Discord


aider (Paul Gauthier) Discord


Notebook LM Discord


LM Studio Discord


OpenAI Discord


Interconnects (Nathan Lambert) Discord


Eleuther Discord


GPU MODE Discord


Nous Research AI Discord


Latent Space Discord


OpenRouter (Alex Atallah) Discord


Perplexity AI Discord


Cohere Discord


tinygrad (George Hotz) Discord


Nomic.ai (GPT4All) Discord


LlamaIndex Discord


Modular (Mojo 🔥) Discord


LLM Agents (Berkeley MOOC) Discord


OpenInterpreter Discord


LAION Discord


DSPy Discord


AI21 Labs (Jamba) Discord


Torchtune Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Axolotl AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The HuggingFace Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Stability.ai (Stable Diffusion) ▷ #general-chat (719 messages🔥🔥🔥):

ComfyUI Features, OpenPose Control in Pony, Electrical Outages Impact on SD, Updates for AI Tools, Using AMD GPUs with Different Interfaces


Unsloth AI (Daniel Han) ▷ #general (393 messages🔥🔥):

Phi-4 Bug Fixes, Unsloth Model Deployment, Chat Templates in LLMs, Adapting Models for Inference, Quantization Impacts


Unsloth AI (Daniel Han) ▷ #off-topic (3 messages):

Job Search Success, Funny GIFs

Link mentioned: Amogus6969 GIF - Amogus6969 - Discover & Share GIFs: Click to view the GIF


Unsloth AI (Daniel Han) ▷ #help (48 messages🔥):

Mathstral-7B-v0.1 Model Limitations, Model Suggestions for Tabular Calculations, Training for Longer Contexts, Merging LoRA Models, Classical ML for Name Splitting


Unsloth AI (Daniel Han) ▷ #research (4 messages):

DLSS for Language Models, Speculative Decoding


Codeium (Windsurf) ▷ #discussion (125 messages🔥🔥):

Codeium Self-Hosted Version, Windsurf Performance Issues, Cascade Model Benefits, Custom Model Training, Prompt Credit Usage


Codeium (Windsurf) ▷ #windsurf (140 messages🔥🔥):

Windsurf Installation Experiences, Cascade Panel Issues, Flow Credits and Billing Concerns, Agent Integration with Windsurf, Update Feedback


Cursor IDE ▷ #general (246 messages🔥🔥):

Cursor composer issues, Claude performance, Cursor rules usage, Community feedback, Cursor documentation


Stackblitz (Bolt.new) ▷ #prompting (11 messages🔥):

Prompting Techniques, Payment System Issues, Public Repos Feature, Sleep Schedule Jokes, Subreddit AI Promotional Post


Stackblitz (Bolt.new) ▷ #discussions (211 messages🔥🔥):

Bolt Performance Issues, PWA Development, Token Management, Integration Concerns, GitHub Deployment Issues


aider (Paul Gauthier) ▷ #general (66 messages🔥🔥):

AI Editor Comparisons, Aider and O1 Performance, Discussion on AI Capabilities, Development Contributions to Aider, Usage of OpenAI Models


aider (Paul Gauthier) ▷ #questions-and-tips (61 messages🔥🔥):

Aider Configuration Issues, OpenAI Model Access, DeepSeek Performance, Task Management Techniques, Context Management in AI Models


aider (Paul Gauthier) ▷ #links (1 message):

Gemini 2.0 Flash Experimental, App Development Assistance, Voice Mode Interaction


Notebook LM Discord ▷ #use-cases (19 messages🔥):

DeepResearch Reports, Quotation Mode in NotebookLM, Podcasts from English to Mandarin, System Prompt in NotebookLM Plus, Creative Podcasting Prompts


Notebook LM Discord ▷ #general (94 messages🔥🔥):

Notebook LM Usage Issues, Podcast Generation, Workspace License Troubles, AI Tool Features, Language Support


LM Studio ▷ #general (66 messages🔥🔥):

LM Studio related issues, Model loading problems, Directory structure for models, Announcement of Qwen Chat, Insights on LLM application development


LM Studio ▷ #hardware-discussion (33 messages🔥):

AMD RX 7900XT performance, External GPU options for MacBook Pro, Finding system bottlenecks, Memory configuration for ML models, Availability of DIGITS


OpenAI ▷ #ai-discussions (60 messages🔥🔥):

TensorFlow GPU Issues, Model Safety Concerns, Best YouTube Channels for Machine Learning, Jupyter Notebook vs Python File, Environment Setup for TensorFlow


OpenAI ▷ #gpt-4-discussions (7 messages):

GPT code handling, Graph generation


OpenAI ▷ #prompt-engineering (13 messages🔥):

Meta-Prompting, Investor Round for Hassabis, Prompt Engineering Concepts, OpenAI's Financial Returns


OpenAI ▷ #api-discussions (13 messages🔥):

Meta-Prompting, OpenAI's Approach to Prompting, Investor Round for Hassabis, Community Engagement in AI, Creating Effective Prompts


Interconnects (Nathan Lambert) ▷ #events (3 messages):

ICLR Attendance, Meetup Details


Interconnects (Nathan Lambert) ▷ #news (19 messages🔥):

rStar-Math improvements, O1 vs GPT4o + MCTS, Qwen Chat launch, Chinese AI interview insights


Interconnects (Nathan Lambert) ▷ #other-papers (21 messages🔥):

NuminaMath dataset, Lead authors' backgrounds, Psychology and business degrees, Quality of open data, High school competition

Link mentioned: Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought: We propose a novel framework, Meta Chain-of-Thought (Meta-CoT), which extends traditional Chain-of-Thought (CoT) by explicitly modeling the underlying reasoning required to arrive at a particular CoT....


Interconnects (Nathan Lambert) ▷ #ml-questions (11 messages🔥):

Complexity in Large Scale Models, Transformers vs MoEs


Interconnects (Nathan Lambert) ▷ #random (17 messages🔥):

AI Alignment Discussion, Post-training Model Shaping, Imposter Syndrome in AI Fields, Blog Publishing Challenges

Link mentioned: How difficult is AI alignment? | Anthropic Research Salon: At an Anthropic Research Salon event in San Francisco, four of our researchers—Alex Tamkin, Jan Leike, Amanda Askell and Josh Batson—discussed alignment scie...


Interconnects (Nathan Lambert) ▷ #reads (3 messages):

Efficient Deep Learning, Pop-up Interference, Navigation Issues on the Blog

Link mentioned: Alex L. Zhang | A Meticulous Guide to Advances in Deep Learning Efficiency over the Years: A very long and thorough guide how deep learning algorithms, hardware, libraries, compilers, and more have become more efficient.


Interconnects (Nathan Lambert) ▷ #posts (14 messages🔥):

AI Cost Concerns, Open Source AI, Policy Maker Reactions

Link mentioned: Tweet from Teortaxes▶️ (@teortaxesTex): @natolambert I agree on substance but why do you present this as some debunking? They say right there that GPU-hours*$/hr does not include their total capex, R&D expenses, or data gen.(and it's me...


Eleuther ▷ #general (33 messages🔥):

GPT-NeoX vs Nvidia NeMo, SmolLM Corpus Upload, SciAgents Research Discussion, Modal for Model Training, DL Framework Usability vs Performance


Eleuther ▷ #research (42 messages🔥):

Grokking phenomenon, Weight decay in LLMs, Softmax and scaling issues, Alternative loss functions, Attention mechanisms


Eleuther ▷ #gpt-neox-dev (6 messages):

Llama 2 pretraining issues, Memory profiling for GPU usage, Model parallelism configurations, SLURM setups and outputs


GPU MODE ▷ #general (10 messages🔥):

NCU profile comparison, Scam prevention advice, Learning Triton/CUDA, Options for simulated distributed training, Long context benchmarking


GPU MODE ▷ #triton (8 messages🔥):

WGMMA Computation Requirement, Triton Implementations of Fused MLP, Profiling Triton Operations, Proton Profiler, Torch.device for Triton Examples


GPU MODE ▷ #cuda (14 messages🔥):

CUDA Driver Importance, Memory Banking Lectures, Writing CUDA Kernels, Blackwell vs. Hopper in CUDA, CUDA File Upload Tips


GPU MODE ▷ #jobs (2 messages):

Nectar Social job openings, European consultancy in GPU and HPC


GPU MODE ▷ #beginner (3 messages):

CUDA installation on Ubuntu, Starting AI projects on MacBook without NVIDIA GPU, Using cloud providers for CUDA

Link mentioned: CUDA Installation Guide for Linux: no description found


GPU MODE ▷ #off-topic (1 message):

kashimoo: my gf says i sleep talk about CUDA 😭


GPU MODE ▷ #rocm (24 messages🔥):

MI210 Compute Unit Performance, Kernel Launch Optimization, Occupancy Differences in GPUs, Workgroup Size Calculations, RX7900XTX Performance Insights

Link mentioned: Optimizing GPU occupancy and resource usage with large thread groups: Sebastian Aaltonen, co-founder of Second Order Ltd, talks about how to optimize GPU occupancy and resource usage of compute shaders that use large thread groups.


GPU MODE ▷ #self-promotion (1 message):

MicroDiT replication, DCAE autoencoder, MMDIT prompt adherence, Compute grants

Link mentioned: Tweet from sway (@SwayStar123): MicroDiT replication is complete. Download weights here: https://huggingface.co/SwayStar123/MicroDiT/blob/main/no_cfg/microdit_model_epoch_19.pt Inference script here: https://github.com/SwayStar123/mic...


GPU MODE ▷ #🍿 (2 messages):

Alpha competition, Softmax kernel performance


GPU MODE ▷ #thunderkittens (3 messages):

ThunderKittens GitHub repo, Collaboration on kernel development, CPP performance metrics

Link mentioned: ThunderKittens/tests/python at main · HazyResearch/ThunderKittens: Tile primitives for speedy kernels. Contribute to HazyResearch/ThunderKittens development by creating an account on GitHub.


GPU MODE ▷ #arc-agi-2 (7 messages):

ARC Prize Non-Profit Transition, Rejection Sampling Experiment, Text-Domain Exploration, Meta CoT Paper Insights, Positional Encoding Impact


Nous Research AI ▷ #general (47 messages🔥):

Contributing GPU to Training, DisTrO Open Sourcing, DeepSeek V3 Differences, Hermes Model Censorship, Cursor vs IDEs


Nous Research AI ▷ #ask-about-llms (2 messages):

Reducing memory usage in models, Open source function calling models, Function calling accuracy benchmarks


Nous Research AI ▷ #research-papers (3 messages):

Research ideas and papers, Carson Poole's projects

Link mentioned: Carson Poole's Personal Site: no description found


Nous Research AI ▷ #interesting-links (11 messages🔥):

Qwen 7B performance, Self-reflection in models, Math reasoning capabilities, LLMs usefulness in math, Reliability of LLMs

Link mentioned: Tweet from Alex Volkov (Thursd/AI) (@altryne): Ugh guys... Microsoft just made Qwen 7B solve AIME at the level of o1 😵‍💫 They also showed that with their MCTS driver process, there was self-reflection capability like with reasoning models. Will ...


Nous Research AI ▷ #research-papers (3 messages):

Progress on Ideas, Research Ideas List, Carson Poole's Contributions

Link mentioned: Carson Poole's Personal Site: no description found


Latent Space ▷ #ai-general-chat (47 messages🔥):

Salesforce hiring freeze, OpenAI custom instructions update, Anthropic funding round and valuation, Google AI product merger, Moondream model release


OpenRouter (Alex Atallah) ▷ #announcements (1 message):

AI Agent Hackathon, OpenRouter API credits, Live Agent Studio competition, Prize pool increase, Registration details

Link mentioned: oTTomator: no description found


OpenRouter (Alex Atallah) ▷ #general (46 messages🔥):

OpenRouter Performance Issues, O1 API Response Format, Gemini Flash Performance, Hanami Usage, Crypto Payments


Perplexity AI ▷ #announcements (1 message):

CSV Downloads, Table Responses


Perplexity AI ▷ #general (33 messages🔥):

Youzu.ai design tool, Perplexity user issues, Collaboration project proposal, Ecosia partnership inquiry, Perplexity optimization tips


Perplexity AI ▷ #sharing (6 messages):

Toyota exploring rockets, Upcoming video game releases, IndyCar driver averages, Average lifespan of Spaniards, NVIDIA supercomputer for home use


Perplexity AI ▷ #pplx-api (3 messages):

Korean language API usage, Model alternatives to Llama-3.1, Discord discussion links


Cohere ▷ #discussions (2 messages):

North AI Workspace, Cohere Launch Events, Productivity Tools


Cohere ▷ #questions (7 messages):

Command R+ for Generative Models, Guidelines for Upgrading Embeddings, Classification Model Error Handling, Alignment Evals Hackathon


Cohere ▷ #api-discussions (26 messages🔥):

Cohere LLM API recursive output issue, Generating long reports with Cohere, Handling token limits in model outputs, API rate limit errors, Setting auto mode for generating context


Cohere ▷ #projects (2 messages):

Discord Channel Rules


tinygrad (George Hotz) ▷ #general (18 messages🔥):

Bounty for PR #8505, LLVM JIT and Autogen Discussion, Stability of LLVM API, Contributions to Tinygrad


tinygrad (George Hotz) ▷ #learn-tinygrad (4 messages):

TinyGrad Blog Overview, Initializing Layers with Device Specification, Device Options in TinyGrad

Link mentioned: TinyGrad Codebase Explained-ish: A detailed-ish explanation of TinyGrad’s repository structure and key files


Nomic.ai (GPT4All) ▷ #general (22 messages🔥):

Nvidia performance with GPT4All, Using the phi-4 model, Local server API issues, Template setup for models, Recommendations for roleplay models


LlamaIndex ▷ #blog (2 messages):

GitHub HQ Event, Agentic Document Workflows, AI Agents Debugging, Fast Inference Systems, LlamaIndex Workflows


LlamaIndex ▷ #general (18 messages🔥):

Ollama performance, Access control for applications, Vector database indexing, Local TEI server for reranking, QueryFusionRetriever token limit


Modular (Mojo 🔥) ▷ #mojo (18 messages🔥):

Rust syntax for Actor models, Overload resolution in Mojo, Quantum computing libraries in Mojo, MAX and quantum computing, Quojo library for quantum operations

Link mentioned: GitHub - Deftioon/Quojo: A Quantum Computing Machine written in Mojo: A Quantum Computing Machine written in Mojo. Contribute to Deftioon/Quojo development by creating an account on GitHub.


LLM Agents (Berkeley MOOC) ▷ #hackathon-announcements (1 message):

Hackathon Results, Judging Timeline Updates


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (6 messages):

Google Form edits, Twitter account deactivation, Form submission process, Email access issues


OpenInterpreter ▷ #general (7 messages):

OI 1.0 Python Execution, AI Improvement Observations, Model and Parameters Inquiry, Custom Instructions Insights


LAION ▷ #general (5 messages):

TruLie dataset, Image-to-3D techniques, Chirpy3D, World Models, Gaussian splats


LAION ▷ #research (1 message):

rom1504: Is there any good open tool registry for building agents ?


DSPy ▷ #general (4 messages):

Improving COT for Chatbots, Building Custom Evaluations for LLMs, Importance of Evals in AI Development, Drew Breunig's Work and Projects


AI21 Labs (Jamba) ▷ #general-chat (3 messages):

Python app with Jamba, AI code generation, PHP coding, Jamba functionality


Torchtune ▷ #general (1 message):

jovial_lynx_74856: Anyone here tried finetuning ModernBERT?


Torchtune ▷ #jobs (1 message):

Nectar Social hiring, Referral bounties, AI startup roles






{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}