Frozen AI News archive

not much happened today

**Sam Altman** publicly criticizes **DeepSeek** and **Qwen** models, sparking debate about **OpenAI**'s innovation claims and its reliance on foundational research such as the **Transformer architecture**. **DeepSeek V3** shows significant overfitting in the **Misguided Attention** evaluation, solving only **22%** of test prompts, which raises concerns about its reasoning and finetuning. Despite skepticism about its open-source status, **DeepSeek V3** is claimed to surpass **ChatGPT4** as an open-source model, a milestone coming about 1.75 years after ChatGPT4's release on **March 14, 2023**. The discussions highlight competitive dynamics in AI model performance and the sustainability of innovation.

AI News for 12/27/2024-12/30/2024. We checked 7 subreddits, 433 Twitters and 32 Discords (215 channels and 5832 messages) for you. Estimated reading time saved (at 200wpm): 696 minutes. You can now tag @smol_ai for AINews discussions!
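For a sense of scale, the reading-time figure above can be turned back into a rough word count. This is a minimal sketch that assumes the estimate is simply total words divided by reading speed; the issue does not say how the number is actually computed, and the function names are hypothetical.

```python
# Figures quoted in the issue header.
WORDS_PER_MINUTE = 200   # stated reading speed
MINUTES_SAVED = 696      # stated reading time saved


def words_covered(minutes: int, wpm: int) -> int:
    """Approximate word count implied by a reading time.

    Assumption: reading time = words / wpm, so words = minutes * wpm.
    """
    return minutes * wpm


def as_hours_and_minutes(minutes: int) -> tuple[int, int]:
    """Split a minute count into whole hours and leftover minutes."""
    return divmod(minutes, 60)


total_words = words_covered(MINUTES_SAVED, WORDS_PER_MINUTE)
hours, leftover = as_hours_and_minutes(MINUTES_SAVED)

print(f"~{total_words:,} words, about {hours}h {leftover}m of reading")
```

Under that assumption, the issue covers roughly 139,200 words of source material, or about 11 hours 36 minutes of reading.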

Enjoy the break.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

TO BE COMPLETED


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Deepseek's V3: Performance and Critique

Theme 2. Cerebras's Trillion Parameter Training on CS-3

Theme 3. Affordable Local AI: Performance on Budget GPUs

Theme 4. SmallThinker-3B: Efficient Reasoning in Small Scale Models

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT

Theme 1. OpenAI's O1 Offers Significant Advantage in Math and Education

Theme 2. MAMBA Model's Struggle Against Transformer Dominance

Theme 3. OpenAI's AGI Definition and Economic Metrics

Theme 4. AI's Role in Gaming and Social Media


AI Discord Recap

A summary of Summaries of Summaries by o1-2024-12-17

Theme 1. AI Models Fight for Coding Supremacy

Theme 2. Fine-Tuning & LoRA Legwork

Theme 3. Quantization & HPC Performance

Theme 4. RAG, Embeddings & Agent Workflows

Theme 5. APIs, Pricing & Prompt Engineering


PART 1: High level Discord summaries

Codeium (Windsurf) Discord


Unsloth AI (Daniel Han) Discord


Cursor IDE Discord


Stackblitz (Bolt.new) Discord


aider (Paul Gauthier) Discord


Eleuther Discord


OpenRouter (Alex Atallah) Discord


Nous Research AI Discord


Perplexity AI Discord


OpenAI Discord


Notebook LM Discord


Stability.ai (Stable Diffusion) Discord


Modular (Mojo 🔥) Discord


LM Studio Discord


GPU MODE Discord


Latent Space Discord


Interconnects (Nathan Lambert) Discord


Nomic.ai (GPT4All) Discord


Cohere Discord


tinygrad (George Hotz) Discord


LlamaIndex Discord


LLM Agents (Berkeley MOOC) Discord


Torchtune Discord


OpenInterpreter Discord


DSPy Discord


LAION Discord


Gorilla LLM (Berkeley Function Calling) Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Axolotl AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The HuggingFace Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Codeium (Windsurf) ▷ #announcements (1 message):

Codeium 2024 Wrapped, Upcoming features

Link mentioned: Codeium Wrapped 2024 | Windsurf Editor and Codeium extensions: Check out your top languages, how much time you spent coding, your coding patterns and much more in Codeium 2024 Wrapped!


Codeium (Windsurf) ▷ #discussion (194 messages🔥🔥):

Windsurf performance issues, User login problems, Codeium pricing frustrations, Alternative IDEs, Error messages in Codeium


Codeium (Windsurf) ▷ #windsurf (633 messages🔥🔥🔥):

Windsurf service outages, DeepSeek V3 integration, Context length issues in Windsurf, User experiences with AI code suggestions, SVG loading issues in React Native


Unsloth AI (Daniel Han) ▷ #general (705 messages🔥🔥🔥):

Fine-tuning LLM Models, Role of Tokens in Training, Open Source and Model Sharing, Quantization Issues with LLMs, Hymba Model Overview


Unsloth AI (Daniel Han) ▷ #off-topic (8 messages🔥):

WSL Ubuntu setup, Community Gratitude, Computer Vision Projects, New Year Wishes, Server Appreciation


Unsloth AI (Daniel Han) ▷ #help (171 messages🔥🔥):

LoRA and its applications, Fine-tuning large language models, Challenges in language translation, Understanding model performance and training datasets, Learning resources for AI and LLMs


Unsloth AI (Daniel Han) ▷ #showcase (7 messages):

Light Prompter, Test Time Training, Weights Updating, RL Techniques, VLLM Notebooks

Link mentioned: GitHub - Green0-0/light_prompter: Accelerate test-time-compute with batching!: Accelerate test-time-compute with batching! Contribute to Green0-0/light_prompter development by creating an account on GitHub.


Cursor IDE ▷ #general (637 messages🔥🔥🔥):

Cursor IDE issues, Deepseek API usage, Chat vs Composer, Web app development, Payment methods for Cursor


Stackblitz (Bolt.new) ▷ #announcements (1 message):

Grok AI API promotion

Link mentioned: Tweet from StackBlitz (@stackblitz): Build #GrokAI into your Bolt app! If you haven't tried it yet, today & tomorrow are THE time for it: before the year ends, every x․ai API user still gets $25 of free credits!


Stackblitz (Bolt.new) ▷ #prompting (20 messages🔥):

Bolt code update issues, Voice prompting feature request, Token wastage concerns


Stackblitz (Bolt.new) ▷ #discussions (460 messages🔥🔥🔥):

Token Consumption, Error Handling in Bolt, Using Bolt for App Development, Firebase vs Supabase, Project Management in Bolt


aider (Paul Gauthier) ▷ #general (380 messages🔥🔥):

DeepSeek V3 Performance, Aider Usage and Context Management, Gemini Models Insights, OpenRouter Integration Issues, OCR Implementation in Web Apps


aider (Paul Gauthier) ▷ #questions-and-tips (76 messages🔥🔥):

DeepSeek V3 usage, Aider installation and configuration, Token limits with models, Git sparse-checkout compatibility, Shell command execution in Aider


Eleuther ▷ #general (31 messages🔥):

Logit equalities in HF models, Dynamic test-time temperature in LLMs, BF16 training and gradient scaling, Lipschitz-1 RMSNorm replacement


Eleuther ▷ #research (219 messages🔥🔥):

LLM Benchmarking Challenges, Gradient Routing for Neural Networks, TongGeometry for Geometry Theorem Discovery, Crosscoders for Feature Analysis, Superficial Alignment Hypothesis


Eleuther ▷ #interpretability-general (9 messages🔥):

Neural Networks as Polycomputers, TinyStories Dataset, Small Transformers, Catastrophic Interference Solutions


Eleuther ▷ #lm-thunderdome (12 messages🔥):

Scrolls benchmark issues, GSM8K strict exact match clarification, mgsm_chat troubleshooting, ZeroSCROLLS vs SCROLLS evaluation, lm_eval command usage

Link mentioned: lm-evaluation-harness/lm_eval/tasks/gsm8k.py at b281b0921b636bc36ad05c0b0b0763bd6dd43463 · EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of language models. - EleutherAI/lm-evaluation-harness


OpenRouter (Alex Atallah) ▷ #general (249 messages🔥🔥):

DeepSeek V3 performance issues, OpenRouter model integration, Translation model recommendations, Building multimodal agents, LLM pricing and feature comparisons


Nous Research AI ▷ #general (74 messages🔥🔥):

DeepSeek V3 Performance, Local AI vs. API Usage, Hunyuan Video Model Limitations, SmallThinker Model Overview, LLM Development Opportunities


Nous Research AI ▷ #ask-about-llms (149 messages🔥🔥):

DeepSeek V3 performance issues, Weird behaviors in LLaMaCPP, Anthropic's reasoning models, Understanding LLaMa 3.3, High bandwidth vs home user solutions


Nous Research AI ▷ #research-papers (6 messages):

Sklearn Results Reporting, Binary Classification Metrics, Test Set Evaluation, Model Performance Trust, AUC/ROC Scores


Nous Research AI ▷ #research-papers (6 messages):

Reporting sklearn results, Metrics trustworthiness in classification, Binary classification metrics


Perplexity AI ▷ #general (203 messages🔥🔥):

Perplexity Pro Subscription, Deepseek v3 Availability, Reasoning Mode Functionality, Grant Proposal Assistance, Pro Reasoning and Search Enhancements


Perplexity AI ▷ #sharing (20 messages🔥):

Meditation Techniques, Human Brain Speed, Neurosurgery After PG in ENT, HIV Drug Breakthrough, Cold Bath Benefits


Perplexity AI ▷ #pplx-api (7 messages):

Search API Alternatives, Custom Recency Time Feature, Citations Limit, API Credit Refunds, Conversational Use of API


OpenAI ▷ #ai-discussions (96 messages🔥🔥):

Image Generation Quality, AI in Coding, Gemini 2.0 Performance, Self-Employment and AI Usage, Token Limits in Content Creation


OpenAI ▷ #gpt-4-discussions (11 messages🔥):

GPT Agents Potential, GPT-2 Maximum Token Generation, Interactive App Button Features, Script Enhancement with AI Assistance

Link mentioned: Discord - Group Chat That’s All Fun & Games: Discord is great for playing games and chilling with friends, or even building a worldwide community. Customize your own space to talk, play, and hang out.


OpenAI ▷ #prompt-engineering (51 messages🔥):

Sora Prompt Engineering, ChatGPT Prompting Techniques, Markdown Usage Guidelines, Course Interest in Prompt Engineering, Channel Purpose and Organization


OpenAI ▷ #api-discussions (51 messages🔥):

Sora Prompt Engineering, Prompt Engineering Courses, Markdown Use in Channels, User Engagement on Discord, ChatGPT Interaction Dynamics


Notebook LM Discord ▷ #use-cases (29 messages🔥):

NotebookLM audio usage, Embedding interactive features, Interactive mode suggestions, Handling sensitive content, YouTube video sharing


Notebook LM Discord ▷ #general (156 messages🔥🔥):

NotebookLM Plus Features, Podcast Generation Issues, Source Management Challenges, User Feedback on AI Responses, Limitations on Notebook Usage


Stability.ai (Stable Diffusion) ▷ #general-chat (146 messages🔥🔥):

M2 Max MacBook Pro for AI, Depth Maps and Banding Issues, Using Loras for Consistency, AI Video Generation Tools, Stable Diffusion Discord Community


Modular (Mojo 🔥) ▷ #mojo (138 messages🔥🔥):

Mojo Static Methods, Recursive Structs in Mojo, Performance Optimization Techniques, Memory Management of Pointers, Using ArcPointer for Self-Referential Structures


LM Studio ▷ #general (85 messages🔥🔥):

Model Performance Improvements, Vision Models and Censorship, Custom Config Implementation, Prompt Template Issues, Local Network Serving


LM Studio ▷ #hardware-discussion (24 messages🔥):

3090 NV-Link setups, Noise levels of blower GPUs, Water cooling solutions, PCIe riser issues, Jetson Orin Nano performance

Link mentioned: How Fast Does the Jetson Nano Really Run Large Language Models?: Can your Jetson Orin Nano handle the latest LLMs? We test a range of whooping models to see how fast they run.


GPU MODE ▷ #general (3 messages):

CUDA Programming, Overlap Data Transfer, CUDA Projects

Link mentioned: How to Overlap Data Transfers in CUDA C/C++ | NVIDIA Technical Blog: In our last CUDA C/C++ post we discussed how to transfer data efficiently between the host and device. In this post, we discuss how to overlap data transfers with computation on the host…


GPU MODE ▷ #triton (19 messages🔥):

Triton Installation Issues, Cross Entropy Implementations, Softmax Kernel Optimization, SpMM Kernel in Triton


GPU MODE ▷ #cuda (14 messages🔥):

TMA vs cp.async, Vectorized Load Benefits, GEMM Tutorial Series, CUDA Kernel Efficiency, Input/Output Precision in CUTLASS


GPU MODE ▷ #torch (4 messages):

Guard Performance Optimization, Debugging Slow Code


GPU MODE ▷ #algorithms (4 messages):

Power-of-2 Quantization, MAGVIT-v2 Binary Quantization, Non-Uniform Quantization Levels, ViT Model Quantization Issues


GPU MODE ▷ #jobs (1 message):

Cracked Tech Jobs, CUDA Engineer Role, Remote LLM Infrastructure Positions, Triton Kernel Development Roles

Link mentioned: Cracked Engineers: Hire the best ai and software engineers for your startup.


GPU MODE ▷ #beginner (26 messages🔥):

Deep Learning on Linux vs Windows, Resources for Triton, NVIDIA dGPU Management on Ubuntu, Switching to Arch Linux, Success Stories with CUDA


GPU MODE ▷ #youtube-recordings (2 messages):

Scaffolding Code for Lecture 20, Scan Algorithm


GPU MODE ▷ #off-topic (1 message):

iron_bound: https://www.youtube.com/watch?v=VpAZPPCLCUI


GPU MODE ▷ #bitnet (1 message):

Ladder Branch Feature


GPU MODE ▷ #thunderkittens (5 messages):

Integer Matmul Operators in TK, TK vs Triton Performance Comparison, Triton Optimizer Capabilities


GPU MODE ▷ #edge (4 messages):

Raspberry Pi 5 GPU Performance, AI Project Testing on Raspberry Pi 5, Vulkan GPU Experience


Latent Space ▷ #ai-general-chat (58 messages🔥🔥):

AI-generated Code Challenges, Kagi Assistant vs. Perplexity, LLMs in Software Development, AI Engineering Summit, Cursor AI Programming Tools


Latent Space ▷ #ai-in-action-club (1 message):

swyxio: https://news.ycombinator.com/item?id=42343692


Interconnects (Nathan Lambert) ▷ #news (4 messages):

Chatbot Arena updates, Claude's performance

Link mentioned: Tweet from lmarena.ai (formerly lmsys.org) (@lmarena_ai): Exciting News from Chatbot Arena ❤️‍🔥 @OpenAI's o1 rises to joint #1 (+24 points from o1-preview) and @deepseek_ai DeepSeek-V3 secures #7, now the best and the only open model in the top-10! o1 High...


Interconnects (Nathan Lambert) ▷ #ml-questions (6 messages):

Small Language Models (SLMs), The Bitter Lesson, Scaling Models


Interconnects (Nathan Lambert) ▷ #ml-drama (3 messages):

OAI employee hack, Crypto shilling, Holiday greetings


Interconnects (Nathan Lambert) ▷ #random (23 messages🔥):

DeepSeek V3 performance, Benchmarking instruction following tasks, Evaluation of model training, Interconnects market discussion, Scaling confusion in AI

Link mentioned: Tweet from Aidan McLau (@aidan_mclau): you should basically pretend that getting a model to think for longer is the same as building a bigger model. following the math is quite fun and uncovers some neat things about industry progress


Interconnects (Nathan Lambert) ▷ #nlp (6 messages):

Reading Research Papers, List Growth, RLHF Experiments


Interconnects (Nathan Lambert) ▷ #rl (2 messages):

Outcome rewards, RLVR


Interconnects (Nathan Lambert) ▷ #rlhf (9 messages🔥):

GRPO, Vineppo, Memory Constraints in RL, Optimizers in RLHF


Interconnects (Nathan Lambert) ▷ #reads (6 messages):

Gary Marcus's Collaboration, AI Predictions for 2027, Discussion on AI Development Timelines

Link mentioned: Where will AI be at the end of 2027? A bet: We, Gary Marcus, author, scientist, and noted skeptic of generative AI, and Miles Brundage, an independent AI policy researcher who recently left OpenAI and is bullish on AI progress, have agreed to t...


Nomic.ai (GPT4All) ▷ #general (44 messages🔥):

API Integration with GPT4All, Updates on Nomic Models, Issues with Chat Templates, Gemini Model Support, Exploration of Vision Models


Cohere ▷ #discussions (14 messages🔥):

breathe.ai testing, finding likeminded people, HMM tokenization, internship request


Cohere ▷ #questions (5 messages):

API Rate Limits, HMM Tokenization

Link mentioned: API Keys and Rate Limits — Cohere: This page describes Cohere API rate limits for production and evaluation keys.


Cohere ▷ #api-discussions (12 messages🔥):

Image Embed Rate Limits, Fine-tuning Issues, Support Response Times


tinygrad (George Hotz) ▷ #general (16 messages🔥):

Speedup in Matching Functions, Model Rewrite Time Improvement, Meeting Discussion Points, Reversible Transformation in UOPs, Merge AM Driver Plans

Link mentioned: Happy New Year! Let's get AM merged · tinygrad/tinygrad@0addbad: no description found


tinygrad (George Hotz) ▷ #learn-tinygrad (12 messages🔥):

Tinygrad Performance vs Torch, Understanding JIT Execution, Frame Evaluation Hook API

Link mentioned: PEP 523 – Adding a frame evaluation API to CPython | peps.python.org: This PEP proposes to expand CPython’s C API to allow for the specification of a per-interpreter function pointer to handle the evaluation of frames. This proposal also suggests adding a new field ...


LlamaIndex ▷ #blog (2 messages):

Local RAG with Llama-3.2, Neomagus for legal verification


LlamaIndex ▷ #general (18 messages🔥):

Llama 3.3 GPU Memory Requirements, RAG Solution Development, Ollama Local Model Running, LlamaParse API Details, Open Source AI Monetization


LlamaIndex ▷ #ai-discussion (1 message):

Filtering Nonword Sounds, Audio Editing with LLMs


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (14 messages🔥):

Certificates Distribution, Upcoming LLM Agents MOOC, Access to Course Lectures


Torchtune ▷ #general (4 messages):

Dynamo Errors, Nested Compiles, OpenAI's Simple Eval Library, Flex Changes in 2.6.0, lm eval comparison

Link mentioned: GitHub - openai/simple-evals: Contribute to openai/simple-evals development by creating an account on GitHub.


Torchtune ▷ #papers (5 messages):

FP8 quantization schemes, NVIDIA's Transformer Engine, Azure's Mixed Precision Library, FP8 block quantization, Mixed-precision training


OpenInterpreter ▷ #general (7 messages):

OS Mode Inputs, Isolation Function Clarification, Windows Build for Version 1.0, Profiles.yaml vs .py Files, Custom API Base URLs


DSPy ▷ #papers (1 message):

ari9596: Anyone have opinions on this https://arxiv.org/abs/2412.15563


DSPy ▷ #general (3 messages):

AI Glossary Creation, Exploring DSPy and Openhands Integration, Feedback Recording System for Code Changes

Link mentioned: Generating a Glossary from a Jekyll Blog Using DSPy & Claude: Asking LLMs to take the first pass at an AI glossary for my site.


LAION ▷ #general (4 messages):

FFmpeg usage, Hackathon and Conference Recommendations


Gorilla LLM (Berkeley Function Calling) ▷ #leaderboard (1 message):

Leaderboard Techniques, API Endpoint Exceptions, Zero-shot Evaluation






{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}