Frozen AI News archive

DeepSeek-R1 claims to beat o1-preview AND will be open sourced

**DeepSeek** has released **DeepSeek-R1-Lite-Preview**, an open-source reasoning model that achieves **o1-preview-level performance** on math benchmarks with transparent thought processes, showing promise in real-time problem-solving. **NVIDIA** reported record **$35.1 billion** Q3 revenue with **112% year-on-year data center growth**, driven by the **Hopper** and **Blackwell architectures**, the latter offering a **2.2x performance improvement**. **Google DeepMind** introduced **AlphaQubit**, a quantum-computing system that improves error correction and outperforms leading decoders, though challenges remain in scaling and speed. The AI community continues to focus on advances in **reasoning models**, **benchmarking**, and **quantum error correction**.

Canonical issue URL

AI News for 11/20/2024-11/21/2024. We checked 7 subreddits, 433 Twitters and 30 Discords (217 channels, and 1837 messages) for you. Estimated reading time saved (at 200wpm): 197 minutes. You can now tag @smol_ai for AINews discussions!

Ever since o1 was introduced (our coverage here, here, and here), the race has been on for an "open" reproduction. Two months later, with honorable mentions to the Nous Forge Reasoning API and Fireworks f1, DeepSeek appears to have made the first convincing attempt that 1) posts BETTER benchmark results than o1-preview and 2) offers a publicly available demo rather than a waitlist.

(image: DeepSeek-R1-Lite-Preview announcement)

Benchmark-wise, it doesn't beat o1 across the board, but it does well on the important math benchmarks and is at least better than its peers on all but GPQA Diamond.

(image: benchmark comparison vs o1-preview and peers)

Just as importantly, they appear to have replicated the inference-time-scaling performance improvements OpenAI described, but this time with an actual x-axis:

(image: accuracy vs inference-time compute scaling plot)
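The reported curve relates benchmark accuracy to the number of "thought tokens" spent at inference time. As a rough illustration of what fitting such a scaling curve looks like, here is a minimal sketch; the data points and the log-linear form are invented for illustration and are not DeepSeek's numbers:

```python
# Hypothetical illustration of an inference-time scaling fit:
# accuracy vs log2 of "thought tokens" spent per problem.
# All data points below are invented for illustration only.
import math

thought_tokens = [512, 1024, 2048, 4096, 8192]   # hypothetical token budgets
accuracy       = [0.32, 0.41, 0.48, 0.55, 0.60]  # hypothetical pass@1 scores

# Fit accuracy ~= a + b * log2(tokens) by ordinary least squares.
xs = [math.log2(t) for t in thought_tokens]
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(accuracy) / n
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, accuracy)) / \
    sum((x - mean_x) ** 2 for x in xs)
a = mean_y - b * mean_x

# Slope b is the accuracy gained per doubling of thought tokens
# (~0.07 for this toy data).
print(f"accuracy gain per doubling of thought tokens: {b:.3f}")
```

A positive slope on a log x-axis is exactly the "spend more inference compute, get better answers" behavior the plot is claimed to show.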

As for the "R1-Lite" naming, the rumor (based on WeChat announcements) is that it is built on DeepSeek's existing V2-Lite model, which is only a 16B MoE with 2.4B active params, meaning that if they manage to scale it up, "R1-full" will be an absolute monster.
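The active-parameter arithmetic behind that rumor can be sketched as follows. The 16B-total / 2.4B-active figures come from the rumor above; the ~2 FLOPs per active parameter per token rule is a standard rough estimate, not anything DeepSeek has stated:

```python
# Back-of-envelope MoE arithmetic for the rumored V2-Lite base.
# The 16B / 2.4B figures come from the rumor above; the FLOPs rule
# of thumb is a standard rough estimate, not from DeepSeek.
total_params = 16e9    # all experts' weights, held in memory
active_params = 2.4e9  # parameters actually exercised per token

# Fraction of the model any single token passes through.
active_fraction = active_params / total_params

# Rough forward-pass cost per token: ~2 FLOPs per active parameter,
# i.e. it computes roughly like a 2.4B dense model despite storing 16B.
flops_per_token = 2 * active_params

print(f"active fraction: {active_fraction:.0%}")         # 15%
print(f"~{flops_per_token / 1e9:.1f} GFLOPs per token")  # ~4.8
```

This is the appeal of the MoE design: a scaled-up "R1-full" could grow total capacity far faster than per-token inference cost.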

One notable result is that it has done (inconsistently) well on Yann LeCun's pet 7-gear question.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

1. NVIDIA Financial Updates and Market Insights

2. DeepSeek-R1-Lite-Preview: New Reasoning Model Developments

3. Quantum Computing Progress with AlphaQubit

4. Developments in GPT-4o and AI Creative Enhancements

5. AI Implementations and Tools

6. Memes/Humor


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. DeepSeek R1-Lite matches o1-preview in math benchmarks, open source coming soon

Theme 2. Sophisticated Open Source LLM Tools: Research Assistant & Memory Frameworks

Theme 3. Hardware & Browser Optimization: Pi GPU Acceleration & WebGPU Implementations

Theme 4. Model Architectures: Analysis of GPT-4, Gemini & Other Closed Source Models

Other AI Subreddit Recap

r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, r/LLMDevs, r/Singularity

Theme 1. Live Demo Shows Real-Time AI Facial Recognition Raising Privacy Alarms

Theme 2. CogVideoX 1.5 Image-to-Video: Quality vs Performance Trade-offs

Theme 3. 10 AI Agents Collaborate to Write Novel in Real-Time

Theme 4. StepFun's 1T Param Model Rises in LiveBench Rankings


AI Discord Recap

A summary of Summaries of Summaries by o1-mini

Theme 1. Custom Model Deployments Take Center Stage

Theme 2. AI Model Performance and Optimization Soars

Theme 3. Innovative AI Research Paves New Paths

Theme 4. AI Tools Integration and Community Support Flourish

Theme 5. Cutting-Edge AI Developments Address Diverse Challenges


PART 1: High-level Discord summaries

HuggingFace Discord


Interconnects (Nathan Lambert) Discord


Unsloth AI (Daniel Han) Discord


aider (Paul Gauthier) Discord


Eleuther Discord


Perplexity AI Discord


OpenRouter (Alex Atallah) Discord


LM Studio Discord


Stability.ai (Stable Diffusion) Discord


Notebook LM Discord


Latent Space Discord


GPU MODE Discord


Nous Research AI Discord


OpenAI Discord


Cohere Discord


Torchtune Discord


tinygrad (George Hotz) Discord


Modular (Mojo 🔥) Discord


OpenAccess AI Collective (axolotl) Discord


DSPy Discord


LlamaIndex Discord


OpenInterpreter Discord


LLM Agents (Berkeley MOOC) Discord


Mozilla AI Discord


The Alignment Lab AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The LLM Finetuning (Hamel + Dan) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The LAION Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

HuggingFace ▷ #general (263 messages🔥🔥):

  • Hugging Face Discord Community
  • AI and Machine Learning Projects
  • Gradio and Streamlit Integration
  • LangChain and RAG
  • General Discussion and Support Requests

Links mentioned:


HuggingFace ▷ #today-im-learning (1 message):

richieghost: today-im-learning LangGraph


HuggingFace ▷ #cool-finds (10 messages🔥):

  • 3D Printing Designs
  • Generative Design Tools
  • Custom AI Model Deployment
  • AI Security Research
  • Automated AI Researcher

Links mentioned:


HuggingFace ▷ #i-made-this (4 messages):

  • Fractal Forest Creatures
  • AI in Music and Animation
  • Effective Prompting Techniques
  • Psychedelic Experience with Music
  • Neo's Journey to the 60s

Links mentioned:


HuggingFace ▷ #reading-group (4 messages):

  • 3080 GPU Pricing
  • VRAM Utilization
  • Channel Discussion Etiquette

HuggingFace ▷ #NLP (5 messages):

  • Semantic Search Challenges
  • Issues with Evaluate Library
  • Alternatives to Pandas

HuggingFace ▷ #diffusion-discussions (6 messages):

  • Diffusers Version Issues
  • CogVideoX1.5-5B-I2V Repo Updates
  • Colab Session Crashes
  • FP16 Model Loading
  • Oversampling and Downsampling Query

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (175 messages🔥🔥):

  • DeepSeek Prover
  • OpenAI o1 release
  • GPT-4o update
  • Model performance comparison
  • Community discussions on AI models

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-drama (6 messages):

  • Francois Fleuret mention
  • Korean LLM evaluation issues
  • Japanese LLM leaderboard

Links mentioned:


Interconnects (Nathan Lambert) ▷ #random (25 messages🔥):

  • o1 Release Speculations
  • Training Dynamics in LLMs
  • Reinforcement Learning Trends
  • Model Evaluation Bottlenecks
  • Release Fatigue and Post-Release Plans

Links mentioned:


Unsloth AI (Daniel Han) ▷ #general (176 messages🔥🔥):

  • Vision Support
  • Multi-GPU Training
  • Internship Opportunities
  • Data Quality in NLP
  • Training Llama Models

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (9 messages🔥):

  • NixOS Installation
  • Fedora KDE Experience
  • Windows-like Linux Distros
  • Checkpoint Selection in AI Training

Link mentioned: Dsa GIF - Dsa - Discover & Share GIFs: Click to view the GIF


Unsloth AI (Daniel Han) ▷ #help (10 messages🔥):

  • Fine-tuning LLMs
  • Model Export to Hugging Face
  • Pre-tokenization and Continued Pretraining
  • Inference with VLLM
  • Checkpoint Callback for Saving Models

Unsloth AI (Daniel Han) ▷ #research (1 message):

  • SageAttention2
  • Quantized Attention
  • Inference Acceleration

Links mentioned:


aider (Paul Gauthier) ▷ #general (148 messages🔥🔥):

  • Aider Setup Challenges
  • DeepSeek Performance
  • OpenRouter Concerns
  • Model Quantization Effects
  • Coding Tools Comparisons

Links mentioned:


aider (Paul Gauthier) ▷ #questions-and-tips (29 messages🔥):

  • Aider usage challenges
  • Chat modes best practices
  • Token limit concerns
  • Language support in Aider
  • Context extension mechanisms

Links mentioned:


Eleuther ▷ #announcements (1 message):

  • Linear vs Affine Representation
  • ACE Method for Control in Language Models
  • Refusal Behavior in Language Models

Links mentioned:


Eleuther ▷ #general (20 messages🔥):

  • GPGPU Performance
  • PyTorch Optimization Techniques
  • Data Loading Strategies
  • GPU Memory Management

Link mentioned: cifar10-fast/bag_of_tricks.ipynb at master · davidcpage/cifar10-fast: Contribute to davidcpage/cifar10-fast development by creating an account on GitHub.


Eleuther ▷ #research (125 messages🔥🔥):

  • Latent Actions and Inverse Dynamics Models
  • nGPT Baseline Bugs
  • Use of Position Embeddings
  • Document Masking Impact on Training
  • Forgetting Transformer

Links mentioned:


Eleuther ▷ #scaling-laws (2 messages):

  • Scaling Laws in Language Models
  • Evaluation Science Advocacy
  • Marius Hobbhahn

Link mentioned: Observational Scaling Laws and the Predictability of Language Model Performance: Understanding how language model performance varies with scale is critical to benchmark and algorithm development. Scaling laws are one approach to building this understanding, but the requirement of ...


Eleuther ▷ #lm-thunderdome (15 messages🔥):

  • Zero-shot benchmarking for pruned models
  • WANDA pruning method
  • lm_eval library compatibility
  • Model evaluation on ADVBench
  • vllm model usage

Link mentioned: GitHub - locuslab/wanda: A simple and effective LLM pruning approach.: A simple and effective LLM pruning approach. Contribute to locuslab/wanda development by creating an account on GitHub.


Perplexity AI ▷ #general (132 messages🔥🔥):

  • Perplexity vs. ChatGPT
  • Referral Code Usage
  • Perplexity Shopping Feature
  • API Functionality
  • Image Creation on iOS

Links mentioned:


Perplexity AI ▷ #sharing (9 messages🔥):

  • Web App Fullstack with Next.js
  • Chicken or Egg Paradox Solved
  • Michelin Star Cities
  • NVIDIA Chips Overheat
  • Stock Monitoring for Qubit

Perplexity AI ▷ #pplx-api (1 message):

  • Perplexity API
  • Domain Filtering

OpenRouter (Alex Atallah) ▷ #general (120 messages🔥🔥):

  • Gemini 1114 performance
  • DeepSeek updates
  • Prompt caching
  • GPT-4o model issues
  • RP model comparisons

Links mentioned:


OpenRouter (Alex Atallah) ▷ #beta-feedback (6 messages):

  • Custom provider keys
  • Key integration access
  • Anthropic Claude 3.5 Sonnet
  • x-ai/grok-beta
  • xai

LM Studio ▷ #general (58 messages🔥🔥):

  • Model Loading Issues
  • System Requirements for Models
  • Optimizing Performance with Limited Hardware
  • Exploring Cloud-Based Solutions
  • Model Recommendations and Preferences

Links mentioned:


LM Studio ▷ #hardware-discussion (64 messages🔥🔥):

  • VM Performance with Qwen Models
  • Hardware Requirements for DeepSeek v2.5 Lite
  • Workstation Design for LLMs
  • GPU Selection for AI Workloads
  • Fine-tuning vs. Running Models

Link mentioned: GitHub - XiongjieDai/GPU-Benchmarks-on-LLM-Inference: Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?: Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference? - XiongjieDai/GPU-Benchmarks-on-LLM-Inference


Stability.ai (Stable Diffusion) ▷ #general-chat (102 messages🔥🔥):

  • Gaming PC Recommendations
  • Consistent Character Creation
  • AI Models for Substance Designer
  • GPU Utilization for Video Generation
  • Drawing AI Demonstrations

Links mentioned:


Notebook LM Discord ▷ #use-cases (17 messages🔥):

  • Audio Generation in NotebookLM
  • External Access to Notebooks
  • Podcast Creation
  • Transcription Features
  • Customization Recommendations

Links mentioned:


Notebook LM Discord ▷ #general (35 messages🔥):

  • Combining Notes Feature
  • Reliability of Uploaded Sources
  • Sharing Notebooks
  • Deep Dive Document Generation
  • Limitations on Uploading Large Files

Latent Space ▷ #ai-general-chat (48 messages🔥):

  • DeepSeek-R1-Lite-Preview
  • GPT-4o Update
  • Truffles Hardware Device
  • Vercel Acquires Grep
  • Claude Availability Issues

Links mentioned:


GPU MODE ▷ #triton (1 message):

cappuchinoraro: thanks


GPU MODE ▷ #beginner (5 messages):

  • Triton Tutorial Performance
  • GPU Comparisons
  • Softmax Kernel Profiling

Link mentioned: Fused Softmax — Triton documentation: no description found


GPU MODE ▷ #torchao (3 messages):

  • Readme Updates
  • Torchchat and Torchtune Linkage

Link mentioned: Update README.md by drisspg · Pull Request #1319 · pytorch/ao: no description found


GPU MODE ▷ #off-topic (2 messages):

  • Ticket Price Changes
  • Buying Tickets Early

GPU MODE ▷ #webgpu (11 messages🔥):

  • Metal GEMM Implementations
  • WebGPU and Metal Compatibility
  • Register Optimization Techniques
  • Performance Regressions in Dawn
  • AGX Machine Code Disassembly Tools

Link mentioned: Chromium: no description found


GPU MODE ▷ #liger-kernel (17 messages🔥):

  • Debugging Assistance
  • CUDA Device Mapping
  • Model Distribution across GPUs
  • Tensor Parallelism
  • Hugging Face Sharding Strategy

GPU MODE ▷ #self-promotion (4 messages):

  • FLUX inference optimization
  • CPU offloading techniques
  • GPU performance on different machines

Link mentioned: Tweet from Thien Tran (@gaunernst): Speed up FLUX CPU offloading by 200%. On 4070Ti SUPER (16GB) baseline (.enable_sequential_cpu_offload()): 3.72 s/it + pin memory: 2.09 s/it (+78%) + CUDA stream (explicit synchronization): 1.32 s/it ...


Nous Research AI ▷ #general (27 messages🔥):

  • DeepSeek-R1-Lite-Preview
  • AI agents for writing books
  • LLM knowledge evaluation

Links mentioned:


Nous Research AI ▷ #ask-about-llms (8 messages🔥):

  • Learning Rate Scheduling
  • Warmup and Decay Strategies
  • Test Time Scaling for LLMs
  • Cyclic Learning Rate Schedulers

Nous Research AI ▷ #research-papers (2 messages):

  • LLMs Reasoning Abilities
  • Generative Agent Simulations

Links mentioned:


Nous Research AI ▷ #interesting-links (3 messages):

  • Soft Prompts
  • LLM Optimization

Link mentioned: @saganite.bsky.social: Really trying to figure out why "soft prompts" aren't used more often with LLMs. For those who aren't familiar, soft prompts are system prompts that have been converted to embedding ...



OpenAI ▷ #ai-discussions (18 messages🔥):

  • Daily Theme Winner
  • API Usage Discussion
  • Model Options and Performance

OpenAI ▷ #gpt-4-discussions (3 messages):

  • High Temperature Performance
  • Beta Access to o1
  • Gaming Character Genshin Impact

OpenAI ▷ #prompt-engineering (8 messages🔥):

  • Using Delimiters in Prompts
  • Markdown for Clarity
  • Game Mechanics Understanding
  • Model Context Expectations

OpenAI ▷ #api-discussions (8 messages🔥):

  • Using Delimiters for Clarity
  • Markdown Formatting
  • Improving GPT's Understanding
  • Game Mechanics in GPT
  • Model Context and Labeling

Cohere ▷ #discussions (12 messages🔥):

  • API Key Issues
  • CORS Errors
  • Python Learning Projects

Links mentioned:


Cohere ▷ #questions (6 messages):

  • Account-based settings
  • Model training prompts
  • Bulgarian language datasets
  • Model tuning techniques
  • Contributing processes

Cohere ▷ #api-discussions (4 messages):

  • RAG chatbot issues
  • Cohere multi-modal embeddings
  • Rate limiting problems

Cohere ▷ #projects (4 messages):

  • Harmony Open-Source Project
  • Competition for LLM Matching Algorithms
  • Data Availability for Harmony
  • Natural Language Processing in Harmony
  • Discord Community for Harmony

Links mentioned:


Torchtune ▷ #general (7 messages):

  • Post-softmax Scores with sdpa/flex
  • Attention Score Calculation
  • Flex Attention Updates
  • Performance Benchmarking sdpa

Link mentioned: pytorch/torch/nn/attention/flex_attention.py at release/2.5 · pytorch/pytorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration - pytorch/pytorch


Torchtune ▷ #dev (14 messages🔥):

  • Adaptive Batching Implementation
  • Improving DPO Loss Function
  • Standard vs. New Research Approaches
  • Server Boosts and Nitro Subscription
  • Code Structure and Modularity Concerns

Link mentioned: Add RPO, DPOP losses, add lambda_dpop to basic DPO loss by krammnic · Pull Request #2035 · pytorch/torchtune: Context What is the purpose of this PR? Is it to add a new feature fix a bug update tests and/or documentation other (please add here) Please link to any issues this PR addresses. Changelog W...


Torchtune ▷ #papers (2 messages):

  • SageAttention
  • Inference Gains

Link mentioned: GitHub - thu-ml/SageAttention: Quantized Attention that achieves speedups of 2.1x and 2.7x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.: Quantized Attention that achieves speedups of 2.1x and 2.7x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models. - thu-ml/SageAttention


tinygrad (George Hotz) ▷ #general (6 messages):

  • Tinygrad and Triton Integration
  • SASS Assembler Questions
  • FOSDEM AI DevRoom Presentation
  • Tinybox Hackathon Proposal

Link mentioned: FOSDEM 2025 - Low-Level AI Engineering & Hacking Dev Room: Explore the new "Low-Level AI Hacking & Engineering" Dev Room at FOSDEM, featuring open-source projects powering the AI industry. Submit a session or become a sponsor for this innovative...


tinygrad (George Hotz) ▷ #learn-tinygrad (1 message):

  • int64 indexing
  • huge tensors

Modular (Mojo 🔥) ▷ #mojo (2 messages):

  • Async functions in Mojo
  • Mojo library repository

Modular (Mojo 🔥) ▷ #max (5 messages):

  • Moonshine ASR Model Performance
  • Mojo Program Observations
  • Max API vs ONNX Performance

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general (2 messages):

  • Tencent Hunyuan Model
  • Bits and Bytes on MI300X

Link mentioned: tencent/Tencent-Hunyuan-Large · Hugging Face: no description found


OpenAccess AI Collective (axolotl) ▷ #community-showcase (1 message):

volko76: Do we still need to prompt correctly?
https://youtu.be/m3Izr0wNfQc


OpenAccess AI Collective (axolotl) ▷ #axolotl-help-bot (4 messages):

  • Axolotl Collab Notebooks
  • Continual Pretraining of LLaMA

Link mentioned: OpenAccess-AI-Collective/axolotl | Phorm AI Code Search: Understand code, faster.


DSPy ▷ #general (5 messages):

  • multimodal problems
  • vision language models
  • mmmu notebook

DSPy ▷ #examples (1 messages):

  • Semantic Router
  • Classification Tasks

Link mentioned: GitHub - aurelio-labs/semantic-router: Superfast AI decision making and intelligent processing of multi-modal data.: Superfast AI decision making and intelligent processing of multi-modal data. - aurelio-labs/semantic-router


LlamaIndex ▷ #blog (2 messages):

  • LLM-Native Resume Matching
  • Building AI Agents with LlamaIndex
  • Webinar on December 12

LlamaIndex ▷ #general (2 messages):

  • Extracting table data from PDFs
  • Applications for PDF data extraction

OpenInterpreter ▷ #general (4 messages):

  • New UI Feedback
  • Rate Limit Issues
  • Interpreter Design
  • Future UI Configurations

LLM Agents (Berkeley MOOC) ▷ #hackathon-announcements (1 message):

  • Intel AMA
  • Hackathon Insights

LLM Agents (Berkeley MOOC) ▷ #mooc-questions (2 messages):

  • Registration Issues
  • Hackathon vs MOOC Registration

Mozilla AI ▷ #announcements (1 message):

  • Refact.AI
  • Autonomous Agents
  • Live Demo
  • Tooling






{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}