
DeepSeek #1 on US App Store, Nvidia stock tanks -17%

**DeepSeek** has made a significant cultural impact by unexpectedly hitting mainstream news in 2025. The **DeepSeek-R1** model features a massive **671B parameter MoE architecture** and demonstrates **chain-of-thought (CoT)** capabilities comparable to **OpenAI's o1** at a lower cost. The **DeepSeek V3** model trains a **236B parameter model 42% faster** than its predecessor using **fp8 precision**. The **Qwen2.5** multimodal models support images and videos, range in size from **3B to 72B parameters**, and feature strong vision and agentic capabilities. **LangChain** and **LangGraph** integrations enable AI chatbots with memory and tool use, including applications like the **DeFi Agent**. Discussions highlight **NVIDIA's** role in hardware acceleration, along with concerns about its stock drop driven by **DeepSeek's** efficiency and broader market fears. Compute demand is expected to rise despite efficiency gains, driven by inference scaling and MoE design improvements.


AI News for 1/24/2025-1/27/2025. We checked 7 subreddits, 433 Twitters and 34 Discords (225 channels, and 11316 messages) for you. Estimated reading time saved (at 200wpm): 1229 minutes. You can now tag @smol_ai for AINews discussions!
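For a sense of scale, the reading-time figure above can be inverted to estimate how much text was scanned. This is a minimal sketch, assuming "time saved" is simply total words divided by reading speed; the function names are ours, not part of the AINews pipeline.

```python
# Assumption: "reading time saved" = total words scanned / reading speed.
# The helper names below are hypothetical, chosen for illustration only.
WORDS_PER_MINUTE = 200  # reading speed stated in the issue header


def reading_minutes(word_count: int, wpm: int = WORDS_PER_MINUTE) -> float:
    """Minutes needed to read `word_count` words at `wpm` words per minute."""
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return word_count / wpm


def implied_word_count(minutes: float, wpm: int = WORDS_PER_MINUTE) -> int:
    """Word count implied by a reading time at `wpm` words per minute."""
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return round(minutes * wpm)


# 1229 minutes at 200 wpm implies roughly 245,800 words of source material.
implied_words = implied_word_count(1229)
hours_saved = 1229 / 60  # a little over 20 hours
```

Under that assumption, the 11316 Discord messages plus the Reddit and Twitter sources add up to roughly a quarter of a million words for this issue.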

We really try to keep news reporting technical here, but on rare occasions, mainstream/nontechnical news is so significant that it gets through.

This is one of those days.

/r/LocalLlama:

image.png

and sama:

image.png

Ultimately, much of the discussion is unhelpful and looks like some version of this:

image.png

and we are mostly reporting on the cultural moment of DeepSeek hitting mainstream news, which was never on our bingo card for 2025.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

AI Model Releases and Enhancements

Compute and Hardware

AI Competition and Market Reactions

AI Applications and Use Cases

Technical Discussions and Innovations

AI Business and Market Reactions


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. DeepSeek is #1 on U.S. App Store: Market Implications

Theme 2. How DeepSeek Reduces Costs by 95-97%

Theme 3. New Tool for Local LLM Compatibility: 'Can You Run It?'

Theme 4. Qwen 3.0 MOE: Emerging Reasoning Model

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT

Theme 1. Nvidia Stock Volatility: Impact of DeepSeek's Efficient Model

Theme 2. DeepSeek R1's Coding Efficiency vs OpenAI O3

Theme 3. Debates on DeepSeek vs ChatGPT: A Censorship Perspective


AI Discord Recap

A summary of Summaries of Summaries by o1-preview-2024-09-12

Theme 1. DeepSeek R1 Models Upending the AI Landscape

Theme 2. Qwen2.5 Models Breaking Context Barriers

Theme 3. AI Tools Advance, Integrating Into Developer Workflows

Theme 4. OpenRouter Expands with New Models and Providers

Theme 5. Global AI Policies and Investments Heating Up the Competition


PART 1: High level Discord summaries

Unsloth AI (Daniel Han) Discord


Cursor IDE Discord


Codeium (Windsurf) Discord


Perplexity AI Discord


Nous Research AI Discord


OpenAI Discord


aider (Paul Gauthier) Discord


LM Studio Discord


OpenRouter (Alex Atallah) Discord


Yannick Kilcher Discord


Interconnects (Nathan Lambert) Discord


Latent Space Discord


Eleuther Discord


Stackblitz (Bolt.new) Discord


MCP (Glama) Discord


Notebook LM Discord Discord


Stability.ai (Stable Diffusion) Discord


GPU MODE Discord


Modular (Mojo 🔥) Discord


LlamaIndex Discord


Cohere Discord


Nomic.ai (GPT4All) Discord


LLM Agents (Berkeley MOOC) Discord


tinygrad (George Hotz) Discord


Torchtune Discord


OpenInterpreter Discord


LAION Discord


DSPy Discord


Axolotl AI Discord


Gorilla LLM (Berkeley Function Calling) Discord


MLOps @Chipro Discord


Mozilla AI Discord


The HuggingFace Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Unsloth AI (Daniel Han) ▷ #general (1123 messages🔥🔥🔥):

Model Fine-Tuning, Dynamic Quantization, Hardware Requirements, Agentic AI, Training Datasets

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (20 messages🔥):

NLP Course Completion, SmoLlm Fine-Tuning, Ollama Temperature Settings, AI Generated Text Detection, DeepSeek R1 vs OpenAI O1

Link mentioned: DeepSeek R1 trimmed to 1.58bit 131 GB with unclothe #ai: DeepSeek-R1 has been making waves recently by rivaling OpenAI's O1 reasoning model while being fully open-source. We explored how to enable more local users ...


Unsloth AI (Daniel Han) ▷ #help (371 messages🔥🔥):

Unsloth Errors, Dataset Formatting for Fine-Tuning, Text Completion and Chatbot Datasets, DeepSeek R1 Deployment, Model Deployment on Different Hardware

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (61 messages🔥🔥):

Model Training Techniques, AugmenToolKit Usage, Code Vulnerability Review, Loss Graph Interpretation

Links mentioned:


Unsloth AI (Daniel Han) ▷ #research (49 messages🔥):

Fine-tuning LLMs, LLaMA 4 Expectations, Reinforcement Learning Enhancements, Vector Database and Quantization, DeepSeek and Reasoning Models

Link mentioned: GitHub - SalesforceAIResearch/perfcodegen: Contribute to SalesforceAIResearch/perfcodegen development by creating an account on GitHub.


Cursor IDE ▷ #general (762 messages🔥🔥🔥):

Cursor IDE Performance, Comparison of DeepSeek and Claude, Codebase Indexing, RAG Implementation, User Experience with Models

Links mentioned:


Codeium (Windsurf) ▷ #announcements (1 messages):

Windsurf 1.2.2 Release, Cascade's Memory Improvements, Web Search Capabilities

Link mentioned: Windsurf Editor Changelogs | Windsurf Editor and Codeium extensions: Latest updates and changes for the Windsurf Editor.


Codeium (Windsurf) ▷ #discussion (252 messages🔥🔥):

Windsurf performance issues, Changes in free and pro plan credits, Deepseek model integration expectations, User experiences with Cascade, Extension compatibility issues

Links mentioned:


Codeium (Windsurf) ▷ #windsurf (505 messages🔥🔥🔥):

Windsurf Updates and Bugs, Cascade Performance, User Experiences with DeepSeek, Support and Documentation, Git Usage in Development

Links mentioned:


Perplexity AI ▷ #general (755 messages🔥🔥🔥):

Perplexity Pro changes, R1 model introduction, User feedback on AI models, Privacy concerns with DeepSeek, Comparison with other AI services

Links mentioned:


Perplexity AI ▷ #sharing (26 messages🔥):

AI advancements, Financial trends, Earthquake mechanics, Action-adventure films, Startup insights


Perplexity AI ▷ #pplx-api (4 messages):

Sonar JSON Response Format, API for LinkedIn URLs, Response Format Issues, Sonar vs Sonar-Pro


Nous Research AI ▷ #announcements (1 messages):

Nous Psyche, Cooperative AI Training, Open Source Models, Heterogeneous Compute

Links mentioned:


Nous Research AI ▷ #general (681 messages🔥🔥🔥):

Nous Psyche Announcement, Testnet Participation, Distributed Training and Reputation Systems, Scam Tokens, Collaborative Open Source Development

Links mentioned:


Nous Research AI ▷ #ask-about-llms (14 messages🔥):

R1 Distillation Models, Llama 3 performance issues, Image Captioning with DeepSeek, Building AI Assistants, Fine-tuning for performance

Links mentioned:


Nous Research AI ▷ #research-papers (2 messages):

Human-like LLM enhancements, Crisis communication strategies

Links mentioned:


Nous Research AI ▷ #interesting-links (2 messages):

LLM Live2D Assistant, Qwen2.5-VL model, OCR capabilities

Links mentioned:


Nous Research AI ▷ #research-papers (2 messages):

Human-Like Large Language Models, Crisis Communication Strategies

Links mentioned:


Nous Research AI ▷ #rag-dataset (1 messages):

voltamachine: neat


OpenAI ▷ #annnouncements (1 messages):

ChatGPT Canvas Update, OpenAI o1 Integration, HTML & React Rendering, Desktop App Features


OpenAI ▷ #ai-discussions (537 messages🔥🔥🔥):

DeepSeek vs OpenAI models, Impact of DeepSeek on stock market, AI competition in tech industry, Performance comparisons of LLMs, User experiences with AI models

Links mentioned:


OpenAI ▷ #gpt-4-discussions (34 messages🔥):

O3 Mini Release Date, O3 Mini Features, Tokenization in Tiktoken, Gemini vs GPT, Scraping URLs from Word Files

Link mentioned: Tweet from Sam Altman (@sama): ok we heard y’all. / *plus tier will get 100 o3-mini queries per DAY (!) / we will bring operator to plus tier as soon as we can / our next agent will launch with availability in the plus tier / enjoy 😊 Quoting...


OpenAI ▷ #prompt-engineering (12 messages🔥):

LangChain ChatPromptTemplate, User Feedback on Formatting, Complex Prompts & Vector Stores


OpenAI ▷ #api-discussions (12 messages🔥):

Langchain ChatPromptTemplate, Vector Store Integration


aider (Paul Gauthier) ▷ #general (449 messages🔥🔥🔥):

DeepSeek API Issues, Inference Provider Profitability, Comparison of R1 and O1 Models, New AI Model Releases, User Experiences with Aider

Links mentioned:


aider (Paul Gauthier) ▷ #questions-and-tips (120 messages🔥🔥):

Deepseek API Issues, Aider Functionality with Architect Mode, Model Pairing and Switching, Token Usage in Aider, Using Aider with Rust

Links mentioned:


aider (Paul Gauthier) ▷ #links (3 messages):

CodeGate integration with Aider, Comparative AI tools for web apps, Aider's functionality

Links mentioned:


LM Studio ▷ #general (413 messages🔥🔥🔥):

LM Studio Model Comparisons, DeepSeek R1 Distill Models, Using AI for Coding, Benchmarking AI Models, Chatter UI Setup Issues

Links mentioned:


LM Studio ▷ #hardware-discussion (156 messages🔥🔥):

Hardware for Running DeepSeek Models, Using Multiple GPUs with LM Studio, Performance of LLMs on Apple M3 Max, Ideal GPUs for Coding Tasks, DDR5 Memory and AI Workloads

Links mentioned:


OpenRouter (Alex Atallah) ▷ #announcements (4 messages):

Liquid AI joins OpenRouter, Nitro DeepSeek R1, Amazon Nova models issue

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (535 messages🔥🔥🔥):

DeepSeek Model Performance, OpenRouter API Issues, Model Suggestions and Submissions, BYOK Integration, Current State of DeepSeek Provider

Links mentioned:


Yannick Kilcher ▷ #general (436 messages🔥🔥🔥):

AI Research and Development, Open Source Models, Janus Series Release, History of Internet and Technology, Federated Learning

Links mentioned:


Yannick Kilcher ▷ #paper-discussion (39 messages🔥):

GRPO and PPO, Deepseek papers, Qwen2.5-VL model, Janus-Pro model, Friston podcast

Links mentioned:


Yannick Kilcher ▷ #agents (17 messages🔥):

Natural Language to DSL Code Resources, PydanticAI Framework, Structured Output for Generative AI, Workout Logging App Use Case, DSL vs JSON Discussion

Links mentioned:


Yannick Kilcher ▷ #ml-news (45 messages🔥):

DeepSeek AI Model Launch, Qwen2.5-VL Model Announcement, Mistral's IPO Plans, AI and Economic Impact, Public Perception of AI Governance

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (206 messages🔥🔥):

DeepSeek Model Updates, Qwen 2.5-VL Launch, AI Company Strategies, NVIDIA Market Position, Edge AI Discussion

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-questions (16 messages🔥):

Self-Play Paradigm, Scaling Synthetic Data Pipelines, Role of Self-Determination Theory, Claims about Deepseek, Critique of Media Reporting

Link mentioned: Tweet from finbarr (@finbarrtimbers): co-sign Quoting doomslide (@doomslide) after reading the papers again and playing with R1 i've come to the conclusion that the extremely predictable next jump in capability will be (cleverly optimi...


Interconnects (Nathan Lambert) ▷ #ml-drama (38 messages🔥):

Gary Marcus's Opinions, Nous Psyche Announcement, Political Perspectives on Nation States, AI Breakthrough Narratives, Academic Perspectives on AI

Links mentioned:


Interconnects (Nathan Lambert) ▷ #random (177 messages🔥🔥):

DeepSeek's Rise, Market Reactions to AI Models, Qwen Model Launch, Investor Sentiment, Industry Disruptions

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (23 messages🔥):

Deepseek Performance, Scale.ai Concerns, Chinese Tech Commentary, Fake Accounts on Social Media, Cultural Revolution Impact on Tech Founders

Links mentioned:


Interconnects (Nathan Lambert) ▷ #rl (4 messages):

REINFORCE acronym, Writing RLHF book, Open-Instruct integration with vLLM, OpenRLHF framework maintenance


Interconnects (Nathan Lambert) ▷ #rlhf (4 messages):

Tulu3 Paper Analysis, Pref Tuning Challenges, Anticipation for Tulu4


Interconnects (Nathan Lambert) ▷ #cv (1 messages):

the_real_jrb: It's here! Qwen2.5-VL. https://qwenlm.github.io/blog/qwen2.5-vl/


Interconnects (Nathan Lambert) ▷ #reads (15 messages🔥):

DeepSeek R1 Release, Market Reaction to DeepSeek, AIW Problem Variations, John Schulman's Commentary, Jay Alammar's Analysis

Links mentioned:


Interconnects (Nathan Lambert) ▷ #lectures-and-projects (4 messages):

Job Board Launch, Channel Inappropriateness


Interconnects (Nathan Lambert) ▷ #posts (6 messages):

Deepseek Understanding, Chatbot Formatting, Community Engagement


Interconnects (Nathan Lambert) ▷ #policy (15 messages🔥):

China's New AI Policy, US Industrial Policy, Great Power Competition in AI, CHIPS Act, Jones Act and Defense Manufacturing

Link mentioned: Tweet from Ray Wang (@rwang07): China's New AI Industry Development Action Plan (中国银行支持人工智能产业链发展行动方案) Will Provide 1 trillion yuan ($ 137 billion) to support its AI industry over the next five years 🇺🇸🇨🇳 This might be the mos...


Latent Space ▷ #ai-general-chat (233 messages🔥🔥):

DeepSeek R1 developments, Qwen2.5-VL release, Operator functionality, Prompt engineering tools, Reasoning models applications

Links mentioned:


Latent Space ▷ #ai-announcements (1 messages):

swyxio: new pod! https://x.com/latentspacepod/status/1883354909367787565


Latent Space ▷ #ai-in-action-club (193 messages🔥🔥):

Model Context Protocol (MCP), MCP tools integration, Transcription and documentation, Obsidian integration, Server capabilities and implementations

Links mentioned:


Eleuther ▷ #general (97 messages🔥🔥):

Layer Convergence Bias, Causally Regularized Tokenization, DeepSeek Model Discussion, R1 Training Costs, GRPO Implementation Challenges

Links mentioned:


Eleuther ▷ #research (311 messages🔥🔥):

GRPO Implementation Details, AlphaZero Evolution, Empowerment in AI, Reinforcement Learning Challenges, Experience Replay in RL

Links mentioned:


Eleuther ▷ #scaling-laws (2 messages):

Chinchilla library, LLM scaling laws, 20-tokens-per-parameter heuristic

Link mentioned: chinchilla/examples/llm/main.ipynb at master · kyo-takano/chinchilla: A toolkit for scaling law research ⚖. Contribute to kyo-takano/chinchilla development by creating an account on GitHub.


Eleuther ▷ #interpretability-general (3 messages):

Verified Reasoning in Training, Mechanisms of Model Learning, Interpretability in Fine-Tuning, Insights from Model Weights


Eleuther ▷ #lm-thunderdome (4 messages):

scbench, zeroSCROLLS, longbench


Eleuther ▷ #multimodal-general (2 messages):

Multimodal Channel Guidelines, Community Project Collaboration


Stackblitz (Bolt.new) ▷ #announcements (1 messages):

System prompts customization, Optimizing Bolt's behavior

Link mentioned: Tweet from bolt.new (@boltdotnew): You can now set up a system prompt, per project and globally! 💡 Put your favorite libs & techniques there so Bolt always uses them. This heavily requested feature allows you to optimize Bolt's behav...


Stackblitz (Bolt.new) ▷ #prompting (7 messages):

Project Structuring Challenges, Component Splitting Strategy, Utilizing Guidelines for Stability, Learning from Past Projects, Tracking Project Changes

Links mentioned:


Stackblitz (Bolt.new) ▷ #discussions (304 messages🔥🔥):

Error Handling in Bolt, Billing and Token Limits, Implementing User Roles, Deployment with Netlify, Connecting GitHub to Bolt

Links mentioned:


MCP (Glama) ▷ #general (233 messages🔥🔥):

MCP Client Issues, Server Configurations, Voice Chat Integrations, Open Source Tooling, Kubernetes Integrations

Links mentioned:


MCP (Glama) ▷ #showcase (10 messages🔥):

MCP Variance Log Tool, KoboldCPP-MCP Server, Notmuch Email Integration, MCP Inception Server, Shopify MCP Server

Links mentioned:


Notebook LM Discord ▷ #use-cases (10 messages🔥):

HeyGen avatars, ElevenLabs voice options, Podcasting with NotebookLM, Mixing HeyGen and MiniMax, NotebookLM note limits


Notebook LM Discord ▷ #general (226 messages🔥🔥):

NotebookLM usability issues, Audio overview generation delay, PDF source visibility problems, Language settings in NotebookLM, User roles and permissions

Links mentioned:


Stability.ai (Stable Diffusion) ▷ #general-chat (223 messages🔥🔥):

Hunyuan-video model, Kling AI quality, AI image generation setups, Stable Diffusion RAM requirements, Deepseek model limitations

Links mentioned:


GPU MODE ▷ #general (20 messages🔥):

Flash Infer Talk Questions, Support for Alternative Attention Methods, Deepseek Events, NCSA Hackathon Participation, Distributed Training Stacks

Link mentioned: Open Hackathons: no description found


GPU MODE ▷ #triton (1 messages):

Tensor Manipulation in Triton, Inline Assembly in Triton


GPU MODE ▷ #cuda (76 messages🔥🔥):

PTX ASM Segfault Issues, CUDA Kernel Loading Errors, DeepSeek Discussion, CUDA Versions and Compatibility, NCCL Timeout Debugging

Links mentioned:


GPU MODE ▷ #torch (13 messages🔥):

NCCL timeouts debugging, Linear Warmup in Learning Rates, Torch Inductor Internals, Vision-based model optimizations, Fused CUDA kernels example

Links mentioned:


GPU MODE ▷ #announcements (1 messages):

Adam Paszke, Mosaic GPU, GPU MODE community, GPU programming

Link mentioned: GPU MODE: A GPU reading group and community https://discord.gg/gpumode Supplementary content here https://github.com/gpu-mode Created by Mark Saroufim and Andreas Köpf


GPU MODE ▷ #cool-links (2 messages):

TinyZero, Open R1

Links mentioned:


GPU MODE ▷ #jobs (2 messages):

Atomic Semi Careers, Hinge Health Job Opening

Link mentioned: Careers: no description found


GPU MODE ▷ #beginner (5 messages):

High Performance Computing in ML, Basics of Parallel Computing, Understanding Neural Networks, Self Implementation of SVM, Learning Path for Practical Skills


GPU MODE ▷ #pmpp-book (4 messages):

Tiled Matrix Multiplication Issues, Floating Point Type Mismatch, Dummy Matrix Declaration, Result Comparison Code


GPU MODE ▷ #youtube-recordings (1 messages):

YouTube Recordings


GPU MODE ▷ #jax (3 messages):

Emulation in Torch, JAX fp8 support on Nvidia GPUs

Link mentioned: Why can JAX run fp8 on Nvidia GPUs with sm < 89? · jax-ml/jax · Discussion #26077: fp8 has hardware support only on GPUs with sm >= 89, such as RTX 4090 or A100. I've seen people trying to run it in PyTorch (e.g., this script) on older GPUs and getting errors. But JAX can act...


GPU MODE ▷ #lecture-qa (11 messages🔥):

Mosaic Layout System, TiledLayout Comments, SMEM to Registers Transfer, IR Generation Flags in Mosaic

Links mentioned:


GPU MODE ▷ #bitnet (1 messages):

Tile Lang, BitBLAS repo, Backward kernels


GPU MODE ▷ #liger-kernel (1 messages):

dpo loss, simpo loss, liger with trl, ligerdpo trainer


GPU MODE ▷ #self-promotion (1 messages):

mobicham: https://x.com/Mobius_Labs/status/1883951887965393301


GPU MODE ▷ #thunderkittens (3 messages):

WGMMA Instructions, Pointer Math in PTX ISA, Memory Handling Strategies


GPU MODE ▷ #arc-agi-2 (44 messages🔥):

Polynomial Equations PR, Maze Task Proposal, FSDP Support in Tiny-GRPO, Family Relationships Dataset, GSM8K Templates

Links mentioned:


Modular (Mojo 🔥) ▷ #general (5 messages):

Mojo Documentation, GPU Package API

Link mentioned: mojo/docs/changelog.md at nightly · modular/mojo: The Mojo Programming Language. Contribute to modular/mojo development by creating an account on GitHub.


Modular (Mojo 🔥) ▷ #mojo (91 messages🔥🔥):

Mojo CSS Struct, List and Representable Trait Issues, Unsafe Pointers and Object Identity, Function Pointer FFI in Mojo, Documentation Downtime

Links mentioned:


LlamaIndex ▷ #blog (4 messages):

Multi-agent workflows, Document research agents, Automation in travel insurance claims, LlamaIndex integrations, DeepSeek API

Link mentioned: DeepSeek - LlamaIndex: no description found


LlamaIndex ▷ #general (74 messages🔥🔥):

Access to LlamaIndex, LLM.complete kwargs usage, Evaluators in LlamaIndex, Local RAG Implementation, Documentation issues

Links mentioned:


Cohere ▷ #discussions (34 messages🔥):

Cohere Legal Regulations, Cohere UI Feedback, Community Engagement, GitHub as Collaboration Tool

Links mentioned:


Cohere ▷ #cmd-r-bot (34 messages🔥):

Cohere Documentation, Reverse Planning, Cohere LLM Usage, Cohere Platform Overview, TTS and STT Capabilities


Cohere ▷ #projects (2 messages):

App Check-In, Direct Messages


Nomic.ai (GPT4All) ▷ #general (50 messages🔥):

Open Source Image Analysis Models, DeepSeek Model Issues, Running Models Locally, Document Analysis Tools, DeepSeek R1 Availability

Link mentioned: unsloth/DeepSeek-R1-Distill-Qwen-7B-GGUF · Hugging Face: no description found


LLM Agents (Berkeley MOOC) ▷ #mooc-announcements (1 messages):

Advanced LLM Agents MOOC, Livestream Schedule, Course Website, Enrollment Info, Course Completion Certificate


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (47 messages🔥):

Certificate Distribution, MOOC Enrollment Confirmation, Course Time Zone Participation, In-Person Attendance, Hackathon Participation

Link mentioned: Large Language Model Agents MOOC: MOOC, Fall 2024


LLM Agents (Berkeley MOOC) ▷ #mooc-readings-discussion (1 messages):

interdimensionalbeing_: https://substack.com/home/post/p-154577981


tinygrad (George Hotz) ▷ #general (14 messages🔥):

Gradient Calculation Confusion, STRIDE vs FLIP Discussion, Meeting #55 Agenda, Asset Fetching Issues in TinyChat, RISC Architecture Inquiry


tinygrad (George Hotz) ▷ #learn-tinygrad (18 messages🔥):

BobNet Clarification, Formatting Tools in Tinygrad, Tinygrad Valorization Plans, Tinygrad Learning Resources, Tensor UOp Operations

Links mentioned:


Torchtune ▷ #general (5 messages):

GPU Efficiency, WSL Virtualization, Regex Misformatting, EBNF Grammars


Torchtune ▷ #dev (6 messages):

Federated Learning with Torchtune, Selective Application of Optimizer Hooks, Using torch distributed primitives, Managing Optimizer States


Torchtune ▷ #papers (10 messages🔥):

Deepseek, Nvidia Stock Experiences, Market Sentiment, Investment Strategies, Comparison of AI Models

Link mentioned: Janus/janus_pro_tech_report.pdf at main · deepseek-ai/Janus: Janus-Series: Unified Multimodal Understanding and Generation Models - deepseek-ai/Janus


OpenInterpreter ▷ #general (12 messages🔥):

Project Development Status, Website Updates, Python Interpreter Functionality, Latest Development Version, User Feedback and Suggestions

Link mentioned: GitHub - OpenInterpreter/open-interpreter: A natural language interface for computers: A natural language interface for computers. Contribute to OpenInterpreter/open-interpreter development by creating an account on GitHub.


OpenInterpreter ▷ #O1 (2 messages):

Deepseek_r1, API Errors

Link mentioned: no title found (https://api.deepseek.com): no description found


OpenInterpreter ▷ #ai-content (4 messages):

DeepSeek models, Open Interpreter local setup, AI terminal app development, Vision model discussions

Links mentioned:


LAION ▷ #general (12 messages🔥):

DeepSeek R1 performance, Audio augmentation tools, Pipeline testing

Links mentioned:


LAION ▷ #research (2 messages):

DeepSeek R1, AIW Versions Comparison, Benchmarking Performance

Link mentioned: Tweet from Jenia Jitsev 🏳️‍🌈 🇺🇦 🇮🇱 (@JJitsev): (Yet) another tale of Rise and Fall: DeepSeek R1 is claimed to match o1/o1-preview on olympiad level math & coding problems. Can it handle versions of AIW problems that reveal generalization & basic ...


DSPy ▷ #general (8 messages🔥):

GitHub issue spamming, Natural language vs programming, dspy + deepseek optimization, pypi version update


Axolotl AI ▷ #general (7 messages):

deepseek algorithm, H200 vs 5090s, RL framework support


Gorilla LLM (Berkeley Function Calling) ▷ #leaderboard (3 messages):

System prompts for leaderboard models, Gorilla GitHub repository

Link mentioned: gorilla/berkeley-function-call-leaderboard/bfcl/model_handler/constant.py at main · ShishirPatil/gorilla: Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls) - ShishirPatil/gorilla


MLOps @Chipro ▷ #events (1 messages):

2025 Crystal Ball panel discussion, Real-time data processing, AI and data streaming technologies, Industry leaders and insights

Link mentioned: 2025 Crystal Ball: Real-Time Data and AI, Tue, Jan 28, 2025, 9:00 AM | Meetup: About: Look into the future of data streaming and AI at 2025 Crystal Ball: Real-Time Data and AI panel. Without a doubt, AI is to shape our future in the years to come.


Mozilla AI ▷ #announcements (1 messages):

Paper Reading Club, Discord Events



{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AINews, please share with a friend! Thanks in advance!

{% endif %}