Frozen AI News archive

Moondream 2025.1.9: Structured Text, Enhanced OCR, Gaze Detection in a 2B Model

**Moondream** has released a new version that advances VRAM efficiency and adds structured output and gaze detection, marking a new frontier in vision model practicality. Discussions on Twitter highlighted advancements in reasoning models like **OpenAI's o1**, model distillation techniques, and new multimodal embedding models such as **vdr-2b-multi-v1** and **LLaVA-Mini**, which significantly reduce computational costs. Research on GANs and decentralized diffusion models showed improved stability and performance. Development tools like **MLX** and **vLLM** received updates for better portability and developer experience, while frameworks like **LangChain** and **Qdrant** enable intelligent data workflows. Company updates include new roles and team expansions at **GenmoAI**. *"Efficiency tricks are all you need."*

AI News for 1/9/2025-1/10/2025. We checked 7 subreddits, 433 Twitters and 32 Discords (219 channels, and 2928 messages) for you. Estimated reading time saved (at 200wpm): 312 minutes. You can now tag @smol_ai for AINews discussions!

Moondream has been gaining a lot of attention as a small, light, fast, yet SOTA vision model, and yesterday it released a lovely new version that marks a new efficient frontier in VRAM usage (a more practical measure than param count alone).

It now also offers structured output and gaze detection, which creative redditors are already building scripts around.

In case you missed it, Vik also gave a talk about Moondream at the Best of 2024 in Vision Latent Space Live event:

https://www.youtube.com/watch?v=76EL7YVAwVo


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

All recaps done by Claude 3.5 Sonnet, best of 4 runs.

AI Models and Research

AI Tools and Development

Company Announcements and Updates

Datasets and Benchmarks

AI Ethics, Policy, and Society

Personal Updates and Announcements

Memes/Humor


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Moondream 2b's Gaze Detection Creates Buzz

Theme 2. Transformers.js Brings LLMs In-browser with WebGPU

Theme 3. Biden's AI Chip Export Limits Stir Global Reaction

Theme 4. NVIDIA's Project Digits Promises AI Democratization

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT

Theme 1. DALL-E Abandonment: OpenAI's Multimodal Struggles

Theme 2. Microsoft Envisions AI Agent Swarms in Organizations


AI Discord Recap

A summary of Summaries of Summaries by o1-mini-2024-09-12

Theme 1. AI Model Showdowns: PHI-4 Tops Microsoft and Beyond

Theme 2. AI Tools Face Off: Codeium, ComfyUI, and Cursor IDE

Theme 3. GPU Grievances and Kernel Calamities: Stable Diffusion on Linux

Theme 4. AI Community Buzz: Hackathons, Hiring, and Funding Frenzies

Theme 5. Advanced AI Techniques: Fine-Tuning, Decoding, and Regularization Woes


PART 1: High level Discord summaries

Stability.ai (Stable Diffusion) Discord


Unsloth AI (Daniel Han) Discord


Codeium (Windsurf) Discord


Cursor IDE Discord


Stackblitz (Bolt.new) Discord


aider (Paul Gauthier) Discord


Notebook LM Discord Discord


LM Studio Discord


OpenAI Discord


Interconnects (Nathan Lambert) Discord


Eleuther Discord


GPU MODE Discord


Nous Research AI Discord


Latent Space Discord


OpenRouter (Alex Atallah) Discord


Perplexity AI Discord


Cohere Discord


tinygrad (George Hotz) Discord


Nomic.ai (GPT4All) Discord


LlamaIndex Discord


Modular (Mojo 🔥) Discord


LLM Agents (Berkeley MOOC) Discord


OpenInterpreter Discord


LAION Discord


DSPy Discord


AI21 Labs (Jamba) Discord


Torchtune Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Axolotl AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The HuggingFace Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Stability.ai (Stable Diffusion) ▷ #general-chat (719 messages🔥🔥🔥):

GPU Compatibility with AI Models, Image to Video Generation Challenges, Discord Community Dynamics, UI/UX Preferences in AI Tools, Kernel Panic in Linux Systems


Unsloth AI (Daniel Han) ▷ #general (393 messages🔥🔥):

Unsloth updates, PHI-4 model fixes, Quantum models comparison, Adapters in fine-tuning, Chat templates in LLMs


Unsloth AI (Daniel Han) ▷ #off-topic (3 messages):

Job search success, Funny reaction GIF

Link mentioned: Amogus6969 GIF - Amogus6969 - Discover & Share GIFs: Click to view the GIF


Unsloth AI (Daniel Han) ▷ #help (48 messages🔥):

Mathstral Model Status, Recommendations for AI Models, RAG as an Option, Finetuning with LORA, Error with Qwen2VL Model


Unsloth AI (Daniel Han) ▷ #research (4 messages):

DLSS for Language Models, Speculative Decoding


Codeium (Windsurf) ▷ #discussion (125 messages🔥🔥):

Self-hosted Codeium, Windsurf issues, Cascade Model discussion, Purchase of credits, User experiences with Codeium


Codeium (Windsurf) ▷ #windsurf (140 messages🔥🔥):

Windsurf Installation, Cascade Chat Optimizations, Flow Credits Discrepancies, Agent Integration in Windsurf, User Experience Issues


Cursor IDE ▷ #general (246 messages🔥🔥):

Cursor IDE performance issues, Using Cursor Rules for better outputs, Challenges with Composer, Connecting with Cursor Developers, User experiences with Claude


Stackblitz (Bolt.new) ▷ #prompting (11 messages🔥):

Prompting Techniques, Payment System, Public Repos Feature, Sleep Schedule Impact


Stackblitz (Bolt.new) ▷ #discussions (211 messages🔥🔥):

Bolt Token Issues, PWA Support in Bolt, Supabase Migration Concerns, Netlify Performance Problems, Community Feedback and Features


aider (Paul Gauthier) ▷ #general (66 messages🔥🔥):

Aider User Experiences, Comparison of AI Models, Model Capabilities and Improvements, Coding Assistant Development, OpenAI and Gemini Models


aider (Paul Gauthier) ▷ #questions-and-tips (61 messages🔥🔥):

Aider Configuration Issues, Using OpenAI Providers, DeepSeek Performance, Task Management in Aider, Handling Chat History with Aider


aider (Paul Gauthier) ▷ #links (1 message):

Gemini 2.0 Flash Experimental, Voice mode on iOS, App development with AI assistance


Notebook LM Discord ▷ #use-cases (19 messages🔥):

Importing videos into NotebookLM, DeepResearch reports integration, Generating Mandarin podcasts, Quotation Mode for direct quotes, System prompts in NotebookLM Plus


Notebook LM Discord ▷ #general (94 messages🔥🔥):

NotebookLM functionality, Audio generation features, Workspace license issues, Language options in conversations, User experience with podcasts


LM Studio ▷ #general (66 messages🔥🔥):

LM Studio and API Connectivity, Model Loading Issues, Directory Structure for Models, New Qwen Chat Feature Announcement, LLM Applications and Development Trends


LM Studio ▷ #hardware-discussion (33 messages🔥):

AMD RX 7900XT performance, Sidecar graphics cards for MacBook Pro, Memory requirements for Llama 3.3, GPU activity monitoring tools, Benchmarking DIGITS arrival


OpenAI ▷ #ai-discussions (60 messages🔥🔥):

Model Versions and Testing, TensorFlow GPU Issues, Machine Learning Resources, Jupyter vs Python File Debugging, Community Concerns about AI Safety


OpenAI ▷ #gpt-4-discussions (7 messages):

GPT code handling, ChatGPT generating graphs


OpenAI ▷ #prompt-engineering (13 messages🔥):

Meta-Prompting Use Cases, Insights on Prompting, Investor Round for Hassabis


OpenAI ▷ #api-discussions (13 messages🔥):

Meta-Prompting, OpenAI Contributions, Investor Round, Prompt Creation


Interconnects (Nathan Lambert) ▷ #events (3 messages):

ICLR Event Attendance, Meeting Points and Descriptions


Interconnects (Nathan Lambert) ▷ #news (19 messages🔥):

rStar-Math Performance, Qwen Chat Launch, O1 vs GPT4o + MCTS Discussion, Challenges in Chinese ML Startups, EpiCoder Framework


Interconnects (Nathan Lambert) ▷ #other-papers (21 messages🔥):

NuminaMath dataset, Lead author background, Quality concerns in open data, High school math challenges, Business vs. coding in tech

Link mentioned: Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought: We propose a novel framework, Meta Chain-of-Thought (Meta-CoT), which extends traditional Chain-of-Thought (CoT) by explicitly modeling the underlying reasoning required to arrive at a particular CoT....


Interconnects (Nathan Lambert) ▷ #ml-questions (11 messages🔥):

Complexity of Large-Scale Providers, Transformers vs MoEs, Performance Efficiency in Models


Interconnects (Nathan Lambert) ▷ #random (17 messages🔥):

Anthropic salon, Character shaping in AI models, Post-training processes, Imposter syndrome among AI professionals, Blogging and self-care in academia

Link mentioned: How difficult is AI alignment? | Anthropic Research Salon: At an Anthropic Research Salon event in San Francisco, four of our researchers—Alex Tamkin, Jan Leike, Amanda Askell and Josh Batson—discussed alignment scie...


Interconnects (Nathan Lambert) ▷ #reads (3 messages):

Efficient Deep Learning, Popup Issues


Interconnects (Nathan Lambert) ▷ #posts (14 messages🔥):

Open Source AI Costs, AI Policy Maker Reactions

Link mentioned: Tweet from Teortaxes▶️ (@teortaxesTex): @natolambert I agree on substance but why do you present this as some debunking? They say right there that GPU-hours*$/hr does not include their total capex, R&D expenses, or data gen.(and it's me...


Eleuther ▷ #general (33 messages🔥):

SmolLM Corpus Updates, Training Models Efficiently, Modal for Research, SciAgents Discussion, GPT-NeoX Framework


Eleuther ▷ #research (42 messages🔥):

Grokking phenomenon, Weight decay strategies, Auxiliary loss functions, Softmax and sigmoid applications, Attention mechanisms


Eleuther ▷ #gpt-neox-dev (6 messages):

Pretraining 7B Llama2 Style Model, Memory Usage Analysis for GPU Models, Testing 6.7B Model Configurations


GPU MODE ▷ #general (10 messages🔥):

NCU profile comparison, Scams in the community, Learning Triton/CUDA for small GPU setups, Distributed training alternatives, Accelerating LLM inference


GPU MODE ▷ #triton (8 messages🔥):

WGMMA Computation, Triton Implementations of Fused MLP, Profiling Triton Ops, Error in Tutorial Examples


GPU MODE ▷ #cuda (14 messages🔥):

CUDA Driver Importance, Memory Banking Lectures, CUDA Kernel Programming, Blackwell vs Hopper, File Upload Tips in Discord


GPU MODE ▷ #jobs (2 messages):

Nectar Social job openings, GPU consultancy hiring


GPU MODE ▷ #beginner (3 messages):

Installing CUDA on Ubuntu, Getting started with MacBook, Alternatives to NVIDIA GPU

Link mentioned: CUDA Installation Guide for Linux


GPU MODE ▷ #off-topic (1 message):

kashimoo: my gf says i sleep talk about CUDA 😭


GPU MODE ▷ #rocm (24 messages🔥):

GPU Occupancy, MI210 Performance Analysis, RX 7900XTX Computations, CDNA Architecture Insights, Kernel Launch Dynamics

Link mentioned: Optimizing GPU occupancy and resource usage with large thread groups: Sebastian Aaltonen, co-founder of Second Order Ltd, talks about how to optimize GPU occupancy and resource usage of compute shaders that use large thread groups.


GPU MODE ▷ #self-promotion (1 message):

MicroDiT replication, Architectural improvements with DCAE, MMDIT for prompt adherence, Compute grants for experiments

Link mentioned: Tweet from sway (@SwayStar123): MicroDiT replication is complete. Download weights here: https://huggingface.co/SwayStar123/MicroDiT/blob/main/no_cfg/microdit_model_epoch_19.pt Inference script here: https://github.com/SwayStar123/mic...


GPU MODE ▷ #🍿 (2 messages):

Alpha Competition, Softmax Kernel Performance


GPU MODE ▷ #thunderkittens (3 messages):

ThunderKittens repository, Collaboration on kernel development, CPP harness usage

Link mentioned: ThunderKittens/tests/python at main · HazyResearch/ThunderKittens: Tile primitives for speedy kernels. Contribute to HazyResearch/ThunderKittens development by creating an account on GitHub.


GPU MODE ▷ #arc-agi-2 (7 messages):

ARC Prize evolution, Rejection Sampling Baseline Experiment, Exploring Text-Domain for ARC Tasks, Meta CoT paper findings, Positional Encodings in Models


Nous Research AI ▷ #general (47 messages🔥):

Contributing GPU to Training, Open Sourcing DisTrO, DeepSeek V3 Performance Comparison, Hermes Model Censorship, Cursor vs WebStorm/PyCharm


Nous Research AI ▷ #ask-about-llms (2 messages):

Reducing Memory Usage, Open Source Function Calling Models, Qwen2.5-32B-Instruct-AWQ, Function Calling Benchmarks


Nous Research AI ▷ #research-papers (3 messages):

Research Ideas, Carson's Personal Site, Forefront.ai, Simple AI Software

Link mentioned: Carson Poole's Personal Site


Nous Research AI ▷ #interesting-links (11 messages🔥):

Microsoft's rStar-Math, Qwen 7B AIME performance, LLMs and reasoning capabilities, Math usefulness, Trustworthiness of LLMs in math

Link mentioned: Tweet from Alex Volkov (Thursd/AI) (@altryne): Ugh guys... Microsoft just made Qwen 7B solve AIME at the level of o1 😵‍💫 They also showed that with their MCTS driver process, there was self-reflection capability like with reasoning models. Will ...


Nous Research AI ▷ #research-papers (3 messages):

Carson Poole's Research Ideas, Contact and Background of Carson Poole

Link mentioned: Carson Poole's Personal Site


Latent Space ▷ #ai-general-chat (47 messages🔥):

Salesforce AI Hiring Freeze, OpenAI Product Updates, Anthropic Funding News, AI Career Opportunities, Google DeepMind Mergers


OpenRouter (Alex Atallah) ▷ #announcements (1 message):

AI Agent Hackathon, OpenRouter API credits, Live Agent Studio, Voiceflow sponsorship, n8n prize increase

Link mentioned: oTTomator


OpenRouter (Alex Atallah) ▷ #general (46 messages🔥):

OpenRouter UI Performance, Gemini Flash and API Issues, O1 Response Format, API Access Requests, Hanami Usage Experience


Perplexity AI ▷ #announcements (1 message):

CSV file downloads, Table responses


Perplexity AI ▷ #general (33 messages🔥):

Youzu.ai for Interior Design, Perplexity Issues and Bugs, Collaboration Proposal in Discord, Translation Challenges with Perplexity, Product Manager from Ecosia Seeking Partnership


Perplexity AI ▷ #sharing (6 messages):

Toyota's rocket exploration, Upcoming video game releases, IndyCar driver statistics, Average lifespan of Spaniards, NVIDIA's home supercomputer


Perplexity AI ▷ #pplx-api (3 messages):

Korean Language API Usage, Other Language Models


Cohere ▷ #discussions (2 messages):

North launch, AI workspace, Productivity tools, Cohere vs Microsoft Copilot, Cohere vs Google Vertex AI


Cohere ▷ #questions (7 messages):

Command R+ models, Upgrading embeddings, Classification model limits, Alignment Evals Hackathon, Eval and Interp tutorials


Cohere ▷ #api-discussions (26 messages🔥):

Cohere LLM API Recursive Loop Issue, Improving Model Generations, Expanding Token Limits, Rolling Chat History Technique, API Rate Limit Errors


Cohere ▷ #projects (2 messages):

Channel Posting Rules


tinygrad (George Hotz) ▷ #general (18 messages🔥):

Pull Request #8505 Retest, LLVM JIT and Autogen Integration, Function Signature Stability in LLVM, Bounty Payments, Testing Compatibility with LLVM Versions


tinygrad (George Hotz) ▷ #learn-tinygrad (4 messages):

Blog Post on TinyGrad, Initializing Layers on Specific Devices

Link mentioned: TinyGrad Codebase Explained-ish: A detailed-ish explanation of TinyGrad’s repository structure and key files


Nomic.ai (GPT4All) ▷ #general (22 messages🔥):

Comparing Llama.cpp and GPT4All, Performance variations in models, Troubleshooting Chat Templates, Recommendations for roleplay models, Deployment of modernbert


LlamaIndex ▷ #blog (2 messages):

GitHub HQ Event, Agentic Document Workflows, AI Agents Debugging, Fast Inference Systems, LlamaIndex Workflows


LlamaIndex ▷ #general (18 messages🔥):

Ollama Update, App Deployment for Email Restriction, Vector DB Indexing, Local TEI Server Support, QueryFusionRetriever Error


Modular (Mojo 🔥) ▷ #mojo (18 messages🔥):

Rust's Syntax and Type Bounds, Overload Resolution in Mojo, Quantum Computing Libraries in Mojo, MAX and Quantum Programming, Quojo Library in Mojo

Link mentioned: GitHub - Deftioon/Quojo: A Quantum Computing Machine written in Mojo: A Quantum Computing Machine written in Mojo. Contribute to Deftioon/Quojo development by creating an account on GitHub.


LLM Agents (Berkeley MOOC) ▷ #hackathon-announcements (1 message):

Hackathon results timeline, Judges' feedback


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (6 messages):

Google Form Editing, Email Workaround for Forms, Twitter Account Deactivation, Certificate Qualification


OpenInterpreter ▷ #general (7 messages):

OpenInterpreter 1.0, Model performance, Custom instructions, Python code execution


LAION ▷ #general (5 messages):

TruLie dataset, Image-to-3D advancements, Gaussian splats, Chirpy3D, World Models


LAION ▷ #research (1 message):

rom1504: Is there any good open tool registry for building agents ?


DSPy ▷ #general (4 messages):

Improving Chain of Thought (COT), Building Your Own Evaluation, DSPy and Knowledge Banks, Cultural Anthropology and Technology

Link mentioned: Home: Writing about technology, culture, media, data, and the ways they interact.


AI21 Labs (Jamba) ▷ #general-chat (3 messages):

Python app with Jamba, AI code generation, PHP coding reliance, Jamba connection experience


Torchtune ▷ #general (1 message):

jovial_lynx_74856: Anyone here tried finetuning ModernBERT?


Torchtune ▷ #jobs (1 message):

Nectar Social hiring, AI startup roles, Referral bounties






{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AINews, please share with a friend! Thanks in advance!

{% endif %}