Frozen AI News archive

OLMo 2 - new SOTA Fully Open LLM

**AI2** has updated **OLMo-2** to roughly **Llama 3.1 8B** equivalence, training with **5T tokens** and using learning rate annealing and new high-quality data (Dolmino). They credit **Tülu 3** and its "Reinforcement Learning with Verifiable Rewards" approach. On Reddit, the **Qwen2.5-72B Instruct** model shows near-lossless performance with **AutoRound 4-bit quantization**, available on **HuggingFace** in 4-bit and 2-bit versions, with discussions on the **MMLU** benchmark and quantization-aware training. **HuggingFace** released **SmolVLM**, a **2B parameter** vision-language model running efficiently on consumer GPUs, supporting fine-tuning on Google Colab and demonstrating strong OCR capabilities with adjustable resolution and quantization options.

Canonical issue URL

AI News for 11/26/2024-11/27/2024. We checked 7 subreddits, 433 Twitters and 29 Discords (197 channels, and 2528 messages) for you. Estimated reading time saved (at 200wpm): 318 minutes. You can now tag @smol_ai for AINews discussions!

AI2 is notable for having fully open models - not just open weights, but open data, code, and everything else. We last covered OLMo 1 in Feb and OpenELM in April. Now it would seem that AI2 has updated OLMo-2 to roughly Llama 3.1 8B equivalence.


They have trained with 5T tokens, notably using learning rate annealing and introducing new, high-quality data (Dolmino) at the end of pretraining. A full technical report is still pending, so we don't know much else, but the post-training gives credit to Tülu 3 and its "Reinforcement Learning with Verifiable Rewards" approach (paper here, tweet here), which they announced just last week (with open datasets, of course).
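The annealing recipe can be sketched as a constant learning rate for most of training followed by a linear decay over the final stretch, which is when the high-quality Dolmino data is mixed in. The numbers below are illustrative, not OLMo 2's actual hyperparameters:

```python
# Sketch of end-of-pretraining learning-rate annealing (a common recipe;
# OLMo 2's exact schedule and peak LR await the technical report).
def lr_at_step(step, total_steps, peak_lr=3e-4, anneal_frac=0.1):
    """Constant LR for the stable phase, then linear decay to zero."""
    anneal_start = int(total_steps * (1 - anneal_frac))
    if step < anneal_start:
        return peak_lr
    remaining = total_steps - step
    return peak_lr * remaining / (total_steps - anneal_start)

total = 1000
print(lr_at_step(0, total))    # 0.0003  - peak LR during the stable phase
print(lr_at_step(950, total))  # 0.00015 - halfway through the anneal
print(lr_at_step(1000, total)) # 0.0     - fully annealed
```

The intuition for pairing annealing with a data swap: as the LR shrinks, later updates perturb the model less, so the final (highest-quality) tokens get to polish rather than overwrite what came before.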



{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

TO BE COMPLETED


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. AutoRound 4-bit Quantization: Lossless Performance with Qwen2.5-72B

Theme 2. SmolVLM: 2B Parameter Vision Model Running on Consumer Hardware

Theme 3. MLX LM 0.20.1 Matches llama.cpp Flash Attention Speed

Theme 4. MoDEM: Routing Between Domain-Specialized Models Outperforms Generalists

Other AI Subreddit Recap

r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, /r/LLMDevs, /r/Singularity

Theme 1. Anthropic Launches Model Context Protocol for Claude

Theme 2. Major ChatGPT & Claude Service Disruptions

Theme 3. MIT PhD's Open-Source LLM Training Series

Theme 4. Qwen2VL-Flux: New Open-Source Image Model


AI Discord Recap

A summary of Summaries of Summaries by O1-preview

Theme 1: AI Model Updates and Releases

Theme 2: Technical Issues and Performance Enhancements

Theme 3: Community Concerns and Feedback

Theme 4: Advancements in AI Applications

Theme 5: Ethical Discussions and AI Safety


PART 1: High level Discord summaries

Cursor IDE Discord


Eleuther Discord


HuggingFace Discord


Unsloth AI (Daniel Han) Discord


aider (Paul Gauthier) Discord


Modular (Mojo 🔥) Discord


OpenRouter (Alex Atallah) Discord


Perplexity AI Discord


OpenAI Discord


Notebook LM Discord


Stability.ai (Stable Diffusion) Discord


GPU MODE Discord


LM Studio Discord


Interconnects (Nathan Lambert) Discord


Cohere Discord


Nous Research AI Discord


Latent Space Discord


LlamaIndex Discord


tinygrad (George Hotz) Discord


LLM Agents (Berkeley MOOC) Discord


OpenInterpreter Discord


Torchtune Discord


DSPy Discord


Axolotl AI Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The LAION Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Cursor IDE ▷ #general (875 messages🔥🔥🔥):

Cursor Composer updates, Cursor agent functionality, Windsurf IDE comparison, Cursor version rollouts, User experiences with AI models

Links mentioned:


Eleuther ▷ #general (69 messages🔥🔥):

Zombo.com references, Experience with sfcompute, PyTorch parallelism wrappers, Configuration complexity challenges, FSDP behavior and module properties

Links mentioned:


Eleuther ▷ #research (83 messages🔥🔥):

Gradient Estimation in ML, UltraMem Architecture, Optimizer Evaluation Suite, Diffusion Models in Other Modalities, Learning Rate Sensitivity

Links mentioned:


Eleuther ▷ #scaling-laws (1 messages):

Cross-entropy loss curves, Datasets for LLMs training


Eleuther ▷ #interpretability-general (2 messages):

AISI, Meeting Setup


Eleuther ▷ #lm-thunderdome (32 messages🔥):

Evaluation of Quantization Effects, KV Cache Importance in Deployments, Model Performance and Comparison, LM Eval Error Handling, Perplexity as Evaluation Metric

Link mentioned: general question: Is kv-cache actually not used in all the LLM-evaluation tasks? · Issue #1105 · EleutherAI/lm-evaluation-harness: Is kv-cache actually not used in all the LLM-evaluation tasks, since those tasks usually takes only one-step attention calculation, not like language generating process which needs a lot of kv-cach...


HuggingFace ▷ #NLP (173 messages🔥🔥):

Language models and code generation, Quantum consciousness theories, Neural networks and algorithms, AI tools for music continuation, Complexity in AI discussions

Links mentioned:


Unsloth AI (Daniel Han) ▷ #general (98 messages🔥🔥):

Unsloth model updates, GPU price discussions, Inference performance issues, Qwen2.5 fixes, Model loading strategies

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (4 messages):

Cheap options not enabling SSH, RTX 3090 pricing, PrimeIntellect GPU hosting


Unsloth AI (Daniel Han) ▷ #help (34 messages🔥):

Kaggle Progress Bar Issue, Performance of P100 vs T4, Gemma Quantization Questions, Fine-tuning on Unlabeled Tweets, Model Loading Errors

Link mentioned: Wow GIF - Wow - Discover & Share GIFs: Click to view the GIF


aider (Paul Gauthier) ▷ #announcements (1 messages):

Aider v0.65.0 release, Custom model aliases, RepoMap support for Dart language, Analytics opt-in feature, Error handling improvements


aider (Paul Gauthier) ▷ #general (80 messages🔥🔥):

Hyperbolic Model Context, Sonnet vs O1 Models, Using Aider for Website Publishing, Aider Task Management, Model Aliases and Versioning

Links mentioned:


aider (Paul Gauthier) ▷ #questions-and-tips (49 messages🔥):

Model Context Protocol, Aider Upgrades, Connecting Aider with VS Code, Token Limit Issues, Voice Function Costs

Links mentioned:


aider (Paul Gauthier) ▷ #links (2 messages):

MCP Server for Git, Integration with Aider, Add-ons for Aider, Standardized Capabilities

Links mentioned:


Modular (Mojo 🔥) ▷ #mojo (128 messages🔥🔥):

Segmentation faults in Mojo, Mojo QA Bot Performance, Thread Safety and Mutex in Mojo, Function Parameter and Mutability, Error Handling for Ref Types

Links mentioned:


OpenRouter (Alex Atallah) ▷ #app-showcase (1 messages):

Companion emotional scoring system, Enhanced interaction realism, Security improvements in Companion, Automated security audits


OpenRouter (Alex Atallah) ▷ #general (98 messages🔥🔥):

OpenRouter API Key Issues, Performance of Gemini Models, Usage of Models and Document Types, Chat Synchronization Across Devices, Limitations of Free Models

Links mentioned:


OpenRouter (Alex Atallah) ▷ #beta-feedback (3 messages):

Access to Integrations, Access to Custom Provider Keys


Perplexity AI ▷ #general (88 messages🔥🔥):

Discord Bot Creation, Perplexity AI Subscription Plans, Model Comparison in Programming, DeepSeek R1 Feedback, Refund Process for API Credits

Link mentioned: Streamlit: no description found


Perplexity AI ▷ #sharing (9 messages🔥):

QNET MLM Scheme Warning, Cloud Hosting Black Friday Deals, Bengaluru Adaptive Traffic Control, EU's Gorilla Glass Investigation, Representation Theory Breakthrough in Algebra

Link mentioned: YouTube: no description found


Perplexity AI ▷ #pplx-api (3 messages):

Closed Beta Program, User Concerns, Arthritis Discussion


OpenAI ▷ #ai-discussions (82 messages🔥🔥):

Impact of AI on jobs, Human-AI collaboration, Real-time API, AI in gaming, AI's understanding of emojis

Link mentioned: GitHub - openai/openai-realtime-console: React app for inspecting, building and debugging with the Realtime API: React app for inspecting, building and debugging with the Realtime API - openai/openai-realtime-console


OpenAI ▷ #gpt-4-discussions (1 messages):

vvvvvvvvvvvvvvvvvvv_: Is anyone experiencing issues with saving custom GPTs?


OpenAI ▷ #prompt-engineering (6 messages):

Challenges with Research Papers, AI for Web Interaction, Self-Operating Computer, Claude 3.5 Sonnet, Google's Jarvis


OpenAI ▷ #api-discussions (6 messages):

Research Paper Challenges, AI Web Interaction, Self-Operating Computer, Claude 3.5 Sonnet, Google's Jarvis


Notebook LM Discord ▷ #use-cases (30 messages🔥):

AI Podcasting, Customer Data Management, Educational Content Marketing, Audio Overview Functionality, Virtual Podcast Hosts

Links mentioned:


Notebook LM Discord ▷ #general (53 messages🔥):

NotebookLM Features and Functionality, User Experiences with Document Handling, Issues with Language and Translations, Concerns about AI Data Usage, Audio Overview Customization

Links mentioned:


Stability.ai (Stable Diffusion) ▷ #announcements (1 messages):

ControlNets for Stable Diffusion 3.5, Commercial and Non-Commercial Licensing, Ownership of Generated Media

Link mentioned: ControlNets for Stable Diffusion 3.5 Large — Stability AI: Today we are adding new capabilities to Stable Diffusion 3.5 Large by releasing three ControlNets: Blur, Canny, and Depth. 


Stability.ai (Stable Diffusion) ▷ #general-chat (76 messages🔥🔥):

User Support Communication Issues, Utilizing Wildcards for Prompts, SDXL Model Loading Times, Finding AI Tools and Resources, Managing Loras by Checkpoint

Links mentioned:


GPU MODE ▷ #general (4 messages):

Kashmiri Text Corpus Dataset, Lecture 12 Flash Attention, LLM Fine-tuning Issues, Model Loading Problems, Multi-GPU Training

Link mentioned: Omarrran/Kashmiri__Text_Corpus_Dataset · Datasets at Hugging Face: no description found


GPU MODE ▷ #triton (2 messages):

Triton escape hatch, FP8 vs INT8 performance


GPU MODE ▷ #cuda (31 messages🔥):

Odd behavior in CUDA simulations, Random number generation initialization, Memory allocation and initialization, CUDA optimizations for ML applications, Kernel fusion in CUDA

Link mentioned: Simple Portable C++ Seed Entropy: How to cope with the flaws of the C++ random device


GPU MODE ▷ #torch (4 messages):

GPU Memory Footprint Issues, PyTorch CPU Affinity and NUMA, CUDA OOM Errors from Reserved Memory, Flops Calculation for GPT-2, Inference Latency in Transformers

Link mentioned: Transformer Inference Arithmetic | kipply's blog: kipply's blog about stuff she does or reads about or observes


GPU MODE ▷ #cool-links (4 messages):

FP8 Training, Performance Gains with FSDP2, Meta LLaMa Model Architecture

Link mentioned: Supercharging Training using float8 and FSDP2: IBM: Tuan Hoang Trong, Alexei Karve, Yan Koyfman, Linsong Chu, Divya Kumari, Shweta Salaria, Robert Walkup, Praneet Adusumilli, Nirmit Desai, Raghu Ganti, Seetharami SeelamMeta: Less Wright, Wei Feng,...


GPU MODE ▷ #jobs (13 messages🔥):

Hugging Face Internship, Application Details, FAQs on Internship Requirements

Links mentioned:


GPU MODE ▷ #torchao (2 messages):

Benchmarking Quantization Techniques, Glossary for Terminology

Links mentioned:


GPU MODE ▷ #🍿 (4 messages):

Function Cost Calculation, Execution Time as Proxy for Cost, Modal Functions Overview

Link mentioned: modal.functions: Functions are the basic units of serverless execution on Modal.


LM Studio ▷ #general (46 messages🔥):

Beta Builds Concerns, AMD Multi-GPU Support, LM Studio Performance, Model Usage and API Queries, Token Display During Inference

Link mentioned: GitHub - XiongjieDai/GPU-Benchmarks-on-LLM-Inference: Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?: Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference? - XiongjieDai/GPU-Benchmarks-on-LLM-Inference


LM Studio ▷ #hardware-discussion (10 messages🔥):

Second 3090 considerations, Black Friday full tower deals, Motherboard requirements for second GPU, PCIe slot configurations, Cooling solutions for multiple GPUs


Interconnects (Nathan Lambert) ▷ #news (17 messages🔥):

Tülu 3 8B shelf life, Olmo model comparisons, Multilingual capabilities, SFT data removal impact, Pre-training efficiency

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-drama (1 messages):

Return of Activity, Screenshot Analysis


Interconnects (Nathan Lambert) ▷ #random (31 messages🔥):

Sora API Leak, OpenAI Corporate Practices, Artist Community Reactions, Hugging Face Usage, Public Perception Management

Links mentioned:


Interconnects (Nathan Lambert) ▷ #posts (1 messages):

SnailBot News: <@&1216534966205284433>


Cohere ▷ #discussions (1 messages):

_reamer: Absolute layman, just exited teenagehood


Cohere ▷ #questions (19 messages🔥):

Transitioning to Production API Key, Error 500 on Embeddings Endpoint, Issues with Command R+ Model Outputs, Inconsistent Language Responses in Bulgarian, Credit Card Details Issue


Cohere ▷ #api-discussions (3 messages):

Embed endpoint errors, Error 500 reports, Support communication


Cohere ▷ #projects (10 messages🔥):

Cohere API Key Limitations, Companion's Emotional Scoring System, Open Source Models, Support for Project Development

Link mentioned: Login | Cohere: Login for access to advanced Large Language Models and NLP tools through one easy-to-use API.


Nous Research AI ▷ #general (25 messages🔥):

Test Time Inference, Real-Time Video Models, Genomic Bottleneck Algorithm, Nous Flash Agent Setup

Link mentioned: The next evolution of AI begins with ours: Neuroscientists devise a potential explanation for innate ability: In a sense, each of us begins life ready for action. Many animals perform amazing feats soon after they're born. Spiders spin webs. Whales swim. But where do these innate abilities come from? Ob...


Nous Research AI ▷ #ask-about-llms (1 messages):

vondr4gon: Is there a test time training project ongoing currently?


Nous Research AI ▷ #research-papers (1 messages):

jsarnecki: https://arxiv.org/abs/2411.14405


Nous Research AI ▷ #interesting-links (1 messages):

Coalescence for LLM inference, Finite State Machines transformation, Token-based FSM transitions, Outlines library usage

Link mentioned: Coalescence: making LLM inference 5x faster: no description found
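The coalescence idea can be shown with a toy FSM for constrained decoding: whenever the machine allows exactly one next token, that token is forced and can be emitted without a model forward pass. (Illustrative only; Outlines' real implementation builds these machines over full tokenizer vocabularies.)

```python
# Toy FSM-constrained decoding with "coalescence": forced transitions
# skip the model call entirely, so most tokens here cost nothing.
fsm = {  # state -> {allowed_token: next_state}
    0: {'{': 1},
    1: {'"name"': 2},
    2: {':': 3},
    3: {'"alice"': 4, '"bob"': 4},
    4: {'}': 5},
}

def decode(fsm, start, pick):
    """pick(allowed) stands in for a model forward pass over allowed tokens."""
    state, out, model_calls = start, [], 0
    while state in fsm:
        allowed = fsm[state]
        if len(allowed) == 1:            # forced transition: no model call needed
            tok = next(iter(allowed))
        else:                            # genuine choice: pay for a forward pass
            tok = pick(sorted(allowed))
            model_calls += 1
        out.append(tok)
        state = allowed[tok]
    return ''.join(out), model_calls

text, calls = decode(fsm, 0, pick=lambda allowed: allowed[0])
print(text)   # {"name":"alice"}
print(calls)  # 1 - only one of the five emitted tokens needed the model
```

Skipping four of five forward passes in this toy is where the claimed ~5x speedup comes from when the output format is heavily constrained.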



Latent Space ▷ #ai-general-chat (20 messages🔥):

Model Context Protocol (MCP), Sora API Leak, OLMo 2 Release, Funding for PlayAI, Customization in Claude Responses

Links mentioned:


LlamaIndex ▷ #blog (3 messages):

LlamaParse, NLP research papers, RAG system optimization, Ragas and LlamaIndex


LlamaIndex ▷ #general (16 messages🔥):

llama_deploy errors, OpenAIAgent customization, Retrieving specific embedding model, Startup launch announcement, MCP service for llama index

Link mentioned: Tweet from undefined: no description found


tinygrad (George Hotz) ▷ #general (7 messages):

Flash Attention Integration, Tinybox Pro Custom Motherboard, GENOA2D24G-2L+ CPU, PCIe 5 Cable Compatibility, Tinygrad CPU Documentation


tinygrad (George Hotz) ▷ #learn-tinygrad (3 messages):

Optimization with scatter, Radix Sort enhancements, Non-sequential data processing, GPU Radix Sort paper by AMD


LLM Agents (Berkeley MOOC) ▷ #hackathon-announcements (1 messages):

Hackathon Workshop, Google AI, Live Q&A

Link mentioned: LLM Agents MOOC Hackathon - Google workshop: no description found


LLM Agents (Berkeley MOOC) ▷ #mooc-announcements (1 messages):

Lecture 11 Overview, Measuring Agent Capabilities, Responsible Scaling Policy, Benjamin Mann's Background

Link mentioned: CS 194/294-196 (LLM Agents) - Lecture 11, Ben Mann: no description found


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (2 messages):

Anthropic API keys


LLM Agents (Berkeley MOOC) ▷ #mooc-lecture-discussion (2 messages):

In-person lectures, Berkeley student enrollment


LLM Agents (Berkeley MOOC) ▷ #mooc-readings-discussion (1 messages):

GSM8K Inference Pricing, Self-Correction in Models


OpenInterpreter ▷ #general (7 messages):

OpenInterpreter 1.0 release, Non-Claude OS mode, Developer branch integration, Speech-to-text functionality, Keyboard input simulation


Torchtune ▷ #general (1 messages):

Torchtitan Poll, Feature Requests

Link mentioned: Tweet from Horace He (@cHHillee): If you'd like to influence what features the PyTorch distributed team work on in torchtitan (e.g. MoE, multimodal, context parallelism, etc.), go made your voices heard herehttps://github.com/pyt...


Torchtune ▷ #dev (3 messages):

DPO usage, PPO Contributions, Mark's contributions


DSPy ▷ #general (3 messages):

DSPy Learning Support, Observers Integration

Link mentioned: Introducing Observers: AI Observability with Hugging Face datasets through a lightweight SDK: no description found


Axolotl AI ▷ #general (3 messages):

Accelerate PR Fix, Hyberbolic Labs Black Friday GPU Deal

Link mentioned: support for wrapped schedulefree optimizer when using deepspeed by winglian · Pull Request #3266 · huggingface/accelerate: What does this PR do?Axolotl community reported an issue with schedule free AdamW with deepspeed:[rank0]: File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/transformers/train....






{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}