Frozen AI News archive

not much happened today

**Meta** released **Llama 3.2**, including lightweight 1B and 3B models for on-device AI with capabilities like summarization and retrieval-augmented generation. **Molmo**, a new multimodal model, was introduced with a large dense captioning dataset. **Google DeepMind** announced **AlphaChip**, an AI-driven chip design method improving TPU and CPU designs. **Hugging Face** surpassed 1 million free public models, highlighting the value of smaller specialized models. Discussions covered challenges in scaling RAG applications, the future of on-device AI running ChatGPT-level models, reliability issues in larger LLMs, and new Elo benchmarking accepted at NeurIPS 2024. AI ethics and regulation topics included free speech responsibilities and California's SB-1047 bill potentially affecting open-source AI. Notable quotes: *"AlphaChip transformed computer chip design"* and *"ChatGPT-level AI on mobile devices predicted within a year."*

AI News for 9/26/2024-9/27/2024. We checked 7 subreddits, 433 Twitters and 31 Discords (224 channels, and 2635 messages) for you. Estimated reading time saved (at 200wpm): 288 minutes. You can now tag @smol_ai for AINews discussions!

Just a lot of non-headline news today:

You could tune in to the latest Latent Space with Shunyu Yao and Harrison Chase while you browse the news below!

If you are in SF for DevDay, consider bringing your demos and hot takes to our DevDay pregame on Monday.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

AI Model Releases and Developments

AI Infrastructure and Platforms

AI Research and Benchmarks

AI Ethics and Regulation

AI Development Tools and Techniques


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Llama 3.2: Performance Gains and EU Regulatory Challenges

Theme 2. Next-Gen Hardware for AI: NVIDIA RTX 5090 Specs Leaked

Theme 3. Quantization and Performance Analysis of Large Language Models

Theme 4. Advancements in Creative Writing and Roleplay AI Models

Theme 5. Hugging Face Milestone: 1 Million Models

Other AI Subreddit Recap

r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, /r/LLMDevs, /r/Singularity

AI Research and Model Developments

AI Industry and Company News

AI Policy and Societal Impact

AI Model Releases and Improvements


AI Discord Recap

A summary of Summaries of Summaries by O1-mini

Theme 1. Language Model Performance and New Releases

Theme 2. Tooling, Integrations and New Features

Theme 3. Hardware and GPU Performance in AI Workloads

Theme 4. Deployment Updates and API Enhancements

Theme 5. Model Training and Optimization Techniques


PART 1: High level Discord summaries

aider (Paul Gauthier) Discord


LM Studio Discord


GPU MODE Discord


Unsloth AI (Daniel Han) Discord


HuggingFace Discord


OpenRouter (Alex Atallah) Discord


Cohere Discord


Stability.ai (Stable Diffusion) Discord


Perplexity AI Discord


Nous Research AI Discord


OpenAI Discord


Eleuther Discord


DSPy Discord


Modular (Mojo 🔥) Discord


Latent Space Discord


LlamaIndex Discord


Interconnects (Nathan Lambert) Discord


OpenAccess AI Collective (axolotl) Discord


tinygrad (George Hotz) Discord


LAION Discord


LLM Agents (Berkeley MOOC) Discord


OpenInterpreter Discord


LangChain AI Discord


Torchtune Discord


Gorilla LLM (Berkeley Function Calling) Discord


AI21 Labs (Jamba) Discord


The Alignment Lab AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The LLM Finetuning (Hamel + Dan) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The DiscoResearch Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

aider (Paul Gauthier) ▷ #general (328 messages🔥🔥):

  • Architect/Editor Mode
  • Model Comparisons
  • Copy Command Feature
  • File Handling in Aider
  • Token Usage and Efficiency


aider (Paul Gauthier) ▷ #questions-and-tips (22 messages🔥):

  • Feedback loops with Aider
  • Streamlit limitations
  • File creation issues
  • Claude 3.5 benefits
  • Token usage in Aider

Link mentioned: YAML config file: How to configure aider with a yaml config file.


aider (Paul Gauthier) ▷ #links (1 message):

fry69_61685: https://erikbern.com/2024/09/27/its-hard-to-write-code-for-humans.html


LM Studio ▷ #general (77 messages🔥🔥):

  • Molmo and LM Studio
  • Llama 3.2 Capabilities
  • LM Studio Update Issues
  • Using CLI for Local Models
  • Conversation Export in LM Studio


LM Studio ▷ #hardware-discussion (246 messages🔥🔥):

  • Performance of 70B models at low Q
  • Rumors about NVIDIA's RTX 5090 and 5080
  • Comparison of different GPU options for AI
  • Load testing methods for LLMs
  • CPU cooling issues and upgrades


GPU MODE ▷ #general (4 messages):

  • Llama 3.2 vision models
  • Cerebras chip optimization

Link mentioned: server: Bring back multimodal support · Issue #8010 · ggerganov/llama.cpp: Multimodal has been removed since #5882 Depends on the refactoring of llava, we will be able to bring back the support: #6027 This issue is created mostly for tracking purpose. If someone want to t...


GPU MODE ▷ #triton (26 messages🔥):

  • Triton Windows Wheel
  • Understanding BLOCK_SIZE
  • Inline Assembly in Triton
  • TMA in Triton Kernel
  • MPS Fault Handling


GPU MODE ▷ #torch (17 messages🔥):

  • PyTorch Profiler performance counters
  • torch.flip HIP error
  • Swin2SR GitHub repository
  • PyTorch benchmarking repositories
  • Updating dictionaries in TorchScript

Link mentioned: GitHub - mv-lab/swin2sr: [ECCV] Swin2SR: SwinV2 Transformer for Compressed Image Super-Resolution and Restoration. Advances in Image Manipulation (AIM) workshop ECCV 2022. Try it out! over 3.3M runs https://replicate.com/mv-lab/swin2sr


GPU MODE ▷ #beginner (6 messages):

  • .clangd Configurations
  • CUDA Path Issues
  • Profiling Kernel Function Performance

GPU MODE ▷ #pmpp-book (5 messages):

  • Image Processing in C
  • Popular Book Sections
  • stb_image Library
  • Yann LeCun Mention

Link mentioned: stb/stb_image.h at master · nothings/stb: stb single-file public domain libraries for C/C++. Contribute to nothings/stb development by creating an account on GitHub.


GPU MODE ▷ #torchao (138 messages🔥🔥):

  • Windows support for GPU models
  • FP8 and Int8 training issues
  • Torchao performance profiling
  • Issues with quantized training
  • Links and resources on GPU programming


GPU MODE ▷ #off-topic (17 messages🔥):

  • Edge LLM Challenge
  • Integration of Sonnet and Voice
  • Prompting vs Speaking
  • Meta AR Glasses
  • Code Execution on New Platforms


GPU MODE ▷ #irl-meetup (2 messages):

  • Meetups in Guatemala
  • GPU reading/work groups in London

GPU MODE ▷ #llmdotc (33 messages🔥):

  • RMSNorm integration
  • MLP block backpropagation
  • Kernel efficiency concerns
  • Attention backward pass issues
  • RepKV backward debugging

GPU MODE ▷ #rocm (6 messages):

  • TensorWave MI300X Offer
  • Community Engagement

GPU MODE ▷ #bitnet (6 messages):

  • Quantized Training Repo
  • Using Multi-GPU for Training
  • Distillation from Quantized Model
  • Config File for Larger Models

Link mentioned: GitHub - gau-nernst/quantized-training: Explore training for quantized models: Explore training for quantized models. Contribute to gau-nernst/quantized-training development by creating an account on GitHub.


GPU MODE ▷ #webgpu (1 message):

  • LiteRT functionalities
  • gpu.cpp cross-platform capabilities

GPU MODE ▷ #liger-kernel (5 messages):

  • Liger Kernel Weight Handling
  • Family Trip Update
  • Lambda Vendor Recommendation


GPU MODE ▷ #metal (2 messages):

  • Apple hardware support
  • Metal Shading Language Specification

GPU MODE ▷ #self-promotion (1 message):

  • Live Meeting Announcement

Link mentioned: Join conversation: no description found


GPU MODE ▷ #diffusion (9 messages🔥):

  • M2 Pro Benchmarks
  • DiffusionKit
  • Flux Diagram
  • Mini Diffusion Model
  • Visuals in Chat

Link mentioned: GitHub - argmaxinc/DiffusionKit: On-device Inference of Diffusion Models for Apple Silicon: On-device Inference of Diffusion Models for Apple Silicon - argmaxinc/DiffusionKit


Unsloth AI (Daniel Han) ▷ #general (176 messages🔥🔥):

  • Llama Model Fine-tuning
  • Model Checkpoints and Loading Issues
  • Graphics Card Rumors
  • Training Neural Networks
  • AI Application in Gaming


Unsloth AI (Daniel Han) ▷ #off-topic (6 messages):

  • Job search frustrations
  • Active AI subscriptions
  • AI relationships

Unsloth AI (Daniel Han) ▷ #help (80 messages🔥🔥):

  • Transformers Updates
  • Quantization Issues
  • Fine-Tuning Techniques
  • Model Loading Challenges
  • Optimizer Errors in Lightning AI


Unsloth AI (Daniel Han) ▷ #research (13 messages🔥):

  • Data Packing in Training
  • Frameworks for Pretraining GPT-2
  • Discussion Etiquette in Technical Queries
  • Deepspeed for Pretraining
  • Handling Data Masking

HuggingFace ▷ #general (184 messages🔥🔥):

  • Discussion on Hugging Face models
  • Challenges with uncensored models
  • Techniques for bypassing AI restrictions
  • Creating datasets for chat models
  • Using multiple LLMs for aggregation


HuggingFace ▷ #today-im-learning (1 message):

  • Neuralink CUDA usage
  • 7b FP8 model
  • BF16 and FP32 confusion

HuggingFace ▷ #cool-finds (9 messages🔥):

  • Two Minute Papers
  • Alibaba's MIMO
  • Tokenizer Training Research
  • Interactive Scene Control


HuggingFace ▷ #i-made-this (12 messages🔥):

  • VividNode update
  • AI tweet going viral
  • Game promotion
  • Leaderboard feedback
  • Flux-schnell demo

Link mentioned: Release v1.2.0 · yjg30737/pyqt-openai: VividNode(pyqt-openai) v1.2.0 Release Notes New Features Random Prompt Generation: Added a feature for random prompt generation to enable continuous image creation TTS and STT Support: Implemente...


HuggingFace ▷ #computer-vision (4 messages):

  • 4D Scene Understanding
  • 3D Data Rendering
  • Temporal Data in Models
  • 2D Video Consistency
  • Achievements in Computer Vision

HuggingFace ▷ #NLP (2 messages):

  • Tokenizer Training for Multilingual LLMs
  • Pytorch Techniques for Weight Management

HuggingFace ▷ #diffusion-discussions (9 messages🔥):

  • Cybersecurity services
  • Text-to-video model training
  • Image sharpening technique
  • Flux.1-dev optimization


OpenRouter (Alex Atallah) ▷ #announcements (3 messages):

  • Gemini Tokenization
  • Database Upgrade Delay
  • Chatroom UI Enhancements

Link mentioned: Tweet from OpenRouter (@OpenRouterAI): The Chatroom now shows responses from models with their reasoning collapsed by default. o1 vs Gemini vs Sonnet on 🍓:


OpenRouter (Alex Atallah) ▷ #general (186 messages🔥🔥):

  • Llama 3.2 vision parameters
  • OpenRouter error messages
  • Claude 3.5 Sonnet tool calling issues
  • Translation model recommendations
  • Model hosting criteria on OpenRouter


Cohere ▷ #discussions (6 messages):

  • Channel Etiquette
  • Project Progress

Cohere ▷ #questions (19 messages🔥):

  • Finetuning embed-english-v3
  • Custom embedding models
  • RAG inclusions format
  • Embeddings in construction domain

Cohere ▷ #api-discussions (146 messages🔥🔥):

  • New API v2 endpoints
  • Flashcard generation
  • Fine-tuning models
  • Rate limits for trial keys

Link mentioned: API Keys and Rate Limits — Cohere: This page describes the limitations around Cohere's API.


Cohere ▷ #projects (1 message):

  • Cultural Multilingual LMM Benchmark
  • Volunteer Native Translators
  • Co-authorship Invitation
  • CVPR 2025 Submission

Stability.ai (Stable Diffusion) ▷ #general-chat (160 messages🔥🔥):

  • Tiled Upscale vs ADetailer
  • Using AMD GPUs for SD
  • Performance Differences of GPUs
  • Refurbished vs Used GPUs
  • SSD Impact on Model Load Times


Perplexity AI ▷ #general (74 messages🔥🔥):

  • User Interface Issues
  • API Functionality Queries
  • Subscription Promotions
  • Freelancing Platforms
  • Model Availability

Perplexity AI ▷ #sharing (8 messages🔥):

  • Meta's Orion AR Glasses
  • OpenAI's For-Profit Pivot
  • New Blood Type Discovery
  • Skin Cancer Information
  • Neural Fields in Visual Computation

Nous Research AI ▷ #general (60 messages🔥🔥):

  • Memory Size in GPUs
  • DisTrO Paper Release
  • Knowledge Graphs and AI
  • Claude Sonnet 3.5 Performance
  • HW/SW Integration in AI


Nous Research AI ▷ #ask-about-llms (17 messages🔥):

  • Hermes deployment options
  • Llama 3.2 requirements
  • Hyperparameter adjustment for models

Nous Research AI ▷ #research-papers (1 message):

  • Arduino-Based Current Sensor
  • Power Outage Detection
  • Related Research Access

Nous Research AI ▷ #research-papers (1 message):

  • Arduino-Based Current Sensor
  • Power Outage Detection
  • Research Literature Access

OpenAI ▷ #ai-discussions (61 messages🔥🔥):

  • Agentic Search Challenges
  • AI in Education
  • Energy Use of AI
  • AI Tools for Productivity
  • Future Generations and Technology

OpenAI ▷ #gpt-4-discussions (16 messages🔥):

  • Voice feature issues
  • Advanced voice mode functionality
  • Attachment capabilities
  • Deployment timelines

Eleuther ▷ #general (10 messages🔥):

  • Open Source Model Sponsorship
  • LLM Search Space Simulation
  • OpenAI Function Calling API
  • Model Validity and Tuning

Eleuther ▷ #research (52 messages🔥):

  • FP6 and FP16 Weight Distributions
  • Verbatim Memorization in LLMs
  • Looped Transformers vs Universal Transformers
  • Layerwise Positional Encoding
  • Confidence Metrics in Inference


Eleuther ▷ #interpretability-general (1 message):

  • Embedding states in KV
  • Text representation factors

Eleuther ▷ #multimodal-general (2 messages):

  • Vision LLMs
  • ColQwen2 Model
  • Visual Retrievers

Link mentioned: Tweet from Manuel Faysse (@ManuelFaysse): 🚨 New model alert: ColQwen2 ! It's ColPali, but with a Qwen2-VL backbone, making it the best visual retriever to date, topping the Vidore Leaderboard with a significant +5.1 nDCG@5 w.r.t. colpali...


Eleuther ▷ #gpt-neox-dev (4 messages):

  • Testing on H100s
  • FA3 integration
  • Maintaining FA2 alongside FA3


DSPy ▷ #show-and-tell (15 messages🔥):

  • Langtrace integration with DSPy
  • MIPROv2 compilation runs
  • Experiment tracking issues

Link mentioned: DSPy - Langtrace AI Docs: no description found


DSPy ▷ #general (38 messages🔥):

  • BootstrapFewshot Page Availability
  • New LM with Azure's OpenAI APIs
  • DSPy Optimization Tools
  • Nesting Signatures in DSPy
  • Building DSPy Analytics Pipeline


DSPy ▷ #examples (8 messages🔥):

  • DSPy ReAct agents
  • RAG agents integration
  • Multiple RAG tools
  • Vector databases integration
  • Multimodal RAG optimization

Modular (Mojo 🔥) ▷ #general (2 messages):

  • Mojo MAX desktop backgrounds
  • Emoji Voting

Modular (Mojo 🔥) ▷ #announcements (1 message):

  • Verification Requirements
  • Posting Restrictions

Modular (Mojo 🔥) ▷ #mojo (58 messages🔥🔥):

  • Error handling in Mojo
  • Improvements to Variant type
  • Sum types in programming languages
  • Mojo documentation needs
  • Pattern matching and exhaustiveness checking

Link mentioned: rfcs/text/0000-partial_types.md at partial_types3 · VitWW/rfcs: RFCs for changes to Rust. Contribute to VitWW/rfcs development by creating an account on GitHub.


Latent Space ▷ #ai-general-chat (20 messages🔥):

  • FTC Crackdown on AI Tool Claims
  • Concerns About Generative AI Sustainability
  • Geohot's Frustration with AMD
  • ColPali Model with Qwen2-VL
  • Effectiveness of AI in Software Development


Latent Space ▷ #ai-in-action-club (38 messages🔥):

  • AI Engineering Interviews
  • Screen Sharing Issues
  • Using Local Models
  • Braintrust JSON Mode
  • Cot Experimentation

LlamaIndex ▷ #blog (4 messages):

  • Paragon integration
  • Langfuse and PostHog tutorial
  • LlamaIndex with Box
  • FinanceAgentToolSpec
  • RAG and LlamaIndex

LlamaIndex ▷ #general (28 messages🔥):

  • NLTK Resource Issue
  • Loading Fine-tuned Models on GPU
  • Best Open Source Vector Database
  • Self-hosted Observability Tools
  • Vector Search Optimization Strategy

Link mentioned: Observability - LlamaIndex: no description found


Interconnects (Nathan Lambert) ▷ #news (11 messages🔥):

  • OpenAI's rushed GPT-4o release
  • Safety staff challenges
  • Employee compensation demands
  • Leadership turnover
  • Talent recruitment efforts

Link mentioned: Tweet from Garrison Lovely (@GarrisonLovely): This article is full of bombshells. Excellent reporting by @dseetharaman. The biggest one: OpenAI rushed testing of GPT-4o (already reported), released the model and then subsequently determined the...


Interconnects (Nathan Lambert) ▷ #ml-drama (15 messages🔥):

  • OpenAI leadership changes
  • Public statements of employees
  • AI culture differences
  • Emotional responses in tech
  • Gamer culture terms


Interconnects (Nathan Lambert) ▷ #random (4 messages):

  • Substack Best Seller
  • Apple App Store Management

Interconnects (Nathan Lambert) ▷ #memes (1 message):

420gunna: https://x.com/venturetwins/status/1839685317462458650 Instalocking this


OpenAccess AI Collective (axolotl) ▷ #general (7 messages):

  • Multimodal support
  • Area Chair roles
  • Conversation splitting in training

OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (19 messages🔥):

  • Multi-modal VLM Assistance
  • YAML Configuration Issues
  • Flex Attention Discussion
  • LoRA+ Optimization Update
  • Default Learning Rates in LoRA+


OpenAccess AI Collective (axolotl) ▷ #runpod-help (1 message):

invisietch: Fp8 runs on 2x 80GB A100 for me, should be fine on 2x H100 also


tinygrad (George Hotz) ▷ #general (21 messages🔥):

  • Nvidia P2P Support and IOMMU
  • Pricing and Competition in GPU Cloud Services
  • CLOUD=1 Service Details
  • Data Upload Challenges for Training
  • Persistent Storage Billing

tinygrad (George Hotz) ▷ #learn-tinygrad (4 messages):

  • Pull Request #6779
  • Device Loading Issues
  • PR Comparison

Link mentioned: get_available_backends for device by i-jared · Pull Request #6779 · tinygrad/tinygrad: Try loading each backend. Return any that load successfully. It's not 1 line like George wanted #6689, but it works and follows conventions existing in the codebase.
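
The approach described in that pull request summary (try loading each backend, return any that load successfully) can be illustrated with a minimal sketch. This is not the code from the pull request and does not use tinygrad's API; the `loaders` mapping, the `_missing_driver` helper, and the backend names are hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Dict, List


def get_available_backends(loaders: Dict[str, Callable[[], object]]) -> List[str]:
    """Return the names of backends whose loader runs without raising."""
    available: List[str] = []
    for name, load in loaders.items():
        try:
            load()  # attempt to initialise the backend
        except Exception:
            continue  # backend is unavailable here, so skip it
        available.append(name)
    return available


def _missing_driver() -> object:
    # Hypothetical loader that simulates a backend whose driver is absent.
    raise ImportError("driver not installed")


# Hypothetical loaders standing in for real device backends.
example_loaders: Dict[str, Callable[[], object]] = {
    "CPU": lambda: object(),
    "GPU": _missing_driver,
}

print(get_available_backends(example_loaders))
```

Catching a broad `Exception` is a deliberate choice in this sketch, since a failed backend import can surface as several different error types; a real implementation might narrow it.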


LAION ▷ #general (15 messages🔥):

  • Llama 3.2 11B Vision
  • Voice Cloning
  • Family Photo Generation
  • Copyright Enforcement
  • Maintaining Independence


LAION ▷ #research (8 messages🔥):

  • Positional Information in CNNs
  • Positional Encoding in Transformers
  • Scaling Laws in Machine Learning
  • Fourier Feature Extraction
  • Trends in Neural Network Architectures


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (16 messages🔥):

  • Lecture Coverage on Social Alignment
  • Course Enrollment Confirmation
  • Assignment Deadlines and Clarifications
  • Quiz Availability
  • Lab Assignment Release Timing

Link mentioned: Large Language Model Agents: no description found


OpenInterpreter ▷ #general (6 messages):

  • OpenInterpreter application
  • Multimodal support in LLaMA
  • Frontend development for OI
  • On-chain analytics demonstration


OpenInterpreter ▷ #O1 (7 messages):

  • Decoding Packet Error
  • Server Connection Issues
  • Request for Setup Information

OpenInterpreter ▷ #ai-content (3 messages):

  • HF 90b vision update
  • Impact of OpenInterpreter

Link mentioned: Tweet from Mike Bird (@MikeBirdTech): One year ago today, I made a little demo of this cool new tool I found online. Just wanted to show off what it could do and then it went a little viral Since then @OpenInterpreter has completely chan...


LangChain AI ▷ #general (2 messages):

  • Vector search optimization
  • Contextual extraction from Excel

LangChain AI ▷ #share-your-work (2 messages):

  • CF Booking Chatbot
  • Unize Storage AI System
  • Knowledge Graph Generation
  • Google Calendar Integration
  • LangChain Performance Comparison


Torchtune ▷ #general (3 messages):

  • PackedDataset constraint
  • max_seq_len handling
  • RuntimeError in dataset processing
  • Error discussion on GitHub

Link mentioned: torchtune/torchtune/datasets/_packed.py at main · pytorch/torchtune: A Native-PyTorch Library for LLM Fine-tuning. Contribute to pytorch/torchtune development by creating an account on GitHub.


Gorilla LLM (Berkeley Function Calling) ▷ #discussion (1 message):

  • Function Calling Evaluation
  • Customization of Evaluation Dataset
  • Integration with LLMs
  • Berkeley Function-Calling Leaderboard
  • Error Breakdown Analysis

Link mentioned: Berkeley Function Calling Leaderboard: no description found


AI21 Labs (Jamba) ▷ #general-chat (1 message):

azaw: How do we use the openAI sdk for jamba ? is it possible ?

{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AINews, please share with a friend! Thanks in advance!

{% endif %}