Frozen AI News archive

Liquid Foundation Models: A New Transformers alternative + AINews Pod 2

**Liquid.ai** emerged from stealth with three subquadratic foundation models that demonstrate superior efficiency compared to state space models and Apple's on-device and server models, backed by a $37M seed round. **Meta AI** announced **Llama 3.2** with multimodal vision-enabled models and lightweight text-only variants for mobile. **Google DeepMind** introduced production-ready **Gemini-1.5-Pro-002** and **Gemini-1.5-Flash-002** models with improved pricing and rate limits, alongside **AlphaChip**, an AI-driven chip design system that uses reinforcement learning to produce superhuman layouts rapidly. **OpenAI** enhanced ChatGPT Plus and Teams with Advanced Voice Mode, featuring Custom Instructions, Memory, and new nature-inspired voices. California's Governor vetoed the SB-1047 AI regulation bill, a veto celebrated by AI community figures like **ylecun** and **svpino** as a win for open-source AI. Google upgraded **NotebookLM** with audio overviews that support YouTube and audio files, turning documents into AI-generated podcasts. *"Open source in AI is thriving,"* noted **ylecun**, highlighting 1 million models on GitHub and HuggingFace.

Canonical issue URL

AI News for 9/27/2024-9/30/2024. We checked 7 subreddits, 433 Twitters and 31 Discords (225 channels, and 5435 messages) for you. Estimated reading time saved (at 200wpm): 604 minutes. You can now tag @smol_ai for AINews discussions!

It's not every day that a credible new foundation model lab launches, so the prize for today rightfully goes to Liquid.ai, who, 10 months after their $37M seed, finally "came out of stealth" by announcing three subquadratic models that perform remarkably well for their weight class:

[image: LFM model performance overview]

We know precious little about "liquid networks" compared to state space models, but they have the obligatory subquadratic chart to show that they beat SSMs there:

[image: subquadratic scaling chart comparing LFMs to SSMs]

with very credible benchmark scores:

[image: LFM benchmark scores]

Notably, they appear to be more efficient per parameter than both Apple's on-device and server foundation models (our coverage here).

They aren't open source yet, but they have a playground and API, with more promised in the run-up to their Oct 23rd launch.


AINews Pod

We first previewed our Illuminate-inspired podcast earlier this month. With NotebookLM's Deep Dive going viral, we're building an open-source audio version of AINews as a new experiment. See our latest comparison between NotebookLM and our pod here! Let us know @smol_ai if you have feedback or want the open-source repo.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

AI Model Updates and Developments

Open Source and Regulation

AI Research and Development

Industry Trends and Collaborations


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Emu3: Next-token prediction breakthrough for multimodal AI

Theme 2. Replete-LLM releases fine-tuned Qwen-2.5 models with performance gains

Other AI Subreddit Recap

/r/machinelearning, /r/openai, /r/stablediffusion, /r/ArtificialInteligence, /r/LLMDevs, /r/Singularity

AI Model Capabilities and Developments

AI Policy and Regulation

AI Ethics and Societal Impact

Memes and Humor


AI Discord Recap

A summary of Summaries of Summaries by o1-preview

Theme 1. AI Models Make Waves with New Releases and Upgrades

Theme 2. AI Regulations and Legal Battles Heat Up

Theme 3. Community Grapples with AI Tool Challenges

Theme 4. Hardware Woes Plague AI Enthusiasts

Theme 5. AI Expands into Creative and Health Domains


PART 1: High level Discord summaries

Unsloth AI (Daniel Han) Discord


aider (Paul Gauthier) Discord


HuggingFace Discord


LM Studio Discord


GPU MODE Discord


Modular (Mojo 🔥) Discord


Nous Research AI Discord


Perplexity AI Discord


OpenRouter (Alex Atallah) Discord


Stability.ai (Stable Diffusion) Discord


OpenAI Discord


Eleuther Discord


Torchtune Discord


Latent Space Discord


LlamaIndex Discord


Cohere Discord


Interconnects (Nathan Lambert) Discord


LLM Agents (Berkeley MOOC) Discord


tinygrad (George Hotz) Discord


OpenAccess AI Collective (axolotl) Discord


DSPy Discord


OpenInterpreter Discord


LAION Discord


LangChain AI Discord


MLOps @Chipro Discord


DiscoResearch Discord


Mozilla AI Discord


Gorilla LLM (Berkeley Function Calling) Discord


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Unsloth AI (Daniel Han) ▷ #general (920 messages🔥🔥🔥):

  • LinkedIn and Open Source Issues
  • Fine-tuning Llama Models
  • Model Loading Issues
  • Using Unsloth with BitsAndBytes
  • Google Colab Usage

Links mentioned:


  • Fine-tuning | How-to guides: Full parameter fine-tuning is a method that fine-tunes all the parameters of all the layers of the pre-trained model.
  • EASIEST Way to Fine-Tune LLAMA-3.2 and Run it in Ollama: Meta recently released Llama 3.2, and this video demonstrates how to fine-tune the 3 billion parameter instruct model using Unsloth and run it locally with O...
  • unsloth (Unsloth AI): no description found
  • bitsandbytes foundation: bitsandbytes foundation has 2 repositories available. Follow their code on GitHub.
  • llama-recipes/recipes/multilingual/README.md at 0efb8bd31e4359ba9e8f52e8d003d35ff038e081 · meta-llama/llama-recipes: Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&...
  • Home: Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory - unslothai/unsloth
  • [FIXED] RuntimeError: Unsloth: Your repo has a LoRA adapter and a base model. · Issue #1061 · unslothai/unsloth: I've trained the unsloth/Llama-3.2-3B-Instruct-bnb-4bit model successfully, but when I try to use it with FastLanguageModel.from_pretrained, I get this error: Traceback (most recent call last): Fil...
  • trl/examples/scripts/sft_vlm.py at main · huggingface/trl: Train transformer language models with reinforcement learning. - huggingface/trl
  • GitHub - PygmalionAI/aphrodite-engine: Large-scale LLM inference engine. Contribute to PygmalionAI/aphrodite-engine development by creating an account on GitHub.
  • v0.44.0: New AdEMAMix optimizer, Embeddings quantization, and more! · bitsandbytes-foundation/bitsandbytes · Discussion #1375: The AdEMAMix optimizer is a modification to AdamW which proposes tracking two EMAs to better leverage past gradients. This allows for faster convergence with less training d...
  • config.json file not found, fine tuning llama3 with unsloth, after saving the file to hugging face · Issue #421 · unslothai/unsloth: i use unsloth to fine tune llama 3-8B..., after training complete i save this model to hugging face by using 'push_to_hub', but it shows these files: .gitattributes README.md adapter_config.js...
  • [TEMP FIX] Ollama / llama.cpp: cannot find tokenizer merges in model file [duplicate] · Issue #1062 · unslothai/unsloth: Hi, i tried finetuning both llama 3.1-8b-instruct and llama 3-8b-instruct following the notebook you provided here. The training phase completed without errors and i generated the gguf quantized at...
  • GitHub - ggerganov/llama.cpp: LLM inference in C/C++. Contribute to ggerganov/llama.cpp development by creating an account on GitHub.
  • unsloth/KTO_+Phi_3_Mini_4K_Instruct+_Unsloth.ipynb at main · asmith26/unsloth: Finetune Llama 3, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory - asmith26/unsloth
  • GitHub - unslothai/unsloth: Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory - unslothai/unsloth
  • Compute metrics for generation tasks in SFTTrainer · Issue #862 · huggingface/trl: Hi, I want to include a custom generation based compute_metrics e.g., BLEU, to the SFTTrainer. However, I have difficulties because: The input, eval_preds, into compute_metrics contains a .predicti...
  • [TEMP FIX] Ollama / llama.cpp: cannot find tokenizer merges in model file · Issue #1065 · unslothai/unsloth: Thank you for developing this useful resource. The Ollama notebook reports {"error":"llama runner process has terminated: error loading model vocabulary: cannot find tokenizer merges in ...


Unsloth AI (Daniel Han) ▷ #off-topic (17 messages🔥):

  • Compute utilization
  • Software acceleration methods
  • Underutilized hardware performance

Unsloth AI (Daniel Han) ▷ #help (303 messages🔥🔥):

  • Model Fine-Tuning Issues
  • GGUF Conversion Problems
  • Tokenizer and EOS Token Issues
  • Checkpoint Management in Training
  • Using Unsloth with Llama Models

Links mentioned:


Unsloth AI (Daniel Han) ▷ #research (9 messages🔥):

  • Referee roles in LLM and finance
  • Liquid Foundation Models

Link mentioned: Tweet from Liquid AI (@LiquidAI_): Today we introduce Liquid Foundation Models (LFMs) to the world with the first series of our Language LFMs: A 1B, 3B, and a 40B model.


aider (Paul Gauthier) ▷ #announcements (1 messages):

  • Aider v0.58.0 Features
  • Architect/Editor Model Pairing
  • New Model Support
  • Session Enhancements
  • Clipboard Command Updates

aider (Paul Gauthier) ▷ #general (436 messages🔥🔥🔥):

  • Aider's Architect and Editor Models
  • Use of Multiple LLMs
  • DeepSeek Integration
  • Aider User Workflow
  • Prompt Configuration in Aider

Links mentioned:


aider (Paul Gauthier) ▷ #questions-and-tips (192 messages🔥🔥):

  • Aider Configuration
  • Architect Mode vs Code Mode
  • Cost Efficiency of Models
  • Using Multiple Git Worktrees
  • Prompt Caching and Token Management

Links mentioned:


aider (Paul Gauthier) ▷ #links (16 messages🔥):

  • NotebookLM audio feature
  • Aider updates
  • AI podcast summarization
  • Content creation automation
  • Hiring decision

Links mentioned:


HuggingFace ▷ #general (464 messages🔥🔥🔥):

  • AI Model Merging
  • Text Similarity in AI
  • Stable Diffusion Performance
  • Video Model Development
  • Hugging Face Community Projects

Links mentioned:


HuggingFace ▷ #today-im-learning (14 messages🔥):

  • Experiments with CUDA
  • Gradio frustrations
  • Model Policy Loss
  • Interface Design Issues

HuggingFace ▷ #cool-finds (9 messages🔥):

  • Medical AI Paper Highlights
  • HuggingFace Model Popularity Metrics
  • Projection Mapping Technology
  • Experiences with Phi Models
  • Video Mapping Techniques

Links mentioned:


HuggingFace ▷ #i-made-this (29 messages🔥):

  • Flux-Schnell Demo
  • Qwen 2.5 Fine-tuning
  • Instrumentum AI Summarizer
  • Deepseek-Chat CoT Mode
  • MusicGen Continuations App

Links mentioned:


HuggingFace ▷ #computer-vision (5 messages):

  • OmDet-Turbo model
  • Keypoint Detection Task
  • SuperPoint Model
  • Fine-tuning TroCR Models
  • Upcoming Models for Keypoint Detection

Link mentioned: SuperPoint: no description found


HuggingFace ▷ #NLP (2 messages):

  • Hallucination Detection Model
  • Fine-tuning BERT on Yelp Dataset

HuggingFace ▷ #diffusion-discussions (6 messages):

  • GitHub API usage
  • Stack Overflow for Developers
  • Increased Context LLaMA Model Conversion
  • llama.cpp compatibility

Links mentioned:


LM Studio ▷ #general (363 messages🔥🔥):

  • Issue with downloading models in LM Studio
  • Using vision-enabled models in LM Studio
  • Feature requests for LM Studio
  • Concerns about model performance and claims
  • Discussion about query queueing and caching

Links mentioned:

  • ...mistralai/Pixtral-12B-2409 · Hugging Face: no description found
  • GitHub - meta-llama/llama-models: Utilities intended for use with Llama models. Contribute to meta-llama/llama-models development by creating an account on GitHub.


LM Studio ▷ #hardware-discussion (138 messages🔥🔥):

  • NVIDIA Jetson AGX Thor
  • 3090 vs 3090 Ti vs P40 comparisons
  • Market pricing for GPUs
  • AI model hosting and renting
  • Issues with NVIDIA drivers on Linux

Links mentioned:


GPU MODE ▷ #general (30 messages🔥):

  • Cerebras chip optimization
  • Server spam management
  • Triton talk slides
  • Performance metrics for GPUs
  • Robotics development challenges

Links mentioned:


GPU MODE ▷ #triton (12 messages🔥):

  • Triton library functions
  • Block pointers and tmas
  • Triton deep dive lecture
  • Metal MLIR dialect
  • Device compilation in Triton

Links mentioned:


GPU MODE ▷ #torch (35 messages🔥):

  • Batch update option for torchscript hashtable
  • Issues with torch.int_mm() on CPU
  • Debugging AO model replacements
  • Image-loading alternatives to FFCV
  • ZeRO-3 benefits for single GPU inference

Links mentioned:


GPU MODE ▷ #announcements (1 messages):

  • Triton Internals
  • Lecture Schedule
  • Quantized Training
  • Metal Kernels
  • GPU Optimization

GPU MODE ▷ #cool-links (4 messages):

  • AI Discord Servers
  • CuTe/Cutlass Layout Algebra
  • Next-Token Prediction

Links mentioned:


GPU MODE ▷ #beginner (11 messages🔥):

  • Difference between Model Parallelism and ZeRO/FSDP
  • Understanding FSDP mechanics
  • Open source projects in NLP
  • Introduction to LLM research workflow
  • HuggingFace tools and libraries

Links mentioned:


GPU MODE ▷ #youtube-recordings (3 messages):

  • Lecture 29: Triton Internals
  • IRL Meetup Talks Upload

Link mentioned: Lecture 29: Triton Internals: Speaker: Kapil Sharma


GPU MODE ▷ #torchao (35 messages🔥):

  • CPUOffloadOptimizer
  • FP8 and INT8 Quantization
  • Model Profiling
  • Hugging Face Integration
  • SOAP Optim with Flux Tuning

Links mentioned:


GPU MODE ▷ #sequence-parallel (1 messages):

glaxus_: Has anyone seen this for long context inference? https://arxiv.org/pdf/2409.17264v1


GPU MODE ▷ #off-topic (83 messages🔥🔥):

  • GeForce RTX 5090
  • Power supply challenges
  • Apple Watch and LLMs
  • California AI safety bill
  • Cooling solutions for high-end GPUs

Links mentioned:


GPU MODE ▷ #irl-meetup (1 messages):

marcelo5444: Anyone in ECCV Milan?


GPU MODE ▷ #hqq-mobius (1 messages):

  • HQQ model serialization
  • Transformers library

Link mentioned: Hqq serialization by mobicham · Pull Request #33141 · huggingface/transformers: Follow-up to #32379 The goal of this PR is to add full support to save/load HQQ-quantized models directly in transformers. So far, serialization was done on the hqq-lib side via the .pt format whic...


GPU MODE ▷ #llmdotc (23 messages🔥):

  • repkv_backward_kernel2 improvements
  • FP8 implementation strategies
  • Llama3 issues
  • Pre-swizzled layout for FP8
  • Custom matmul kernel developments

Links mentioned:


GPU MODE ▷ #rocm (207 messages🔥🔥):

  • MI300X Access for Community
  • Performance Issues with AMD GPUs
  • Tuning MIOpen Kernels
  • AMD-Llama Model Training
  • Using Triton for Flash Attention

Links mentioned:


GPU MODE ▷ #bitnet (1 messages):

  • Multi-GPU Usage
  • Llama-based Models

GPU MODE ▷ #sparsity-pruning (1 messages):

marksaroufim: https://github.com/pytorch/torchtune/pull/1698


GPU MODE ▷ #webgpu (12 messages🔥):

  • LiteRT vs gpu.cpp
  • WebNN comparison
  • Manual networking in gpu.cpp
  • Buffer Pass Read/Write
  • WebGPU Resources

GPU MODE ▷ #liger-kernel (6 messages):

  • Gemma2 Convergence Tests Failure
  • LLama3.2-Vision Patch Issues
  • Roadmap Tracker for 2024 Q4

Links mentioned:


GPU MODE ▷ #metal (16 messages🔥):

  • Metal Shading Language
  • M2 vs M3 device performance
  • Metal backend for Triton
  • Building on device agents
  • Resource sharing for Metal

Links mentioned:


GPU MODE ▷ #self-promotion (8 messages🔥):

  • Discord AutoMod
  • Spam management
  • Anti-spam tools

GPU MODE ▷ #nccl-in-triton (6 messages):

  • Collaboration on Triton Project
  • Challenges in Memory Management
  • Weak Memory Consistency Models
  • Learning Opportunities in Triton
  • Project Enthusiasm

Modular (Mojo 🔥) ▷ #general (18 messages🔥):

  • Modular Community Meeting
  • Desktop Background Preferences
  • YouTube Meeting Recordings

Links mentioned:


Modular (Mojo 🔥) ▷ #mojo (232 messages🔥🔥):

  • Mojo Language Features
  • Embedding Models in Mojo
  • Managing Native Dependencies
  • Mojopkg Enhancements
  • Warnings on MacOS

Links mentioned:


Nous Research AI ▷ #general (189 messages🔥🔥):

  • Nous Research
  • Distro Paper Timeline
  • AI Model Fine-tuning
  • Liquid Foundation Models
  • NLP Research Opportunities

Links mentioned:


Nous Research AI ▷ #ask-about-llms (16 messages🔥):

  • Hyperparameter Adjustment
  • Multimodal Input LLMs
  • Open-sourcing Models
  • RL Techniques in Inference
  • Inference on CPU

Nous Research AI ▷ #research-papers (4 messages):

  • Medical AI Research Papers
  • LLM Models in Healthcare
  • AI Ethics in Medicine

Links mentioned:


Nous Research AI ▷ #interesting-links (13 messages🔥):

  • DisTrO AI Project
  • AI Server Rankings
  • Quantum Computing in Data Generation
  • EleutherAI Community
  • VPTQ Quantization Algorithm

Links mentioned:


Nous Research AI ▷ #research-papers (4 messages):

  • Medical AI Paper of the Week
  • New Medical LLMs
  • Frameworks and Methodologies for Healthcare AI
  • Medical LLM Applications
  • AI in Healthcare Ethics

Links mentioned:


Nous Research AI ▷ #reasoning-tasks (4 messages):

  • AGI speculation
  • Funding AGI development

Perplexity AI ▷ #general (182 messages🔥🔥):

  • Perplexity performance issues
  • Felo vs Perplexity comparison
  • API inconsistencies
  • Document uploading vs pasting
  • LaTeX formula discussion

Links mentioned:


Perplexity AI ▷ #sharing (16 messages🔥):

  • Insights into the Multiverse
  • Israel-Hezbollah conflict escalation
  • New AI design tools
  • Texas county AI applications
  • First Schizophrenia Med in 30 Years

Link mentioned: YouTube: no description found


Perplexity AI ▷ #pplx-api (2 messages):

  • PPLX API Integration Issues
  • Real Estate Listings

OpenRouter (Alex Atallah) ▷ #general (193 messages🔥🔥):

  • OpenRouter Rate Limits
  • Model Performance Issues
  • Translation Model Recommendations
  • Frontend Chat GUI Options
  • Gemini and Search Functionality

Links mentioned:


Stability.ai (Stable Diffusion) ▷ #general-chat (178 messages🔥🔥):

  • Flux Model Insights
  • Stable Diffusion Setup and Performance
  • Image Generation Techniques
  • Community Art Contributions
  • AI Art vs Human Art Debate

Links mentioned:


OpenAI ▷ #ai-discussions (105 messages🔥🔥):

  • Aider's Code Editing Capabilities
  • Regulations in the EU AI Bill
  • Video Translation Announcements
  • Using AI for Writing Assistance
  • Huawei ChatGPT Accessibility

Link mentioned: Aider LLM Leaderboards: Quantitative benchmarks of LLM code editing skill.


OpenAI ▷ #gpt-4-discussions (28 messages🔥):

  • GPT-4.5-o Release
  • Advanced Voice Mode Limitations
  • Custom GPTs and Voice Mode
  • Payment Plans for Voice Features

OpenAI ▷ #prompt-engineering (5 messages):

  • Flutter Code Assistant Issues
  • Managing Assistant Runs
  • Prompt Management

OpenAI ▷ #api-discussions (5 messages):

  • Flutter Code Error
  • Thread Management
  • Prompt Management

Eleuther ▷ #general (90 messages🔥🔥):

  • Introduction of new members
  • ICLR and NeurIPS events coordination
  • Liquid AI's Foundation Models
  • Dengue fever in Singapore
  • Open source LLM training

Links mentioned:


Eleuther ▷ #research (45 messages🔥):

  • Process Reward Models
  • Value Functions in RL
  • Sparsity Masks in LLMs
  • Swarm LLM Architecture
  • Physics Simulation with Equivariant Representations

Links mentioned:


Eleuther ▷ #lm-thunderdome (2 messages):

  • lm-evaluation-harness library
  • vLLM model metrics

Eleuther ▷ #multimodal-general (1 messages):

  • ExecuTorch information
  • Multimodal models guidance
  • Hardware setup inquiries

Link mentioned: executorch: On-device AI across mobile, embedded and edge for PyTorch


Torchtune ▷ #general (95 messages🔥🔥):

  • Training Issues with Torchtune
  • Dynamic Recipe CLI for Torchtune
  • Efficiency of VRAM vs GPU Utilization
  • Setting up Error Handling in Distributed Training
  • Improving Config Management for CLI Arguments

Links mentioned:


Torchtune ▷ #dev (39 messages🔥):

  • Config Management Concerns
  • Performance Optimization Ideas
  • Documentation Improvements
  • Model Implementation Techniques
  • Memory Optimization Strategies

Links mentioned:


Latent Space ▷ #ai-general-chat (66 messages🔥🔥):

  • CodiumAI Series A Funding
  • Liquid Foundation Models Launch
  • AI Voice Interaction with Gradio
  • Ultralytics YOLO11 Release
  • OpenAI Pricing Comparisons

Links mentioned:


Latent Space ▷ #ai-announcements (6 messages):

  • New Podcast Episode
  • YouTube Engagement
  • AI Researchers on the Show

Links mentioned:


Latent Space ▷ #ai-in-action-club (42 messages🔥):

  • AI Engineering Interview
  • Screen Share Issues
  • Local Model Experiments
  • Braintrust Evaluation Platforms

LlamaIndex ▷ #blog (7 messages):

  • FinanceAgentToolSpec
  • Streaming events from workflows
  • Automated Financial Report Generation
  • Multi-Agent Slackbot with Confluence
  • LlamaParse Premium

LlamaIndex ▷ #general (105 messages🔥🔥):

  • Ollama concurrency
  • LlamaIndex project setup
  • RAG pipeline evaluation
  • Node metadata handling
  • Oracle retrieval in RAG Benchmark

Links mentioned:


LlamaIndex ▷ #ai-discussion (1 messages):

  • LLM Reasoning
  • Different Types of Reasoning

Cohere ▷ #discussions (21 messages🔥):

  • Channel posting guidelines
  • Humanoid Robots 2024 YouTube video
  • Innovations in UI/UX for LLMs
  • Robotics development challenges
  • Podcasting as a UI/UX interaction

Links mentioned:


Cohere ▷ #questions (36 messages🔥):

  • RAG formatting queries
  • Cohere startup program
  • API billing questions
  • Multimodal captioning
  • Input token number concerns

Links mentioned:


Cohere ▷ #api-discussions (23 messages🔥):

  • Fine-tuning Models
  • Chunking Data for Improved Output
  • System Message and API Migration Issues
  • Documentation Consistency
  • V1 to V2 Chat API Transition

Cohere ▷ #projects (2 messages):

  • Cultural Multilingual LMM Benchmark
  • Volunteer Translators
  • CVPR'2025 Paper Co-Authorship

Interconnects (Nathan Lambert) ▷ #news (36 messages🔥):

  • OpenAI staff turnover
  • AI regulations
  • Legal decisions on AI datasets
  • Investment discussions
  • Public reactions to AI bills

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-drama (16 messages🔥):

  • PearAI controversy
  • Yann LeCun on research standards
  • OpenAI's transparency debate
  • Peer review critique
  • Research blog impact

Links mentioned:


Interconnects (Nathan Lambert) ▷ #random (13 messages🔥):

  • iPhone IAP subscriptions
  • Apple App Store management
  • Twitter security issues
  • Meeting with John Schulman
  • Community engagement on Twitter

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (3 messages):

  • AI Memes
  • User Reactions

Interconnects (Nathan Lambert) ▷ #posts (1 messages):

SnailBot News: <@&1216534966205284433>


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (36 messages🔥):

  • Course Material Access
  • Multi-Agent Systems Discussion
  • NotebookLM Inquiry
  • Training Schedule Inquiry
  • Research Proposal Discussion

Links mentioned:


LLM Agents (Berkeley MOOC) ▷ #mooc-lecture-discussion (1 messages):

metakingkal: There is an example in Autogen site on how to build an Agent to play chess.


tinygrad (George Hotz) ▷ #general (27 messages🔥):

  • Cloud Storage Costs
  • Modal Pricing Structure
  • Tinygrad Matcher Optimization
  • Testing Strategies for Optimizers
  • Bounty Payment Methods

Link mentioned: Plan Pricing: Simple, transparent pricing that scales based on the amount of compute you use.


tinygrad (George Hotz) ▷ #learn-tinygrad (4 messages):

  • SOTA GPU for Bounties
  • Renting GPUs Online
  • TF32 Tensor Core Support
  • Learning Before Tackling Bounties
  • Small PR Contributions

OpenAccess AI Collective (axolotl) ▷ #general (14 messages🔥):

  • Llama 3.2 1b tuning
  • California AI training bill
  • Lightweight chat models
  • Liquid AI
  • Sample packing effects

OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (3 messages):

  • LoRA+ implementation
  • Learning Rate Default Values
  • PEFT's Implementation

OpenAccess AI Collective (axolotl) ▷ #axolotl-help-bot (12 messages🔥):

  • Axolotl dataset configuration
  • Selecting random dataset samples
  • Hugging Face datasets handling

Links mentioned:


DSPy ▷ #show-and-tell (2 messages):

  • Pydantic model generator
  • Groq integration
  • GitHub Actions
  • Typed Predictors
  • DSPyGen

Link mentioned: I am still King of Typed Output in DSPy: In this video, I demonstrate the creation of type predictors in Pydantic, showcasing the process and outcomes of generating structured text. I walk through the steps of creating a type predictor gener...


DSPy ▷ #general (17 messages🔥):

  • DSPy 2.5 & LM Client Upgrade
  • Miprov2 Status & Issues
  • Optimizing System Prompts in DSPy

DSPy ▷ #examples (8 messages🔥):

  • OpenSearchRetriever for DSPy
  • Healthcare Fraud Classification
  • Long Docstring Confusion
  • Using GPT-4o Mini and Claude Models

OpenInterpreter ▷ #general (5 messages):

  • Full-stack development
  • AI execution instructions
  • Open Interpreter functionalities

OpenInterpreter ▷ #O1 (9 messages🔥):

  • Error decoding packet
  • Connection issues with client
  • Ngrok error

Links mentioned:


OpenInterpreter ▷ #ai-content (2 messages):

  • Open Interpreter impact
  • Using Jan with Open Interpreter
  • Local LLMs interface

Links mentioned:


LAION ▷ #general (8 messages🔥):

  • French Audio Dataset for CosyVoice
  • LAION Copyright Challenge
  • Phenaki Video Generation Model
  • Visual Language Models and Latent Diffusion Models
  • PALM-RLHF Datasets and Task Implementation

Links mentioned:


LAION ▷ #research (7 messages):

  • Transformer Models
  • Positional Encodings
  • RoPE in Attention Layers
  • Convergence Time in Training

Link mentioned: Emu3: no description found


LangChain AI ▷ #general (6 messages):

  • Vectorstores interaction
  • Database usage for LLMs
  • Thank you gifts in Discord
  • Image errors in Gemini
  • Modifying inference method in LangChain

MLOps @Chipro ▷ #events (3 messages):

  • AI Realized Summit 2024
  • Manifold Research Frontiers Series
  • MLOps meetups in Stockholm

Links mentioned:


MLOps @Chipro ▷ #general-ml (1 messages):

zachmayer: Surya


DiscoResearch ▷ #general (3 messages):

  • Anti-slop Sampler
  • Dataset Creation

Link mentioned: GitHub - sam-paech/antislop-sampler: Contribute to sam-paech/antislop-sampler development by creating an account on GitHub.


Mozilla AI ▷ #announcements (1 messages):

  • SoraSNS
  • Takiyoshi Hoshida
  • Carnegie Mellon University
  • Apple's AR Kit

Gorilla LLM (Berkeley Function Calling) ▷ #leaderboard (1 messages):

  • Hammer handle update
  • Hammer2.0 series models
  • Pull Request submission



{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AINews, please share with a friend! Thanks in advance!

{% endif %}