Frozen AI News archive

Perplexity starts Shopping for you

**Stripe** launched their Agent SDK, enabling AI-native shopping experiences like **Perplexity Shopping** for US Pro members, featuring one-click checkout and free shipping via the **Perplexity Merchant Program**. **Mistral AI** released the **Pixtral Large 124B** multi-modal image model, now on **Hugging Face** and supported by **Le Chat** for image generation. **Cerebras Systems** offers a public inference endpoint for **Llama 3.1 405B** with a 128k context window and high throughput. **Claude 3.6** shows improvements over **Claude 3.5** but with subtle hallucinations. The **Bi-Mamba** 1-bit architecture improves LLM efficiency. The **wandb SDK** is preinstalled on Google Colab, and **Pixtral Large** is integrated into **AnyChat** and supported by **vLLM** for efficient model usage.

AI News for 11/18/2024-11/19/2024. We checked 7 subreddits, 433 Twitters and 30 Discords (217 channels, and 1912 messages) for you. Estimated reading time saved (at 200wpm): 253 minutes. You can now tag @smol_ai for AINews discussions!
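As a rough illustration of what the reading-time figure implies, here is a minimal sketch of the arithmetic, assuming the estimate is simply total word count divided by reading speed. The function names and the implied word count are illustrative assumptions; the newsletter does not state how the number is computed.

```python
import math


def reading_minutes(word_count: int, wpm: int = 200) -> int:
    """Minutes needed to read `word_count` words at `wpm` words per minute, rounded up."""
    return math.ceil(word_count / wpm)


def implied_words(minutes: int, wpm: int = 200) -> int:
    """Approximate number of words that take `minutes` to read at `wpm`."""
    return minutes * wpm


# 253 minutes at 200 wpm corresponds to roughly 50,600 words of source text.
print(implied_words(253))       # 50600
print(reading_minutes(50_600))  # 253
```

Under that assumption, the 253 minutes quoted above would correspond to about 50,600 words across the scanned sources.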

Just 2 days after Stripe launched their Agent SDK (our coverage here), Perplexity is now launching their in-app shopping experience for US-based Pro members. This is the first at-scale AI-native shopping experience, closer to Google Shopping (done well) than Amazon. The examples show the kind of natural-language queries that would be difficult in a traditional ecommerce UI:

[Images: two example shopping queries]

The new "Buy With Pro" program offers one-click checkout with "select merchants" (! more on this later) and free shipping.

Snap to Shop is also a great visual ecommerce idea... but we have yet to hear how accurate it really is from people who don't work at Perplexity.

The Buy With Pro program is almost certainly tied to the new Perplexity Merchant Program, which is a standard free data-for-recommendations value exchange.

Both Patrick Collison and Jeff Weinstein were quick to note Stripe's involvement, though both stopped short of directly saying that Perplexity Shopping uses the exact agent SDK that Stripe just shipped.



AI Twitter Recap

All recaps done by Claude 3.5 Sonnet, best of 4 runs.

AI Model Releases and Performance

AI Tools, SDKs, and Platforms

AI Research and Benchmarks

AI Company Partnerships and Announcements

AI Events and Workshops

Memes/Humor

AI Applications and Use Cases

AI Community and General Discussions


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Mistral Large 2411: Anticipation and Release Details

Theme 2. Llama 3.1 405B Inference: Breakthrough with Cerebras

Theme 3. AMD GPUs on Raspberry Pi: Llama.cpp Integration

Theme 4. txtai 8.0: Streamlined Agent Framework Launched

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT

Theme 1. Flux vs SD3.5: Community Prefers Flux Despite Technical Tradeoffs

Theme 2. O2 Robotics Breakthrough: 400% Speed Increase at BMW Factory

Theme 3. Claude vs ChatGPT: Enterprise User Experience Discussion

Theme 4. CogVideo Wrapper Updated: Major Refactoring and 1.5 Support


AI Discord Recap

A summary of Summaries of Summaries by O1-preview

Theme 1: Cutting-Edge AI Models Claim Superiority


Theme 2: AI Models Grapple with Limitations and Bugs


Theme 3: Innovative Research Lights Up the AI Horizon


Theme 4: AI Tools Evolve and Optimize Workflows


Theme 5: Community Buzzes with Events and Big Moves



PART 1: High level Discord summaries

Eleuther Discord


OpenAI Discord


Unsloth AI (Daniel Han) Discord


HuggingFace Discord


Stability.ai (Stable Diffusion) Discord


aider (Paul Gauthier) Discord


LM Studio Discord


OpenRouter (Alex Atallah) Discord


Notebook LM Discord


Nous Research AI Discord


GPU MODE Discord


Interconnects (Nathan Lambert) Discord


Latent Space Discord


LlamaIndex Discord


tinygrad (George Hotz) Discord


Cohere Discord


OpenInterpreter Discord


DSPy Discord


OpenAccess AI Collective (axolotl) Discord


Modular (Mojo 🔥) Discord


LLM Agents (Berkeley MOOC) Discord


Torchtune Discord


LAION Discord


Mozilla AI Discord


The Alignment Lab AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The LLM Finetuning (Hamel + Dan) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Eleuther ▷ #general (73 messages🔥🔥):

  • Cerebras technology speculation
  • Pre-NeurIPS meetup
  • Hyperparameter tuning tools
  • NovelAI service for uncensored LLMs
  • Llama-2 70B performance


Eleuther ▷ #research (46 messages🔥):

  • Muon optimizer comparisons
  • Neural Metamorphosis


Eleuther ▷ #scaling-laws (12 messages🔥):

  • Scaling laws in LLMs
  • LLM pretraining scalability
  • Financial considerations in scaling
  • Capabilities prediction in AI
  • Research on observational scaling laws

Link mentioned: Scaling realities: Both stories are true. Scaling still works. OpenAI et al. still have oversold their promises.


Eleuther ▷ #interpretability-general (1 message):

  • SAE feature steering
  • AI safety research
  • Phi-3 Mini model performance
  • Collaboration with Microsoft
  • Jailbreak robustness


Eleuther ▷ #lm-thunderdome (65 messages🔥🔥):

  • lm_eval config.json issue
  • pawsx and headqa errors
  • glue evaluation metrics
  • headqa performance update
  • RWKV model preparation

OpenAI ▷ #ai-discussions (87 messages🔥🔥):

  • Model Updates
  • AI Image Editing Tools
  • Scrolling Issues on Devices
  • Using Python for Infographics
  • LLM Application Evaluation

Link mentioned: Add-it


OpenAI ▷ #gpt-4-discussions (10 messages🔥):

  • Game Bots Behavior
  • Temperature Effect on AI
  • Tic Tac Toe GPT Strategy

OpenAI ▷ #prompt-engineering (44 messages🔥):

  • AI performance in Tic Tac Toe
  • Challenges with LLMs and state machines
  • User frustrations with AI strategy
  • Game logging and move tracking
  • Difficulty parameters in AI gameplay

OpenAI ▷ #api-discussions (44 messages🔥):

  • AI Tic-Tac-Toe Bot
  • Blocking Strategy Issues
  • Model Limitations
  • State Machine Representation
  • Difficulty Parameters

Unsloth AI (Daniel Han) ▷ #general (135 messages🔥🔥):

  • Qwen 2.5 model issues
  • Unsloth Training FAQs
  • Multiple turn conversation fine-tuning
  • Utilizing Azure for training
  • Reinforcement Learning from Human Feedback (RLHF)


Unsloth AI (Daniel Han) ▷ #help (41 messages🔥):

  • Using CSV for Chat Models
  • Aya Expanse Support
  • Model Finetuning Issues
  • Container Installation Problems
  • Unsloth Trainer Compatibility


Unsloth AI (Daniel Han) ▷ #research (1 message):

  • Synthetic Data in Language Models
  • AgentInstruct

HuggingFace ▷ #announcements (1 message):

  • Pipeline Abstraction for Vision Models
  • New Methods in Diffusers
  • Qwen 2.5 with Extended Context
  • Pixtral Large Support
  • CO2 Calculations on Open LLM Leaderboard


HuggingFace ▷ #general (139 messages🔥🔥):

  • Gradio API Quota Issues
  • Hub-Stats Dataset and Rankings
  • Synthetic Data Generation
  • NeurIPS Conference
  • Zero-Shot Classification with Hugging Face Hub


HuggingFace ▷ #today-im-learning (3 messages):

  • EMA Scaling
  • Neural Network in Rust

Link mentioned: Scaling EMA: Like 👍. Comment 💬. Subscribe 🟥. 🏘 Discord: https://discord.gg/pPAFwndTJd https://arxiv.org/pdf/2307.13813.pdf https://huggingface.co/papers/2307.13813#machi...


HuggingFace ▷ #cool-finds (5 messages):

  • Exact Unlearning in LLMs
  • HuggingFace Feature Update
  • RAG Fusion in Generative AI
  • Voice Data Augmentation for Whisper
  • Chat Collapsing Feature

Link mentioned: UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI: Exact unlearning was first introduced as a privacy mechanism that allowed a user to retract their data from machine learning models on request. Shortly after, inexact schemes were proposed to mitigate...


HuggingFace ▷ #i-made-this (12 messages🔥):

  • Augment Inference Engine
  • LLM Ranking Arena
  • Response Generator Challenges
  • Qwen2-VL on ONNX Runtime
  • PTA-1 GUI Element Localization


HuggingFace ▷ #reading-group (9 messages🔥):

  • GPU performance comparison
  • Water cooling issues
  • NVIDIA vs AMD for AI
  • Radeon Instinct MI50 and MI60

HuggingFace ▷ #core-announcements (1 message):

  • LoRA Model Support
  • New methods for LoRA

HuggingFace ▷ #computer-vision (1 message):

  • Object Detection for Videos
  • Oil and Gas Frac Site Analysis
  • Labeling Challenges in Object Detection

Stability.ai (Stable Diffusion) ▷ #general-chat (134 messages🔥🔥):

  • Mochi and CogVideo Performance
  • Model Recommendations for Beginners
  • Using different WebUIs for Stable Diffusion
  • GGUF Format vs. Large Model
  • AI-driven News Content Creation Software


aider (Paul Gauthier) ▷ #general (83 messages🔥🔥):

  • OpenAI o1 models update
  • Issues with qwen-2.5-coder
  • Kubernetes editing with Aider
  • Anthropic rate limits changes
  • Model performance comparison


aider (Paul Gauthier) ▷ #questions-and-tips (47 messages🔥):

  • Aider and OpenAI API limitations
  • Connecting Aider with local models
  • Benchmark test skips
  • Using extra_params in Aider
  • Aider with Bedrock models


aider (Paul Gauthier) ▷ #links (3 messages):

  • Pixtral Large Release
  • Deno Project Discussion
  • Mistral Large 2 Performance
  • Aider Benchmarking

Link mentioned: Pixtral Large: Pixtral grows up.


LM Studio ▷ #general (65 messages🔥🔥):

  • 7900XTX Graphics Performance
  • Roleplay Models for Llama 3.2
  • Remote Server Usage for LM Studio
  • Hosting Local LLMs
  • Model Updates in LM Studio


LM Studio ▷ #hardware-discussion (67 messages🔥🔥):

  • Windows vs Ubuntu Inference Speed
  • AMD GPU Performance Challenges
  • RTX 4090 Configuration Options
  • Benchmarking Results on AMD W7900

Link mentioned: Don't ask to ask, just ask


OpenRouter (Alex Atallah) ▷ #announcements (1 message):

  • Activity page outage

OpenRouter (Alex Atallah) ▷ #general (119 messages🔥🔥):

  • O1 Preview and Streaming Support
  • Gemini Model Issues
  • Mistral API Limitations
  • OpenRouter Error Reports
  • Developer Requests and Suggestions


OpenRouter (Alex Atallah) ▷ #beta-feedback (5 messages):

  • Custom Provider Keys
  • Beta Custom Provider Keys
  • Bring Your Own API Keys

Notebook LM Discord ▷ #use-cases (20 messages🔥):

  • Audio Track Separations
  • Video Creation Tools
  • NotebookLM for Document Organization
  • Teaching with NotebookLM
  • Podcast Experimentation with Code


Notebook LM Discord ▷ #general (101 messages🔥🔥):

  • NotebookLM UI Confusion
  • Data Source Limitations
  • Using NotebookLM for Studying
  • Mobile Access and App Updates
  • Podcast Generation Feature

Link mentioned: AI Note Taking & Transcribe & Summarizer | AI Notebook App: Generate transcripts and AI summarize for College Students in lectures. Specializing in YouTube Video Summarizer, PDF Summarizer, Article Summarizer. Save key insights and review with study guides, qu...


Nous Research AI ▷ #general (78 messages🔥🔥):

  • Recent AI/ML Research Papers
  • AI Model Feedback
  • Opportunities for High School Students in AI/ML
  • Security Concerns in Job Postings
  • Forge API Access Requests


Nous Research AI ▷ #research-papers (5 messages):

  • LLM2CLIP
  • Neural Metamorphosis
  • AgentInstruct

Link mentioned: Neural Metamorphosis: This paper introduces a new learning paradigm termed Neural Metamorphosis (NeuMeta), which aims to build self-morphable neural networks. Contrary to crafting separate models for different architecture...


Nous Research AI ▷ #research-papers (5 messages):

  • LLM2CLIP
  • Neural Metamorphosis
  • AgentInstruct
  • Cross-modal representation
  • Synthetic data generation

Link mentioned: Neural Metamorphosis: This paper introduces a new learning paradigm termed Neural Metamorphosis (NeuMeta), which aims to build self-morphable neural networks. Contrary to crafting separate models for different architecture...


Nous Research AI ▷ #reasoning-tasks (2 messages):

  • LLaVA-o1
  • Vision-Language Models
  • Reasoning capabilities
  • Visual question-answering
  • Inference-time scaling

Link mentioned: LLaVA-o1: Let Vision Language Models Reason Step-by-Step: Large language models have demonstrated substantial advancements in reasoning capabilities, particularly through inference-time scaling, as illustrated by models such as OpenAI's o1. However, curr...


GPU MODE ▷ #general (2 messages):

  • Colexicographical Order
  • Cutlass/Cute API Behavior

Link mentioned: cutlass/media/docs/cute/01_layout.md at main · NVIDIA/cutlass: CUDA Templates for Linear Algebra Subroutines. Contribute to NVIDIA/cutlass development by creating an account on GitHub.


GPU MODE ▷ #triton (5 messages):

  • Triton spelling
  • Triton CPU backend
  • GitHub Pull Request

Link mentioned: Add Triton CPU as an Inductor backend by int3 · Pull Request #133408 · pytorch/pytorch: Stack from ghstack (oldest at bottom): -> Add Triton CPU as an Inductor backend #133408 The goal is to use Inductor-generated kernels to stress test the new Triton CPU backend. cc @XilunWu @H...


GPU MODE ▷ #torch (17 messages🔥):

  • DCP Saving Mechanics
  • FSDP Memory Allocations
  • State Dict Analysis
  • Transformer Block Auto-Wrap Policy
  • Future of FSDP Improvements


GPU MODE ▷ #beginner (8 messages🔥):

  • Rewriting Aten kernels
  • Kernel fusion benefits
  • Torch.compile limitations

GPU MODE ▷ #youtube-recordings (1 message):

  • FP8 and FP32 MMA alignment
  • Warp shuffle performance issues
  • Static layout permutation

GPU MODE ▷ #off-topic (8 messages🔥):

  • Travel Tips
  • Flight Searching Tools
  • OpenAI o1 Technical Discussion
  • YouTube Content
  • Discount Travel Strategies

Link mentioned: Speculations on Test-Time Scaling (o1): Tutorial on the technical background behind OpenAI o1. Talk written with Daniel Ritter. Slides: https://github.com/srush/awesome-o1 Talk: The “large” in LLM is...


GPU MODE ▷ #liger-kernel (8 messages🔥):

  • Strange Model Outputs
  • Liger Kernel Distillation Loss
  • Kaggle Collaborations

Link mentioned: [RFC] Liger FlexChunkLoss: Alignment and Distillation loss · Issue #371 · linkedin/Liger-Kernel: 🚀 The feature, motivation and pitch We want to support various alignment and distillation loss functions. Refer this PR on ORPO: #362 Progress Alignment ORPO #362 CPO #382 DPO #378 SimPO #386 IRPO .....


GPU MODE ▷ #self-promotion (1 message):

  • NVIDIA Virtual Connect with Experts
  • CUDA Core Compute Libraries


GPU MODE ▷ #🍿 (16 messages🔥):

  • Scheduler Development
  • Modal Integration
  • Remote Authentication

Link mentioned: modal branch by msaroufim · Pull Request #25 · gpu-mode/discord-cluster-manager: Still has an annoying bug where modal is trying to run the bot code itself instead of the train.py Logging statements though show that the filename and contents are correct And I know a toy example...


GPU MODE ▷ #thunderkittens (2 messages):

  • Register Allocation
  • Spill Prevention Strategies
  • Nsight Compute Profiling

Interconnects (Nathan Lambert) ▷ #news (7 messages):

  • Runner H Navigation Skills
  • Pixtral Paper Discussion
  • Runner H Performance Evaluation
  • Runner H Beta Release
  • Comparisons with Qwen


Interconnects (Nathan Lambert) ▷ #random (52 messages🔥):

  • Tulu Discussions
  • Google Employee Perspectives
  • Grok Updates
  • Threelu Naming Idea
  • Time Zone Challenges


Interconnects (Nathan Lambert) ▷ #reads (3 messages):

  • Tree Search Gains
  • Tree of Thoughts
  • Q* Algorithm

Link mentioned: Technical Report: Enhancing LLM Reasoning with Reward-guided Tree Search: Recently, test-time scaling has garnered significant attention from the research community, largely due to the substantial advancements of the o1 model released by OpenAI. By allocating more computati...


Latent Space ▷ #ai-general-chat (44 messages🔥):

  • Cerebras Inference Performance
  • OpenAI Voice Features
  • Roboflow Series B Funding
  • Small Language Models


LlamaIndex ▷ #blog (2 messages):

  • Local Agentic RAG Application
  • LlamaIndex Workflows
  • Llama-Deploy
  • LlamaIndex Azure Integration
  • Microsoft Ignite

LlamaIndex ▷ #general (34 messages🔥):

  • Document Processing in S3
  • RAG App Functionality
  • SQLAutoVectorQueryEngine Citations
  • Chat History in RAG
  • Iterating on Prompts


LlamaIndex ▷ #ai-discussion (5 messages):

  • RAG systems
  • Agent invocation strategies
  • Preventing spam in channels

tinygrad (George Hotz) ▷ #general (25 messages🔥):

  • tinygrad 0.10.0 Release
  • Testing Failures on Architectures
  • Meeting Access and Notes
  • Kernel Cache in Action Test
  • Interpolation Test on ARM


tinygrad (George Hotz) ▷ #learn-tinygrad (4 messages):

  • DEBUG output
  • Jitted functions behavior

Cohere ▷ #discussions (13 messages🔥):

  • Tokenized Training in AI Models
  • Name and Symbol Recognition
  • Language Model Training Challenges
  • APIs and Language Response Settings

Cohere ▷ #announcements (1 message):

  • Beta Program for Cohere Tool
  • Research and Writing Tool
  • User Feedback Importance

Link mentioned: Research Prototype - Early Beta Sign Up Form: Thank you for your interest in participating in the beta testing phase of our research prototype — a tool designed to help users tackle research and writing tasks such as: creating complex reports, do...


Cohere ▷ #questions (6 messages):

  • Rate limits in API
  • Language setting for models

OpenInterpreter ▷ #general (8 messages🔥):

  • Development Branch Status
  • Open Interpreter Skills Generation
  • UI Simplifications
  • Claude Model Issues

OpenInterpreter ▷ #ai-content (12 messages🔥):

  • 10 AI Tools Podcast
  • Tool Use Podcast
  • Server Etiquette
  • Community Engagement


DSPy ▷ #show-and-tell (1 message):

  • DSPy VLM tutorial
  • Attribute extraction from images

Link mentioned: Tweet from Karthik Kalyanaraman (@karthikkalyan90): 🧵DSPy recently added support for VLMs in beta. A quick thread on attributes extraction from images using DSPy. For this example, we will see how to extract useful attributes from screenshots of websi...


DSPy ▷ #general (17 messages🔥):

  • DSPy integration with non-Python
  • Cost reduction with DSPy
  • Challenges with long-context prompts
  • Testing DSPy code for React agents
  • DSPy assertions compatibility with MIRPOv2

OpenAccess AI Collective (axolotl) ▷ #general (14 messages🔥):

  • Mistral Large / Pixtral models
  • MI300X training
  • bitsandbytes integration
  • Web3 platform job openings


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (1 message):

faldore: <@257999024458563585> did you implement this one yet?

https://arxiv.org/abs/2410.05258


OpenAccess AI Collective (axolotl) ▷ #announcements (1 message):

  • Axolotl v0.5.2 release
  • Optimizer support
  • Upgraded dependencies
  • FSDP gradient accumulation fix

OpenAccess AI Collective (axolotl) ▷ #axolotl-phorm-bot (2 messages):

  • Phorm Bot deprecation
  • Repository URL issues

Modular (Mojo 🔥) ▷ #max (12 messages🔥):

  • Max Graphs and Knowledge Graphs
  • Using MAX for Graph Search
  • Graph RAG System
  • Mojo and Max Agent Implementation

Link mentioned: GitHub - microsoft/graphrag: A modular graph-based Retrieval-Augmented Generation (RAG) system: A modular graph-based Retrieval-Augmented Generation (RAG) system - microsoft/graphrag


LLM Agents (Berkeley MOOC) ▷ #hackathon-announcements (1 message):

  • Google AI Workshop
  • Gemini Integration
  • Hackathon Insights

Link mentioned: Workshop with Google AI: Building with Gemini for the LLM Agents MOOC Hackathon · Luma: Workshop with Google AI: Building with Gemini for the LLM Agents MOOC Hackathon About the Workshop Join us for an exclusive workshop at the LLM Agents MOOC…


LLM Agents (Berkeley MOOC) ▷ #mooc-announcements (1 message):

  • Lecture 10 Announcement
  • Percy Liang's Presentation
  • Open-Source Foundation Models
  • Course Logistics

Link mentioned: CS 194/294-196 (LLM Agents) - Lecture 10, Percy Liang


LLM Agents (Berkeley MOOC) ▷ #mooc-lecture-discussion (7 messages):

  • Non-English models
  • State of the art performance
  • Low data point challenges

Torchtune ▷ #general (9 messages🔥):

  • Flex Attention Limitations
  • Attention Score Hacks
  • Vanilla Attention Strategies


LAION ▷ #announcements (1 message):

  • LAION-DISCO-12M
  • YouTube samples for ML

Link mentioned: Tweet from LAION (@laion_ai): We announce LAION-DISCO-12M - a collection of 12 million links to publicly available YouTube samples paired with metadata to support basic machine learning research in foundation models for generic au...


Mozilla AI ▷ #announcements (1 message):

  • Transformer Lab Demo
  • Metadata Filtering
  • Refact AI
  • Autonomous AI Agents





{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}