Frozen AI News archive

nothing much happened today

**OpenAI's o1 model** faces skepticism about open-source replication due to its extreme restrictions and unique training advances like RL on CoT. **ChatGPT-4o** shows significant performance improvements across benchmarks. **Llama-3.1-405b** fp8 and bf16 versions perform similarly with cost benefits for fp8. A new open-source benchmark "Humanity's Last Exam" offers $500K in prizes to challenge LLMs. Model merging benefits from neural network sparsity and linear mode connectivity. Embedding-based toxic prompt detection achieves high accuracy with low compute. **InstantDrag** enables fast, optimization-free drag-based image editing. **LangChain v0.3** releases with improved dependency management. Automated code review tool **CodeRabbit** adapts to team coding styles. Visual search advances integrate multimodal data for better product search. Experts predict AI will be default software by 2030.

AI News for 9/16/2024-9/17/2024. We checked 7 subreddits, 433 Twitters and 30 Discords (221 channels, and 2197 messages) for you. Estimated reading time saved (at 200wpm): 225 minutes. You can now tag @smol_ai for AINews discussions!

Given the extreme restrictions, cost, and lack of transparency around o1, everyone has puts and takes on whether o1 can be replicated in open source or in the wild. As discussed in /r/LocalLlama, Manifold Markets currently has 63% odds on an open-source version.

It is simultaneously likely that:

For the last reason alone, the standard time-to-OSS-equivalent curves in model development may not apply in this instance.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

AI Model Updates and Advancements

AI Development and Research

AI Tools and Applications

Industry Trends and Observations


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Advancements in Model Compression and Quantization

Theme 2. Open-Source LLMs Closing the Gap with Proprietary Models

Theme 3. Developments in LLM Reasoning and Inference Techniques

Theme 4. Challenges in LLM Evaluation and Reliability

Other AI Subreddit Recap

r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, /r/LLMDevs, /r/Singularity

AI Model Advancements and Benchmarks

AI Ethics and Societal Impact

AI Development and Research

AI Tools and Applications


AI Discord Recap

A summary of Summaries of Summaries

O1-mini

Theme 1. AI Models: New Releases and Rivalries

Theme 2. Innovative Tools and Integrations

Theme 3. Training, Optimization, and Technical Hurdles

Theme 4. AI Safety and Ethical Concerns

Theme 5. Community Buzz and Funding Moves


O1-preview

Theme 1. New AI Models and Releases Ignite Tech Communities

Theme 2. AI Tools Get Superpowers: Integrations Galore

Theme 3. Tech Gremlins and Solutions: Overcoming AI Hurdles

Theme 4. AI Safety and Research Take Center Stage

Theme 5. AI Ventures into Business and Creativity


PART 1: High level Discord summaries

Perplexity AI Discord


Unsloth AI (Daniel Han) Discord


aider (Paul Gauthier) Discord


LM Studio Discord


HuggingFace Discord


OpenAI Discord


OpenRouter (Alex Atallah) Discord


CUDA MODE Discord


Nous Research AI Discord


Latent Space Discord


Eleuther Discord


Stability.ai (Stable Diffusion) Discord


LlamaIndex Discord


Interconnects (Nathan Lambert) Discord


LangChain AI Discord


tinygrad (George Hotz) Discord


Cohere Discord


DSPy Discord


LAION Discord


OpenInterpreter Discord


Torchtune Discord


Modular (Mojo 🔥) Discord


OpenAccess AI Collective (axolotl) Discord


MLOps @Chipro Discord


The Alignment Lab AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The LLM Finetuning (Hamel + Dan) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The DiscoResearch Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Perplexity AI ▷ #general (389 messages🔥🔥):

  • O1 Mini limits
  • Comparison of AI models
  • Perplexity features and functionalities
  • Integration with other services
  • Promise of Pro Search enhancements


Perplexity AI ▷ #sharing (14 messages🔥):

  • Minecraft Moderation Ban Issue
  • Microsoft's Strategy
  • Research Topic Discussions
  • Global AI Center Strength
  • Bitcoin's 66-bit Puzzle

Link mentioned: YouTube


Unsloth AI (Daniel Han) ▷ #general (280 messages🔥🔥):

  • Qwen 2.5 Model Release
  • Mistral-Small-Instruct-2409
  • Unsloth Installation Issues
  • Fine-tuning LLMs
  • Backus-Naur Form (BNF)


Unsloth AI (Daniel Han) ▷ #off-topic (7 messages):

  • Job Application Strategies
  • Importance of Application Knowledge
  • Coding Style Compliance
  • Fundamental Understanding in Code
  • Limitations of LeetCode

Unsloth AI (Daniel Han) ▷ #help (15 messages🔥):

  • Model fine-tuning
  • Mac compatibility issues
  • Gratitude in the community

Link mentioned: Kaggle Llama 3.1 8b Conversational Unsloth: Explore and run machine learning code with Kaggle Notebooks | Using data from No attached data sources


Unsloth AI (Daniel Han) ▷ #research (23 messages🔥):

  • KTO vs PPO
  • Domain Adaptation of Llama-3.1-8B
  • Continued Pre-Training vs Full Fine-Tuning
  • GPU Limitations
  • Unsloth Support

aider (Paul Gauthier) ▷ #general (238 messages🔥🔥):

  • O1 Models Performance
  • Aider and Sonnet Integration
  • RAG and Fine-tuning for Codebases
  • Use Cases for Different Models
  • Feedback on Flux Canvas Art


aider (Paul Gauthier) ▷ #questions-and-tips (45 messages🔥):

  • Aider configuration
  • Azure OpenAI integration
  • User story implementation
  • Streaming output metrics
  • OpenRouter outages


aider (Paul Gauthier) ▷ #links (8 messages🔥):

  • Superflex AI Assistant
  • Claude 3.5 Artifacts
  • RethinkMCTS Algorithm
  • Code Integration from Figma
  • Optillm for Inference Proxy


LM Studio ▷ #general (154 messages🔥🔥):

  • GPU and Performance Issues
  • Model Training Challenges
  • New Features in LM Studio
  • Community Tools and Extensions
  • User Experience and Settings


LM Studio ▷ #hardware-discussion (46 messages🔥):

  • GPU Recommendations for LM Studio
  • Dual GPU Setups
  • Used GPU Market Insights
  • Intel ARC Performance
  • VRAM Importance in LLMs

HuggingFace ▷ #announcements (1 message):

  • Hugging Face API docs
  • TRL v0.10 release
  • Sentence Transformers v3.1
  • DataCraft for synthetic datasets
  • Core ML Segment Anything 2


HuggingFace ▷ #general (135 messages🔥🔥):

  • Short-form video tools
  • Hugging Face Inference API updates
  • FSDP GPU usage
  • CogvideoX img2vid capabilities
  • Hugging Face SQL Console launch


HuggingFace ▷ #today-im-learning (5 messages):

  • Learning Manim
  • ML Data Pipelines with PyTorch
  • Hugging Face Dataset Issues

Link mentioned: jonasmaltebecker/synthetic_drilling_dataset · Datasets at Hugging Face


HuggingFace ▷ #cool-finds (11 messages🔥):

  • Inference API Documentation Improvements
  • Model Growth and Downloads Discussion
  • AI Community Engagement

Link mentioned: Tweet from Wauplin (@Wauplin): I'm thrilled to unveil our revamped Inference API docs! We've tackled your feedback head-on: clearer rate limits, dedicated PRO section, better code examples, and detailed parameter lists for ...


HuggingFace ▷ #i-made-this (5 messages):

  • Behavioral Biometric Recognition in Minecraft
  • PowershAI Multilingual Documentation
  • Nvidia Mini-4B Model Release
  • HuggingFace Agent Registration
  • Continuous MFA and Ban Evasion Detection


HuggingFace ▷ #NLP (5 messages):

  • Downloading LLaMA3
  • Using PyTorch
  • MLIR Conversion Tool

HuggingFace ▷ #gradio-announcements (1 message):

  • Gradio Office Hours

Link mentioned: Discord - Group Chat That’s All Fun & Games: Discord is great for playing games and chilling with friends, or even building a worldwide community. Customize your own space to talk, play, and hang out.


OpenAI ▷ #ai-discussions (99 messages🔥🔥):

  • GPT-4o Performance
  • Alpha Rollouts
  • AI Implementation in Businesses
  • Custom GPTs for Code Snippets
  • LLM Benchmarks

Link mentioned: OpenAI Status


OpenAI ▷ #gpt-4-discussions (26 messages🔥):

  • Fine-tuning Limitations
  • Advanced Voice Mode Availability
  • Custom GPT Sharing
  • Token Refresh Confusion

OpenAI ▷ #prompt-engineering (3 messages):

  • Auto prompt for ideogram/midjourney
  • Prompt sharing practice
  • Library resources

OpenAI ▷ #api-discussions (3 messages):

  • Auto prompt for ideogram/midjourney
  • Official libraries

OpenRouter (Alex Atallah) ▷ #app-showcase (1 message):

  • OpenRouter integration
  • Google Sheets Addon Features
  • Updates and Improvements
  • User Feedback
  • Support for Multiple Models

Link mentioned: GPT Unleashed for Sheets™ - Google Workspace Marketplace


OpenRouter (Alex Atallah) ▷ #general (117 messages🔥🔥):

  • OpenRouter API Issues
  • Gemini Image Generation
  • Prompt Caching Usage
  • Mistral API Price Drops
  • Model Performance and Ratings


CUDA MODE ▷ #general (14 messages🔥):

  • Metal discussion group
  • ZML project insights
  • Zig programming language
  • ATen in Zig
  • CUDA support in Zig


CUDA MODE ▷ #triton (9 messages🔥):

  • Triton Developer Conference
  • Proton Tutorial
  • Triton CPU/ARM Development
  • Shoutout at Keynote
  • CUDA Community


CUDA MODE ▷ #algorithms (2 messages):

  • Flash Attention v2 with learnable bias
  • BitBlas and Triton-like language

Link mentioned: BitBLAS/testing/python/tilelang/test_tilelang_dequantize_gemm.py at main · microsoft/BitBLAS: BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment. - microsoft/BitBLAS


CUDA MODE ▷ #cool-links (1 message):

  • SK Hynix AiMX-xPU
  • In-Memory Computing
  • LLM Inference
  • Power Efficiency

Link mentioned: SK Hynix AI-Specific Computing Memory Solution AiMX-xPU at Hot Chips 2024: SK Hynix showed off its AiMX-xPU concept at Hot Chips 2024 for more efficient LLM inference compute being done in-memory


CUDA MODE ▷ #beginner (2 messages):

  • Learning Custom CUDA Kernels
  • Neural Network Training

CUDA MODE ▷ #pmpp-book (3 messages):

  • Implementation using Metal or WebGPU
  • CUDA Alternatives
  • FAQs on GPU Programming
  • Metal Channel in Discord

CUDA MODE ▷ #off-topic (4 messages):

  • H100 purse
  • GH100 confusion
  • High pricing concerns

Link mentioned: H100 Purse: Purse that has a rare one of a kind gpt-4 training gpu. This purse is subject to export controls.


CUDA MODE ▷ #llmdotc (37 messages🔥):

  • RMSNorm Implementation
  • FP8 Stability Issues
  • Consistency between Python and C/CUDA
  • Llama 3 Token Support
  • Dynamic Threadgroup Sizing


CUDA MODE ▷ #bitnet (15 messages🔥):

  • BitNet efficiency
  • In-memory computing from SK Hynix
  • Ternary packing methods
  • Custom silicon for neural networks
  • Lookup tables for packing


CUDA MODE ▷ #cudamode-irl (9 messages🔥):

  • Hack Ideas Discussion
  • Point Cloud Registration Kernel
  • Meetup at PyTorch Conference
  • Student Pricing Inquiry

CUDA MODE ▷ #liger-kernel (1 message):

  • Triton LayerNorm Issue
  • Tensor Parallelism and Training MoEs

CUDA MODE ▷ #metal (5 messages):

  • Metal Puzzles GitHub Repository
  • Live Puzzle Solving Session
  • Conferences

Link mentioned: GitHub - abeleinin/Metal-Puzzles: Solve Puzzles. Learn Metal 🤘: Solve Puzzles. Learn Metal 🤘. Contribute to abeleinin/Metal-Puzzles development by creating an account on GitHub.


Nous Research AI ▷ #general (66 messages🔥🔥):

  • NousCon inquiries
  • AI Model Hermes 3 usage
  • InstantDrag development
  • CPLX extension for Perplexity
  • Jailbreaking Claude 3.5


Nous Research AI ▷ #ask-about-llms (18 messages🔥):

  • Parameter vs RAM Estimates
  • Model Training Data Efficiency
  • Scaling Parameters and Tokens
  • Optimal Compute Usage in LLMs
  • Llama Models Data Scaling

Nous Research AI ▷ #research-papers (3 messages):

  • Scaling LLM Inference
  • Chunking Phases in Research

Link mentioned: Tweet from Denny Zhou (@denny_zhou): What is the performance limit when scaling LLM inference? Sky's the limit. We have mathematically proven that transformers can solve any problem, provided they are allowed to generate as many int...


Nous Research AI ▷ #interesting-links (3 messages):

  • ChatGPT o1-preview
  • RL in development environments
  • iText2KG and SeekTopic Algorithm
  • LLMs generating research ideas

Link mentioned: Tweet from George Hotz 🌑 (@realGeorgeHotz): ChatGPT o1-preview is the first model that's capable of programming (at all). Saw an estimate of 120 IQ, feels about right. Very bullish on RL in development environments. Write code, write tests...


Nous Research AI ▷ #research-papers (3 messages):

  • Scaling LLM inference limits
  • Chunking phases in research
  • Transformer capabilities

Link mentioned: Tweet from Denny Zhou (@denny_zhou): What is the performance limit when scaling LLM inference? Sky's the limit. We have mathematically proven that transformers can solve any problem, provided they are allowed to generate as many int...


Latent Space ▷ #ai-general-chat (77 messages🔥🔥):

  • Dream Machine API
  • 11x AI Series A funding
  • Impact of AI on jobs
  • Claude 3.5 system prompt
  • Zig-based inference stack


Eleuther ▷ #general (10 messages🔥):

  • Foundation Models in Biotech
  • AI Safety Transition
  • TensorRT-LLM Issues
  • Transformer Model Memory Profiling
  • Photo Upgrade Tools

Eleuther ▷ #research (8 messages🔥):

  • AI Safety Fellowship
  • Token Embedding Variability
  • Multi-head Low-Rank Attention
  • Diagram of Thought
  • Hyper-graphs


Eleuther ▷ #interpretability-general (9 messages🔥):

  • Fourier Transforms of Hidden States
  • Power Law in Hidden States
  • Pythia Checkpoints Exploration
  • Untrained Model Behavior
  • Attention Residuals Analysis

Link mentioned: interpreting GPT: the logit lens — LessWrong: This post relates an observation I've made in my work with GPT-2, which I have not seen made elsewhere. …


Eleuther ▷ #lm-thunderdome (37 messages🔥):

  • Issue with LM Evaluation Harness
  • Integration of Torchtune
  • TensorRT-LLM Build Errors
  • Deployment of Datasets on Hugging Face
  • Chain of Thought Prompting


Eleuther ▷ #gpt-neox-dev (1 message):

  • Model Outputs
  • Library Utilization

Stability.ai (Stable Diffusion) ▷ #general-chat (53 messages🔥):

  • SSH Connection Issues
  • Stable Diffusion Installation Errors
  • ComfyUI White Screen
  • Control Net Training Challenges
  • CivitAI Bounty Offer


LlamaIndex ▷ #blog (2 messages):

  • Multimodal RAG techniques
  • LlamaCloud launch
  • Product manual challenges

LlamaIndex ▷ #general (36 messages🔥):

  • LlamaIndex and Neo4j Integration
  • Embedding Retrieval from Neo4j
  • Circular Dependency in LlamaIndex Packages
  • GraphRAG Implementation
  • Image Coordinate Extraction with GPT-4o


Interconnects (Nathan Lambert) ▷ #news (4 messages):

  • Mistral's September Release
  • Free Tier on La Plateforme
  • Pricing Update
  • Mistral Small Improvements
  • Vision Capabilities with Pixtral 12B

Link mentioned: AI in abundance: Introducing a free API, improved pricing across the board, a new enterprise-grade Mistral Small, and free vision capabilities on le Chat.


Interconnects (Nathan Lambert) ▷ #ml-questions (12 messages🔥):

  • Intermediate Generation in Transformers
  • Visualizing Attention Matrices
  • Alpha Code Website Feature
  • Attention Rollout Paper
  • Gradient-Based Token Associations


Interconnects (Nathan Lambert) ▷ #random (15 messages🔥):

  • Gemini Models
  • NotebookLM Tweet
  • Podcast with Riley
  • Guest Lecture on LLMs


Interconnects (Nathan Lambert) ▷ #memes (2 messages):

  • AI developers skipping Google's Gemini
  • Humorous AI article

Interconnects (Nathan Lambert) ▷ #posts (3 messages):

  • Newsletter Reader Party
  • Mainstream Media Critique

LangChain AI ▷ #general (15 messages🔥):

  • Chat Message History Management
  • UI Messages Storage
  • Open Source Aspirations
  • Migrating to LLMChain
  • Implementing AI in Business

Link mentioned: Migrating from LLMChain | 🦜️🔗 LangChain: LLMChain combined a prompt template, LLM, and output parser into a class.


LangChain AI ▷ #langserve (1 message):

taixian0420: please dm me


LangChain AI ▷ #share-your-work (6 messages):

  • RAG Chatbot
  • AdaletGPT


LangChain AI ▷ #tutorials (1 message):

  • RAG Chatbot
  • OpenAI Integration
  • LangChain Framework


tinygrad (George Hotz) ▷ #general (10 messages🔥):

  • tinygrad version bump
  • ROCm compatibility
  • CRIU feature in AMDKFD
  • pytest filtering
  • testing unnecessary files


tinygrad (George Hotz) ▷ #learn-tinygrad (8 messages🔥):

  • VRAM allocation spikes
  • Tinygrad Tensor error
  • Diffusers fork with Tinygrad
  • NotebookLM podcast
  • Fundamental operations in Tinygrad

Link mentioned: Issues · tinygrad/tinygrad: You like pytorch? You like micrograd? You love tinygrad! ❤️ - Issues · tinygrad/tinygrad


Cohere ▷ #discussions (10 messages🔥):

  • Cohere Chat API Safety Modes
  • Cohere's market strategy
  • Training language models
  • Applying to Cohere

Cohere ▷ #questions (1 message):

  • Fine-tuning models
  • Dataset management
  • Cohere platform capabilities

Link mentioned: Datasets — Cohere: The document provides an overview of the Dataset API, including file size limits, data retention policies, dataset creation, validation, metadata preservation, using datasets for fine-tuning models, d...


Cohere ▷ #api-discussions (4 messages):

  • Sagemaker Client Issues
  • Cohere Support

DSPy ▷ #show-and-tell (3 messages):

  • GitHub Responses
  • CodeBlueprint with Aider
  • Ruff Check Errors

DSPy ▷ #general (11 messages🔥):

  • GPT-4 Vision API wrapper
  • Contributions and Bounties
  • Documentation Needs
  • DSPy Program API Flexibility

Link mentioned: Add GPT-4 Vision API wrapper by jmanhype · Pull Request #682 · stanfordnlp/dspy: Introduce a new GPT4Vision class in visionopenai.py that wraps the GPT-4 Vision API. This abstraction layer simplifies the process of making requests to the API for analyzing images. Key functional...


LAION ▷ #general (10 messages🔥):

  • Image Compositing Techniques
  • Pillow Library for Image Processing
  • Text Integration in Images
  • Creative Process with Nouswise
  • Whisper Speech Support


LAION ▷ #research (1 message):

mkaic: https://mistral.ai/news/pixtral-12b/


OpenInterpreter ▷ #general (5 messages):

  • Open Interpreter updates
  • Beta testing inquiry
  • 01 app functionality

OpenInterpreter ▷ #O1 (2 messages):

  • Human Device Discord Event
  • Beta Availability Inquiry

OpenInterpreter ▷ #ai-content (2 messages):

  • Tool Use Podcast
  • 01 Voices Script
  • Voice Agents in Group Conversations
  • Deepgram Local Version

Link mentioned: The Future of Voice Agents with Killian Lucas - Ep 5 - Tool Use: Join us for this week's episode of Tool Use as we dive into the exciting world of voice intelligence. We're joined by special guest Killian Lucas, creator of...


Torchtune ▷ #dev (5 messages):

  • Eleuther Eval Recipe
  • Cache Management
  • Model Generation Issues

Modular (Mojo 🔥) ▷ #general (2 messages):

  • RISC-V support

Modular (Mojo 🔥) ▷ #mojo (2 messages):

  • Zero-copy data interoperability
  • Mandelbrot example
  • LLVM intrinsics in Mojo

Link mentioned: Mandelbrot in Mojo with Python plots | Modular Docs: Learn how to write high-performance Mojo code and import Python packages.


OpenAccess AI Collective (axolotl) ▷ #general (2 messages):

  • Shampoo in Transformers
  • Liger usage
  • Shampoo Scaling Law
  • Performance of Shampoo
  • Shampoo vs Adam

Link mentioned: Tweet from Simo Ryu (@cloneofsimo): Shampoo Scaling law for language model Plot taste of Kaplan et al, but comparing shampoo and adam. Shampoo is literally such a free lunch, in large scale, in predictable manner.


MLOps @Chipro ▷ #events (1 message):

  • YOLO Vision 2024
  • Ultralytics Event
  • Google Campus for Startups






{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AINews, please share it with a friend! Thanks in advance!

{% endif %}