Frozen AI News archive

OpenAI Realtime API and other Dev Day Goodies

**OpenAI** launched the Realtime API as **gpt-4o-realtime-preview**, which processes both text and audio tokens; the announcement covered pricing and future plans for vision and video support. The API supports voice activity detection modes, function calling, and ephemeral sessions with auto-truncation at context limits. Partnerships with **LiveKit**, **Agora**, and **Twilio** cover audio components and AI virtual agent voice calls. OpenAI also introduced vision fine-tuning, which with only 100 examples improved mapping accuracy for **Grab** and RPA success for **Automat**. Model distillation and prompt caching were also announced, including free eval inference for users who opt in to sharing data.
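To make "auto-truncation" concrete, here is a minimal sketch of one way a session could stay under a context limit: drop the oldest conversation items until the total fits. This is an illustration only, not OpenAI's implementation; the `Item` class, the `truncate_to_budget` helper, and the token counts are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Item:
    """One conversation item with a pre-computed token count (hypothetical)."""
    role: str
    tokens: int


def truncate_to_budget(items, max_tokens):
    """Drop the oldest items until the running total fits within max_tokens."""
    kept = deque(items)
    total = sum(item.tokens for item in kept)
    while kept and total > max_tokens:
        # Oldest item sits at the left end of the deque.
        total -= kept.popleft().tokens
    return list(kept), total


# Illustrative numbers only: 1400 tokens of history against a 1000-token budget.
history = [Item("user", 400), Item("assistant", 700), Item("user", 300)]
kept, total = truncate_to_budget(history, max_tokens=1000)
print([item.role for item in kept], total)
```

Dropping from the front keeps the most recent turns, which is usually what a live voice conversation needs; a real service might instead summarize or pin certain items, which this sketch does not attempt.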

AI News for 9/30/2024-10/1/2024. We checked 7 subreddits, 433 Twitters and 30 Discords (220 channels, and 2056 messages) for you. Estimated reading time saved (at 200wpm): 223 minutes. You can now tag @smol_ai for AINews discussions!

As widely rumored ahead of OpenAI Dev Day, OpenAI's new Realtime API debuted today as gpt-4o-realtime-preview, with a nifty demo showing a voice agent using function calling to call a mock strawberry store owner.

Available in Playground and SDK. Notes from the blogpost:

From docs:

On top of Realtime, they also announced vision fine-tuning, model distillation, and prompt caching.

Additional Resources:


AI News Pod: We have regenerated the NotebookLM recap of today's news, plus our own clone. The codebase is now open source!


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

AI Model Developments and Industry Updates

AI Research and Technical Discussions

AI Industry Trends and Commentary

Memes and Humor


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. New Open-Source LLM Frameworks and Tools

Theme 2. Advancements in Running LLMs Locally on Consumer Hardware

Theme 3. Addressing LLM Output Quality and 'GPTisms'

Theme 4. LLM Performance Benchmarks and Comparisons

Theme 5. New LLM and Multimodal AI Model Releases

Other AI Subreddit Recap

r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, /r/LLMDevs, /r/Singularity

AI Research and Techniques

AI Model Releases and Improvements

AI Safety and Ethics Concerns

AI Applications and Demonstrations


AI Discord Recap

A summary of Summaries of Summaries by O1-preview

Theme 1: OpenAI's Dev Day Unveils Game-Changing Features

Theme 2: New AI Models Turn Up the Heat

Theme 3: AI Tools and Techniques Level Up

Theme 4: Community Debates on AI Safety and Ethics Intensify

Theme 5: Engineers Collaborate and Share to Push Boundaries


PART 1: High level Discord summaries

Nous Research AI Discord


GPU MODE Discord


aider (Paul Gauthier) Discord


LM Studio Discord


Unsloth AI (Daniel Han) Discord


HuggingFace Discord


OpenRouter (Alex Atallah) Discord


Interconnects (Nathan Lambert) Discord


Eleuther Discord


OpenAI Discord


Stability.ai (Stable Diffusion) Discord


Latent Space Discord


Cohere Discord


Perplexity AI Discord


LlamaIndex Discord


tinygrad (George Hotz) Discord


Torchtune Discord


Modular (Mojo 🔥) Discord


DSPy Discord


OpenAccess AI Collective (axolotl) Discord


OpenInterpreter Discord


LangChain AI Discord


LAION Discord


MLOps @Chipro Discord


The Alignment Lab AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The LLM Finetuning (Hamel + Dan) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The DiscoResearch Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Nous Research AI ▷ #general (321 messages🔥🔥):

  • OpenAI Dev Day
  • Voice API Costs
  • Model Comparisons
  • Training LLMs
  • Unified Token Space


Nous Research AI ▷ #ask-about-llms (6 messages):

  • Together API for Llama 3.2
  • Vector databases for multimodal LLMs

Nous Research AI ▷ #interesting-links (1 message):

rikufps: https://openai.com/index/api-model-distillation/


Nous Research AI ▷ #reasoning-tasks (6 messages):

  • NPC Mentality
  • AI Business Claims
  • Market-Based AGI Development

Link mentioned: Dr Phil Hair Loss GIF - Dr Phil Hair Loss Wig - Discover & Share GIFs: Click to view the GIF


GPU MODE ▷ #triton (4 messages):

  • Link Access Issues
  • Internal URL Shortener

Link mentioned: triton/python/triton/compiler/compiler.py at main · triton-lang/triton: Development repository for the Triton language and compiler - triton-lang/triton


GPU MODE ▷ #torch (22 messages🔥):

  • PyTorch 2.x Inference Recommendations
  • Pipeline Parallel Training
  • 3xTF32 Matrix Multiplication
  • AOTI and Libtorch Runtime
  • No Libtorch Compile Project


GPU MODE ▷ #cool-links (14 messages🔥):

  • Mirage Superoptimizer
  • Tiramisu Transformations
  • GPU Kernel Generation with Triton
  • PyTorch Conference Recordings
  • Modular MAX GPU Integration


GPU MODE ▷ #torchao (1 message):

drisspg: This is correct


GPU MODE ▷ #off-topic (10 messages🔥):

  • NotebookLM performance
  • Escalation in the Middle East
  • Political discussions in Discord

Link mentioned: Tweet from Kuldar ⟣ (@kkuldar): Someone gave NotebookLM a document with just "poop" and "fart" repeated over and over again. I did NOT expect the result to be this good.


GPU MODE ▷ #llmdotc (144 messages🔥🔥):

  • Llama3 Attention Bug Fix
  • Gradient Norm Differences
  • Performance Comparison
  • BF16 Optimizer State Implementation
  • Chunked Softmax for Large Context Lengths


GPU MODE ▷ #bitnet (11 messages🔥):

  • Llama3 Training Run
  • Gradient Norm Issues
  • Learning Rate Schedulers
  • Frozen Embeddings
  • Mini Distilled Models


GPU MODE ▷ #liger-kernel (4 messages):

  • Gemma2 convergence test
  • Qwen2-VL tests re-enabling
  • CI test fix
  • Beta configuration PR

Link mentioned: Disable gemma2 and qwen2_vl tests by shimizust · Pull Request #288 · linkedin/Liger-Kernel: Summary Gemma2 convergence tests were erroneously passing before due to all tensors having NaN values. Using attn_implementation="eager" fixes the NaNs, but results don't pa...


GPU MODE ▷ #diffusion (24 messages🔥):

  • flux.cpp implementation
  • Triton usage challenges
  • CUDA vs Triton performance
  • Memory consumption comparison
  • Autograd considerations

Link mentioned: Google Colab: no description found


GPU MODE ▷ #nccl-in-triton (5 messages):

  • Memory Consistency Models
  • IRL Hackathon GitHub Repo
  • Materials Development

Link mentioned: GitHub - cchan/tccl: extensible collectives library in triton: extensible collectives library in triton. Contribute to cchan/tccl development by creating an account on GitHub.


aider (Paul Gauthier) ▷ #general (148 messages🔥🔥):

  • Aider Image & Document Support
  • OpenAI DevDay Announcements
  • Architect and Editor Model Usage
  • Prompt Caching
  • Refactoring with Aider


aider (Paul Gauthier) ▷ #questions-and-tips (70 messages🔥🔥):

  • Aider Usage and Features
  • Node.js and Aider
  • Architect Mode Performance
  • Refactoring Benchmark Insights
  • Configuration Comparisons


aider (Paul Gauthier) ▷ #links (6 messages):

  • Whisper large-v3-turbo model
  • OpenAI DevDay
  • Model Performance
  • Speech-to-Text Accuracy

Link mentioned: Whisper large-v3-turbo model: It’s OpenAI DevDay today. Last year they released a whole stack of new features, including GPT-4 vision and GPTs and their text-to-speech API, so I’m intrigued to see wha...


LM Studio ▷ #general (92 messages🔥🔥):

  • Qwen Benchmarking Performance
  • Questioning Model Quantization Loss
  • Embedding Model Limitations
  • RAG Setup with LM Studio
  • Model Differences and Recommendations


LM Studio ▷ #hardware-discussion (87 messages🔥🔥):

  • GPU vs CPU performance
  • VRAM offload
  • Beelink SER9
  • Llama 3.1 and 3.2 performance
  • AI model configuration issues


Unsloth AI (Daniel Han) ▷ #general (122 messages🔥🔥):

  • Fine-tuning Llama 3.2
  • LoRA Dropout
  • RAG and text classification
  • Quantization in training
  • Dataset quality considerations


Unsloth AI (Daniel Han) ▷ #help (37 messages🔥):

  • Pinning Important Messages
  • Quantization Challenges with Llama
  • Continuous Pretraining (CPT) with Llama Models
  • VLLMs and Unsloth Integration
  • Errors Loading Models with Hugging Face

Link mentioned: How to Apply BERT to Arabic and Other Languages · Chris McCormick: no description found


Unsloth AI (Daniel Han) ▷ #research (3 messages):

  • Mirage superoptimizer
  • Tensor program optimization

Link mentioned: A Multi-Level Superoptimizer for Tensor Programs: We introduce Mirage, the first multi-level superoptimizer for tensor programs. A key idea in Mirage is $μ$Graphs, a uniform representation of tensor programs at the kernel, thread block, and thread le...


HuggingFace ▷ #announcements (1 message):

  • Llama 3.2 Release
  • Transformers v4.45.0
  • Whisper Turbo
  • Pixtral-12B
  • HuggingChat for macOS


HuggingFace ▷ #general (113 messages🔥🔥):

  • Innovative Business Models with Generative AI
  • Challenges with LLM Tuning
  • Community GPU Grant Applications
  • Hugging Face Space Issues
  • Chinese AI Global Expansion


HuggingFace ▷ #today-im-learning (4 messages):

  • Custom GPT Authentication Issues
  • Alternatives in Development Tools
  • Flutter and Dart for Android Development
  • Challenges with Python Mobile Tools

HuggingFace ▷ #cool-finds (5 messages):

  • Projection Mapping Software
  • Pika 1.5 Release
  • Spam Note

Link mentioned: Tweet from Pika (@pika_labs): Sry, we forgot our password. PIKA 1.5 IS HERE. With more realistic movement, big screen shots, and mind-blowing Pikaffects that break the laws of physics, there’s more to love about Pika than ever be...


HuggingFace ▷ #i-made-this (24 messages🔥):

  • RAG Applications
  • WebLLM Playground
  • NotebookLM Video
  • Badge Systems
  • Thermal Dynamics Experiment


HuggingFace ▷ #reading-group (1 message):

  • User Study on ML Developers
  • Privacy-Preserving Models

HuggingFace ▷ #NLP (9 messages🔥):

  • Learning SageMaker
  • Channel Moderation

HuggingFace ▷ #diffusion-discussions (2 messages):

  • Diffusion Models
  • Hiring Discussions
  • Channel Usage Guidelines

HuggingFace ▷ #gradio-announcements (1 message):

  • Gradio 5 Beta feedback
  • Gradio 5 features
  • Gradio 5 Docs and Guides
  • Security warning
  • Installation steps


OpenRouter (Alex Atallah) ▷ #announcements (3 messages):

  • Gemini Flash Rate Limits
  • Liquid 40B Model
  • Samba Nova Collaboration
  • Gemini Token Standardization
  • Cohere Model Updates


OpenRouter (Alex Atallah) ▷ #app-showcase (4 messages):

  • Mem0 Toolkit
  • Long-term memory for AI apps
  • Integration of memory features
  • OpenRouter API


OpenRouter (Alex Atallah) ▷ #general (134 messages🔥🔥):

  • OpenAI DevDay announcements
  • Nova Model Launch
  • SambaNova Context Limitations
  • OpenRouter Payment Methods
  • LLM Translation Capabilities


Interconnects (Nathan Lambert) ▷ #news (24 messages🔥):

  • Liquid AI Models
  • OpenAI DevDay Updates
  • Evaluation Sharing


Interconnects (Nathan Lambert) ▷ #ml-drama (52 messages🔥):

  • AI Safety and Ethics Discussions
  • Barret Zoph's Departure from OpenAI
  • Impact of Capitalism on AI Ethics
  • Self-Driving Cars vs AI Models
  • Concerns about AI Doomerism


Interconnects (Nathan Lambert) ▷ #random (19 messages🔥):

  • Joining Anthropic
  • Security Concerns
  • FrieNDAs in SF
  • RLHF Discussions


Interconnects (Nathan Lambert) ▷ #rl (4 messages):

  • Andy Barto at RLC 2024
  • Standing Ovation for Andrew Barto
  • YouTube video on ML and RL


Interconnects (Nathan Lambert) ▷ #reads (1 message):

natolambert: excited to watch this tbf https://www.youtube.com/watch?v=b1-OuHWu88Y


Eleuther ▷ #general (15 messages🔥):

  • 3D Interactive Scatter Plots
  • Liquid Foundation Models
  • Neural Architecture and Bayesian Statistics

Link mentioned: Tweet from Liquid AI (@LiquidAI_): Today we introduce Liquid Foundation Models (LFMs) to the world with the first series of our Language LFMs: A 1B, 3B, and a 40B model. (/n)


Eleuther ▷ #research (52 messages🔥):

  • Refusal Directions Paper
  • VAE for Video Models
  • Delta Frames in Video Compression
  • Wavelet Coefficients for Training
  • Neural Codec and Compression Algorithms

Eleuther ▷ #lm-thunderdome (6 messages):

  • Evaluation Benchmarks
  • Open-ended Benchmarks
  • Using Together.ai
  • OpenAI Chat LLMs and Logprobs

OpenAI ▷ #ai-discussions (50 messages🔥):

  • AI Writing Drafts
  • Understanding LLMs
  • AI Image Generator Market
  • Suno Music AI
  • SearchGPT and Perplexity Pro

Link mentioned: Chasing the Storm typ2 by @dragomaster08 | Suno: electronic pop song. Listen and make your own with Suno.


OpenAI ▷ #gpt-4-discussions (9 messages🔥):

  • AI using real names
  • Voice mode testing
  • Bot errors in product
  • Disappearing responses
  • Update issues

OpenAI ▷ #prompt-engineering (4 messages):

  • Advanced voice prompts
  • Virtual workforce generation
  • Voice design parameters
  • Character backstory in prompts

OpenAI ▷ #api-discussions (4 messages):

  • Advanced Voice Prompts
  • Virtual Workforce Generation
  • Voice Model Parameters

Stability.ai (Stable Diffusion) ▷ #general-chat (66 messages🔥🔥):

  • AI Generation Prompting Techniques
  • VRAM Management in Generative Models
  • Software and Model Compatibility
  • Stable Diffusion UI Insights
  • Community Support and Resources

Link mentioned: Discord - Group Chat That’s All Fun & Games: Discord is great for playing games and chilling with friends, or even building a worldwide community. Customize your own space to talk, play, and hang out.


Latent Space ▷ #ai-general-chat (56 messages🔥🔥):

  • Wispr Flow Launch
  • AI Grant Batch 4
  • Whisper v3 Turbo Model
  • Kingma's New Role at Anthropic
  • Entropy-Based Sampling Framework


Cohere ▷ #discussions (20 messages🔥):

  • Community Greetings
  • Paperspace Cookie Preferences

Cohere ▷ #announcements (2 messages):

  • RAG Course Launch
  • Radical AI Founders Masterclass
  • AI Entrepreneurship
  • Cohere RAG Techniques
  • Compute Resources for AI


Cohere ▷ #questions (2 messages):

  • Cohere on Azure
  • Cohere Model Issues
  • API Performance

Cohere ▷ #api-discussions (32 messages🔥):

  • V2 Support on Cloud Providers
  • Performance Issues with Command R Plus
  • Temporary Context Window Caveat
  • Trial Key Limitations

Perplexity AI ▷ #general (37 messages🔥):

  • Perplexity Pro Subscription
  • Gemini Pro Features
  • API Key Issues
  • AI for Children
  • Dark Mode Display Problems


Perplexity AI ▷ #sharing (8 messages🔥):

  • Nvidia's Acquisition Spree
  • Bionic Eye Development
  • AI Model Selection
  • Flying with Pets
  • Sunglasses Myths

Link mentioned: YouTube: no description found


Perplexity AI ▷ #pplx-api (1 message):

  • API features
  • Structured outputs

LlamaIndex ▷ #announcements (1 message):

  • Embedding Fine-tuning
  • NUDGE approach
  • RAG performance
  • Webinar announcement

Link mentioned: LlamaIndex Webinar: NUDGE Lightweight Non-Parametric Fine-Tuning of Embeddings for Retrieval · Zoom · Luma: Fine-tuning your embedding model is an underrated way of increasing RAG performance - come learn about it! We're excited to host the authors of NUDGE (Sepanta…


LlamaIndex ▷ #blog (4 messages):

  • LlamaIndex for TypeScript
  • Embedding model fine-tuning
  • Multimodal RAG
  • Contextual Retrieval RAG

LlamaIndex ▷ #general (35 messages🔥):

  • Twitter Chatbot Integration
  • GithubRepositoryReader Issues
  • Embedding Model Applications
  • RAG-based Chatbot Chunking Strategies
  • LlamaIndex and Ollama Integration


tinygrad (George Hotz) ▷ #general (30 messages🔥):

  • OpenCL and Metal on macOS
  • Tech Debt in Software Development
  • Tinygrad Meeting Recap
  • Issues with GPT2 Example
  • Slurm Support for Tinygrad


Torchtune ▷ #general (4 messages):

  • tyro package dependency
  • CLI communication improvements
  • custom help behavior

Torchtune ▷ #dev (24 messages🔥):

  • bitsandbytes and CUDA
  • MPS support concerns
  • H200 hardware setup for LLMs
  • Inference with local infrastructure
  • Compliance with European health data

Link mentioned: bitsandbytes/bitsandbytes/functional.py at 0500c31fe2c7e3b40f6910bcc5a947240e13d3f2 · bitsandbytes-foundation/bitsandbytes: Accessible large language models via k-bit quantization for PyTorch. - bitsandbytes-foundation/bitsandbytes


Modular (Mojo 🔥) ▷ #general (22 messages🔥):

  • Modular Community Meeting
  • Modular Wallpapers

Link mentioned: Modular Community Meeting #8: MAX driver & engine APIs, Magic AMA, and Unicode support in Mojo: In this community meeting, Jakub introduced us to the MAX Driver Python and Mojo APIs, which provide a unified interface for interacting with CPUs and GPUs, ...


DSPy ▷ #general (10 messages🔥):

  • Using different models in MIPRO
  • Freezing Programs and Encapsulation

DSPy ▷ #examples (1 message):

  • Diagnosis risk adjustment
  • Under-coded diagnosis

OpenAccess AI Collective (axolotl) ▷ #general (7 messages):

  • China's AI Training Breakthrough
  • Liquid Foundation Models
  • Nvidia's 72B Model
  • Qwen 2.5 34B Deployment


OpenInterpreter ▷ #general (3 messages):

  • AI Script Generation
  • Voice Assistants Integration

OpenInterpreter ▷ #O1 (1 message):

  • Full-stack Development
  • E-commerce Platforms
  • JavaScript Ecosystem
  • React Native
  • PineScript Development

OpenInterpreter ▷ #ai-content (2 messages):

  • Realtime API
  • Fine-Tuning API
  • Prompt Caching
  • Model Distillation
  • AI Tools Development


LangChain AI ▷ #general (1 message):

  • OpenAI applications
  • User prompt optimization
  • System prompt limitations

LangChain AI ▷ #share-your-work (2 messages):

  • PDF to podcast maker
  • Nova LLM Release
  • LumiNova image generation

Link mentioned: Tweet from Rubiks AI (@RubiksAI): 🚀 Introducing Nova: The Next Generation of LLMs by Nova! 🌟 We're thrilled to announce the launch of our latest suite of Large Language Models: Nova-Instant, Nova-Air, and Nova-Pro. Each designe...


LangChain AI ▷ #tutorials (1 message):

jasonzhou1993: https://youtu.be/2PjmPU07KNs Cursor best practices that no one is talking about...


LAION ▷ #general (3 messages):

  • Open Datasets Contributions
  • AI Challenge Game
  • YouTube Video Share

Link mentioned: LLM Jailbreak: no description found


MLOps @Chipro ▷ #events (1 message):

  • Agent Security Hackathon
  • AI agents safety
  • Virtual event details
  • Collaboration and mentorship

MLOps @Chipro ▷ #general-ml (1 message):

  • Nova Large Language Models
  • MMLU Benchmarking
  • LumiNova Image Generation

Link mentioned: Tweet from Rubiks AI (@RubiksAI): 🚀 Introducing Nova: The Next Generation of LLMs by Nova! 🌟 We're thrilled to announce the launch of our latest suite of Large Language Models: Nova-Instant, Nova-Air, and Nova-Pro. Each designe...







{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}