Frozen AI News archive

o1 API, 4o/4o-mini in Realtime API + WebRTC, DPO Finetuning

**OpenAI** launched the **o1 API** with enhanced features including vision inputs, function calling, structured outputs, and a new `reasoning_effort` parameter, achieving **60% fewer reasoning tokens** on average. The **o1 pro** variant is confirmed as a distinct implementation coming soon. Improvements to the **Realtime API** with **WebRTC** integration offer easier usage, longer sessions (up to **30 minutes**), and significantly reduced pricing (up to **10x cheaper** with mini models). **DPO Preference Tuning** for fine-tuning is introduced, currently available for the **4o** model. Additional updates include official Go and Java SDKs and OpenAI DevDay videos. The news also highlights discussions on **Google Gemini 2.0 Flash** model's performance reaching **83.6% accuracy**.

Canonical issue URL

AI News for 12/16/2024-12/17/2024. We checked 7 subreddits, 433 Twitters and 32 Discords (210 channels, and 4050 messages) for you. Estimated reading time saved (at 200wpm): 447 minutes. You can now tag @smol_ai for AINews discussions!

It was a mini dev day for OpenAI, with a ton of small updates and one highly anticipated API launch. Let's go in turn:

o1 API

image.png

Minor notes:

image.png

o1 pro is CONFIRMED to "be a different implementation and not just o1 with high reasoning_effort setting." and will be available in API in "some time".

WebRTC and Realtime API improvements

It's a lot easier to work with the RealTime API with WebRTC now that it fits in a tweet (try it out on SimwonW's demo with your own keys)):

image.png

New 4o and 4o-mini models, still in preview:

image.png

Justin Uberti, creator of WebRTC who recently joined OpenAI, also highlighted a few other details

DPO Preference Tuning

It's Hot or Not, but for finetuning. We aim to try this out for AINews ASAP... although it seems to only be available for 4o.

image.png

image.png

Misc

Selected OpenAI DevDay videos were also released.

Official Go and Java SDKs for those who care.

The team also did an AMA (summary here, nothing too surprising).

The full demo is worth a watch:

https://www.youtube.com/watch?v=14leJ1fg4Pw


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

Here's a categorized summary of the key discussions and announcements:

Model Releases and Performance

Research and Technical Developments

Company Updates

Memes and Humor


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Falcon 3 Emerges with Impressive Token Training and Diversified Models

Theme 2. Nvidia's Jetson Orin Nano: A Game Changer for Embedded Systems?

Theme 3. ZOTAC Announces GeForce RTX 5090 with 32GB GDDR7: High-End Potential for AI

Theme 4. DavidAU's Megascale Mixture of Experts LLMs: A Creative Leap

Theme 5. Llama.cpp GPU Optimization: Snapdragon Laptops Gain AI Performance Boost

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT

Theme 1. Steak Off Challenge Emphasizes Google's Lead in AI Video Rendering

Theme 2. Gemini 2.0 Flash Model Enriches AI with Advanced Roleplay and Context Capabilities


AI Discord Recap

A summary of Summaries of Summaries by O1-mini

Theme 1. AI Models Battle for Supremacy

Theme 2. AI Tools Struggle with Pricing and Integration

Theme 3. Optimizing AI Deployments and Hardware Utilization

Theme 4. AI Enhancements in Developer Workflows

Theme 5. Community Events and Educational Initiatives Drive Innovation


PART 1: High level Discord summaries

Codeium / Windsurf Discord


Nous Research AI Discord


aider (Paul Gauthier) Discord


Notebook LM Discord Discord


Unsloth AI (Daniel Han) Discord


Cohere Discord


Bolt.new / Stackblitz Discord


Eleuther Discord


Cursor IDE Discord


OpenAI Discord


OpenRouter (Alex Atallah) Discord


Perplexity AI Discord


Latent Space Discord


LM Studio Discord


Stability.ai (Stable Diffusion) Discord


GPU MODE Discord


Modular (Mojo 🔥) Discord


LlamaIndex Discord


Nomic.ai (GPT4All) Discord


OpenInterpreter Discord


Torchtune Discord


LLM Agents (Berkeley MOOC) Discord


tinygrad (George Hotz) Discord


Axolotl AI Discord


DSPy Discord


Mozilla AI Discord


MLOps @Chipro Discord


Gorilla LLM (Berkeley Function Calling) Discord


The LAION Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The HuggingFace Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Codeium / Windsurf ▷ #discussion (89 messages🔥🔥):

Windsurf functionality issues, Codeium pricing and credits, User experiences with AI code generation, Codeium plugin display problems, Tool recommendations for code reviews

Links mentioned:


Codeium / Windsurf ▷ #windsurf (668 messages🔥🔥🔥):

Windsurf vs Cursor, Gemini AI performance, Windsurf bugs, Git usage, User experiences with AI tools

Links mentioned:


Nous Research AI ▷ #general (566 messages🔥🔥🔥):

AI and Creative Writing, Prompt Engineering and Evaluation, LLM Performance Characteristics, Educational Paths in Computer Science

Links mentioned:


Nous Research AI ▷ #ask-about-llms (6 messages):

Sampling Algorithms, Gemini Data Recall, Threefry, Mersenne Twister


Nous Research AI ▷ #research-papers (5 messages):

phi-4 language model, quantization techniques, LlamaCPP integration, test-time compute approaches, performance benchmarks

Link mentioned: Scaling test-time compute - a Hugging Face Space by HuggingFaceH4: no description found


Nous Research AI ▷ #research-papers (5 messages):

phi-4 model, quantized models, Hugging Face test-time compute

Link mentioned: Scaling test-time compute - a Hugging Face Space by HuggingFaceH4: no description found


aider (Paul Gauthier) ▷ #general (302 messages🔥🔥):

Aider Updates, O1 API and Pro Features, Linters and Code Management, Claude Model Discussion, AI in Coding Automation

Links mentioned:


aider (Paul Gauthier) ▷ #questions-and-tips (34 messages🔥):

Aider with LM Studio, Using Aider with Emacs, Committing Specific Files in Aider, Troubleshooting Aider Errors, Dart Support in Aider

Links mentioned:


Notebook LM Discord ▷ #announcements (1 messages):

New UI Rollout, NotebookLM Plus Features, Interactive Audio BETA


Notebook LM Discord ▷ #use-cases (32 messages🔥):

Podcast Experiments, AI in Call Centers, Game Strategy Guides, Improved Note Exporting, Interactive Mode

Links mentioned:


Notebook LM Discord ▷ #general (207 messages🔥🔥):

Notebook LM Plus Access, Interactive Mode Feature, New UI Feedback, Audio Overview Limitations, Multi-language Support

Links mentioned:


Unsloth AI (Daniel Han) ▷ #general (183 messages🔥🔥):

Python version compatibility, Unsloth 4-bit model performance, Function calling in Llama 3.2, Multi-GPU training with Unsloth Pro, Quantization process for models

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (4 messages):

Voe2 vs Sora, Open Source Reasoning Models, OpenAI Bankruptcy Speculation


Unsloth AI (Daniel Han) ▷ #help (47 messages🔥):

Lora+ with Unsloth, Finetuning Qwen models, AMD GPU compatibility, Training vs Inference in Unsloth, Packing in training

Links mentioned:


Unsloth AI (Daniel Han) ▷ #research (2 messages):

phi-4 Language Model, Continual Pre-training Strategies

Link mentioned: Continual Pre-Training of Large Language Models: How to (re)warm your model?: Large language models (LLMs) are routinely pre-trained on billions of tokens, only to restart the process over again once new data becomes available. A much cheaper and more efficient solution would b...


Cohere ▷ #discussions (166 messages🔥🔥):

Code Wizard Hackathon, Command R7B Office Hours, Maya Release and Tool Use, Emotional Support for Projects, AI Models Discussion

Links mentioned:


Cohere ▷ #announcements (1 messages):

Multimodal Image Embed endpoint, Rate limit increase, API keys, Cohere pricing

Link mentioned: API Keys and Rate Limits — Cohere: This page describes Cohere API rate limits for production and evaluation keys.


Cohere ▷ #questions (57 messages🔥🔥):

Cohere API for Image Embeddings, RAG-based PDF Answering System, Image Retrieval through Metadata

Link mentioned: Multimodal Embeddings — Cohere: Multimodal embeddings convert text and images into embeddings for search and classification (API v2).


Cohere ▷ #cmd-r-bot (1 messages):

.kolynzb: yello


Bolt.new / Stackblitz ▷ #prompting (4 messages):

Meta prompt for Bolt, UI version of Bolt, Feature requests for Bolt

Links mentioned:


Bolt.new / Stackblitz ▷ #discussions (213 messages🔥🔥):

Using Bolt for SaaS projects, Challenges with Bolt integration, Support and assistance on coding issues, Managing tokens effectively, Sharing projects built with Bolt

Links mentioned:


Eleuther ▷ #general (43 messages🔥):

Pythia based RLHF models, TensorFlow on TPU v5p, Tokenizer edge cases, Exponential Moving Average (EMA), VLM Pretraining Data


Eleuther ▷ #research (101 messages🔥🔥):

Attention Mechanisms, Gradient Descent Optimizers, Grokking Phenomenon, Memory Augmented Neural Networks, Stick Breaking Attention

Links mentioned:


Eleuther ▷ #interpretability-general (3 messages):

Steering Vectors, SAEs and Interpretability, Unlearning with SAE Conditional Steering

Link mentioned: SAE features for refusal and sycophancy steering vectors — LessWrong: TL;DR * Steering vectors provide evidence that linear directions in LLMs are interpretable. Since SAEs decompose linear directions, they should be a…


Eleuther ▷ #lm-thunderdome (43 messages🔥):

VLLM performance, Winogrande dataset issues, New release updates, Library requirements for benchmarks, Chat template integration


Eleuther ▷ #gpt-neox-dev (2 messages):

Non-parametric LayerNorm, Configuration Options, Memory Recall on Config Changes


Cursor IDE ▷ #general (173 messages🔥🔥):

Cursor IDE updates, Issues with AI models, Cursor Extension announcement, User feedback on models, O1 Pro discussions

Links mentioned:


OpenAI ▷ #annnouncements (1 messages):

DevDay Holiday Edition, AMA with OpenAI's API Team

Link mentioned: - YouTube: no description found


OpenAI ▷ #ai-discussions (130 messages🔥🔥):

AI Accents and Realism, OpenAI Features and Limitations, AI Interaction and Alignment, Anthropic Pricing Changes, API Functionality Concerns

Links mentioned:


OpenAI ▷ #gpt-4-discussions (12 messages🔥):

Custom GPT issues, Advanced voice mode features, PDF and image reading functionality, Project replacements for Custom GPTs, Max file size and limit for Custom GPTs


OpenAI ▷ #prompt-engineering (3 messages):

Using AI to Build Websites, Exploring AI Capabilities


OpenAI ▷ #api-discussions (3 messages):

Using AI for web development, Maximizing AI model capabilities


OpenRouter (Alex Atallah) ▷ #announcements (1 messages):

Structured Outputs, Multi-Model Apps, OpenRouter Model Support

Link mentioned: Tweet from OpenRouter (@OpenRouterAI): Structured outputs are very underrated. It's often much easier to constrain LLM outputs to a JSON schema than asking for a tool call.OpenRouter now normalizes structured outputs for- 46 models- 8 ...


OpenRouter (Alex Atallah) ▷ #app-showcase (2 messages):

``


OpenRouter (Alex Atallah) ▷ #general (130 messages🔥🔥):

Gemini Flash 2 performance, Using typos in prompts for model response, API Key Exposure, OpenRouter API limitations, o1 API changes and pricing

Links mentioned:


Perplexity AI ▷ #announcements (1 messages):

Perplexity Pro gift subscriptions, Subscription benefits, Subscription durations

Link mentioned: Perplexity Pro Subscription | Perplexity Supply: Perplexity Supply exists to explore the relationship between fashion and intellect with thoughtfully designed products to spark conversations and showcase your infinite pursuit of knowledge.


Perplexity AI ▷ #general (90 messages🔥🔥):

OpenAI vs. Perplexity Features, Perplexity Pro Subscription, Model Performance Comparison, API Usage Guidance, User Interface Suggestions

Links mentioned:


Perplexity AI ▷ #sharing (6 messages):

Mozi social app, U.S. Military Space Wars, Mystery Georgian Tablet, Qu Kuai Lian, Creatine Monohydrate

Link mentioned: YouTube: no description found


Perplexity AI ▷ #pplx-api (2 messages):

Perplexity MCP, MCP server integration, Using models with APIs, Gemini integration, Access to models

Link mentioned: GitHub - pyroprompts/any-chat-completions-mcp: Contribute to pyroprompts/any-chat-completions-mcp development by creating an account on GitHub.


Latent Space ▷ #ai-general-chat (84 messages🔥🔥):

Palmyra Creative, OpenAI API updates, NVIDIA Jetson Orin Nano, O1 and O1 Pro distinction, Anthropic API updates

Links mentioned:


LM Studio ▷ #general (55 messages🔥🔥):

Text to Speech and Speech to Text Tools, LM Studio Model Tuning, Uncensored Chatbot Alternatives, Cooling and Power Management Scripts on macOS, Model Compatibility and Performance

Links mentioned:


LM Studio ▷ #hardware-discussion (29 messages🔥):

GPU Performance Comparison, Model Memory Usage in GPUs, Llama Model Settings, New GPU Listings, Driver Issues with AMD GPUs

Link mentioned: Zotac accidentally lists RTX 5090, RTX 5080, and RTX 5070 family weeks before launch — accidental listing seemingly confirms the RTX 5090 with 32GB of GDDR7 VRAM: Strike three for Zotac!


Stability.ai (Stable Diffusion) ▷ #general-chat (79 messages🔥🔥):

Learning Stable Diffusion with online courses, Choosing between GPU options for AI, Scams and bot detection methods, Creating Lora models, Using the latest models in AI

Links mentioned:


GPU MODE ▷ #general (7 messages):

Session Recording Availability, Distributed Training Courses, 6D Parallelism Insights, NCCL Source Code Study, Tool Calls during Generation

Links mentioned:


GPU MODE ▷ #cuda (7 messages):

CUDA Graph and cudaMemcpyAsync, Compute Throughput on 4090, Kernel vs cudaMemcpyAsync in cuda Graph


GPU MODE ▷ #torch (9 messages🔥):

Optimizing Docker Images for PyTorch, Conda vs Docker Usage, Building Custom Torch with Nix, Efficiency of Megatron-LM in Training

Link mentioned: no title found: no description found


GPU MODE ▷ #cool-links (14 messages🔥):

NVIDIA Jetson Nano Super, JetPack 6.1 Installation, LLM Inference on AGX, Raspberry Pi 5 for LLMs, Esp32 / Xtensa LX7 Chips

Links mentioned:


GPU MODE ▷ #youtube-recordings (1 messages):

pirate_king97: https://www.youtube.com/playlist?list=PLvJjZoRc4albEFlny8Z1OGDiF_y3MGNK9


GPU MODE ▷ #arc-agi-2 (8 messages🔥):

phi-4 sampling progress, Chain of Thought dataset generation, VLM fine-tuning with Axolotl, Unslosh, and TRL, Custom vision encoder discussion, Test-time scaling for ARC

Links mentioned:


Modular (Mojo 🔥) ▷ #general (25 messages🔥):

Zoom call recording, Usage of MAX for discussions, Mojo on Archcraft Linux issues


Modular (Mojo 🔥) ▷ #announcements (1 messages):

MAX 24.6, MAX GPU, MAX Engine, MAX Serve, Generative AI Infrastructure

Links mentioned:


Modular (Mojo 🔥) ▷ #mojo (19 messages🔥):

Mojo v24.6 Release, Python importing Mojo kernels, GPU support in Mojo, Kernel programming API, Mojo documentation updates

Links mentioned:


LlamaIndex ▷ #blog (2 messages):

LlamaReport preview, Agentic AI SDR, Composio platform

Link mentioned: composio/python/examples/quickstarters at master · ComposioHQ/composio: Composio equip's your AI agents & LLMs with 100+ high-quality integrations via function calling - ComposioHQ/composio


LlamaIndex ▷ #general (20 messages🔥):

NVIDIA NV-Embed-v2 availability, Using Qdrant vector store, OpenAI LLM double retries

Links mentioned:


LlamaIndex ▷ #ai-discussion (2 messages):

Intent Recognition Techniques, Handling SSL Certification Errors


Nomic.ai (GPT4All) ▷ #announcements (1 messages):

GPT4All v3.5.3 Release, LocalDocs Fix, Contributors to GPT4All


Nomic.ai (GPT4All) ▷ #general (17 messages🔥):

AI Agent functionality, Jinja template issues, API documentation inquiries, Document processing efficiency, Model performance concerns

Links mentioned:


OpenInterpreter ▷ #general (12 messages🔥):

Gemini 2.0 Flash functionality, VEO 2 vs SORA comparison, OpenInterpreter web assembly integration, Local OS usage, Error handling with Open Interpreter


Torchtune ▷ #general (1 messages):

Torcheval metric sync, Batch processing


Torchtune ▷ #dev (3 messages):

Instruction Fine-Tuning Loss Calculation, Gradient Normalization in Sequence Processing, FSDPModule Adjustments


Torchtune ▷ #papers (7 messages):

GenRM Verifier Model, Sakana AI Memory Optimization, 8B Verifier Performance Analysis, Chain-of-Thought Dataset Generation

Links mentioned:


LLM Agents (Berkeley MOOC) ▷ #hackathon-announcements (1 messages):

Hackathon Submission Deadline, Google Form Submission Process, Project Innovation


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (8 messages🔥):

LLM Agents MOOC website updates, Certificate submission deadlines

Link mentioned: Large Language Model Agents MOOC: no description found


tinygrad (George Hotz) ▷ #general (6 messages):

GPU connectivity via USB, Mac support for arm64 backend, Continuous Integration on Macs

Link mentioned: Tweet from the tiny corp (@tinygrad): Err, you sure you can just plug a GPU into a USB port?


Axolotl AI ▷ #general (6 messages):

Scaling Test Time Compute, Performance of 3b Model vs 70b Model, Missing Optim Code in Repo

Link mentioned: Scaling test-time compute - a Hugging Face Space by HuggingFaceH4: no description found


DSPy ▷ #papers (2 messages):

Impact of Autonomous AI, AI Agents in the Knowledge Economy, Displacement of Workers, AI's Role in Hierarchical Firms

Link mentioned: Artificial Intelligence in the Knowledge Economy: The rise of Artificial Intelligence (AI) has the potential to fundamentally reshape the knowledge economy by solving problems at scale. This paper introduces a framework to study this transformation, ...


Mozilla AI ▷ #announcements (2 messages):

Retrieval Augmented Generation (RAG) application, Developer Hub and Blueprints announcement


MLOps @Chipro ▷ #events (1 messages):

Data Infrastructure Innovations, Data Governance, Data Streaming, Stream Processing, AI in Data Infrastructure

Link mentioned: Year-End Retrospective on Data Infra, Wed, Dec 18, 2024, 9:00 AM | Meetup: AboutThe year 2024 was nothing short of groundbreaking for data infrastructure. We witnessed an exciting flurry of innovations, many driving the ongoing push to make


Gorilla LLM (Berkeley Function Calling) ▷ #discussion (1 messages):

BFCL Leaderboard V3, Function calling capabilities, Model response loading issues

Link mentioned: Berkeley Function Calling Leaderboard V3 (aka Berkeley Tool Calling Leaderboard V3) : no description found




{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}