Frozen AI News archive

Meta Llama 3.3: 405B/Nova Pro performance at 70B price

**Meta AI** released **Llama 3.3 70B**, matching the performance of the 405B model with improved efficiency using *"a new alignment process and progress in online RL techniques"*. **OpenAI** announced **Reinforcement Fine-Tuning (RFT)** for building expert models with limited data, offering alpha access to researchers and enterprises. **Google DeepMind's Gemini-Exp-1206** leads benchmarks, tying with **GPT-4o** in coding performance. **LlamaCloud** enhanced document processing with table extraction and analytics. Discussions on **OpenAI's** pricing plans continue in the community.

Canonical issue URL

AI News for 12/5/2024-12/6/2024. We checked 7 subreddits, 433 Twitters and 31 Discords (206 channels and 5628 messages) for you. Estimated reading time saved (at 200wpm): 535 minutes. You can now tag @smol_ai for AINews discussions!

Meta AI, sensibly waiting for OpenAI to release an o1 finetuning waitlist, thankfully kept their sane versioning strategy and simply bumped their Llama minor version yet again to 3.3, this time matching 405B performance with their 70B model, using "a new alignment process and progress in online RL techniques". No papers, of course.


Amazon Nova Pro had all of 3 days to sit and look pretty, but with Meta loudly advertising the same performance at 12% of the cost, it has been smacked back down in the hierarchy of price-to-performance ratios.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

Here are the key themes and discussions from the Twitter activity, organized by major topics:

Meta's Llama 3.3 70B Release

OpenAI's Reinforcement Fine-Tuning (RFT) Announcement

Google's Gemini Performance Updates

LlamaCloud & Document Processing

Memes and Industry Commentary


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Llama 3.3 70B Performance vs. GPT-4o and Others

Theme 2. Open Source O1: Call for Better Models

Theme 3. Windsurf Cascade System Prompt Details

Theme 4. HuggingFace Course: Preference Alignment for LLMs

Theme 5. Adobe Releases DynaSaur Code for Self-Coding AI

Other AI Subreddit Recap

/r/machinelearning, /r/openai, /r/stablediffusion, /r/ArtificialInteligence, /r/LLMDevs, /r/Singularity

Theme 1. OpenAI GPT-4.5: Surpassing Expectations in Creative Language Tasks


AI Discord Recap

A summary of Summaries of Summaries by O1-mini

Theme 1. AI Model Releases and Performance Battle Royale

Theme 2. Pricing Shakeups Spark User Grievances

Theme 3. Tool Stability Fails and User Frustrations

Theme 4. Feature Enhancements and New Integrations Unveiled

Theme 5. Community Concerns: Security, Licensing, and Fake Apps


PART 1: High-level Discord summaries

Codeium / Windsurf Discord


Notebook LM Discord


Unsloth AI (Daniel Han) Discord


Cursor IDE Discord


OpenRouter (Alex Atallah) Discord


Eleuther Discord


aider (Paul Gauthier) Discord


Interconnects (Nathan Lambert) Discord


Bolt.new / Stackblitz Discord


Stability.ai (Stable Diffusion) Discord


OpenAI Discord


Modular (Mojo 🔥) Discord


Perplexity AI Discord


LM Studio Discord


Cohere Discord


Latent Space Discord


Nous Research AI Discord


GPU MODE Discord


Torchtune Discord


LlamaIndex Discord


OpenInterpreter Discord


LLM Agents (Berkeley MOOC) Discord


Axolotl AI Discord


DSPy Discord


tinygrad (George Hotz) Discord


LAION Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The HuggingFace Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Codeium / Windsurf ▷ #announcements (2 messages):

Cascade pricing changes, Dedicated ticketing system for support

Links mentioned:


Codeium / Windsurf ▷ #discussion (456 messages🔥🔥🔥):

Windsurf pricing changes, User frustrations with AI tools, Alternatives to Windsurf, Impact of server issues on user experience, Feedback on AI tool performance

Links mentioned:


Codeium / Windsurf ▷ #windsurf (751 messages🔥🔥🔥):

Windsurf Pricing Changes, User Reactions to New Limits, Comparison with Other AI Tools, Grandfathering for Existing Users, Performance of AI Models

Links mentioned:


Notebook LM Discord ▷ #use-cases (212 messages🔥🔥):

Audio Generation, NotebookLM Use Cases, Language Support, Game Development, Text-to-Speech Technology

Links mentioned:


Notebook LM Discord ▷ #general (94 messages🔥🔥):

NotebookLM PDF handling, Podcast generation limits, Language setting issues, Notebook creation button, General performance and usability feedback

Links mentioned:


Unsloth AI (Daniel Han) ▷ #general (217 messages🔥🔥):

PaliGemma 2 Release, Qwen Model Fine-Tuning Issues, Unsloth Pro Updates, Llama 3.3 Release, Memory Issues with QLORA

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (4 messages):

Google Summer of Code 2025, Editing messages in Discord, Latex formatting


Unsloth AI (Daniel Han) ▷ #help (42 messages🔥):

Fine-tuning vs RAG, Conversational AI Design, Training Time Estimates, LoRA Fine-Tuning for Models, Multi-GPU Training Support


Cursor IDE ▷ #general (250 messages🔥🔥):

Cursor performance issues, Comparison with Windsurf, Updates in Cursor 0.43.6, User experiences with Composer, Unit testing with Cursor

Links mentioned:


OpenRouter (Alex Atallah) ▷ #announcements (3 messages):

Author Pages feature, New Amazon Nova models, DeepInfra price drops, Launch of Llama 3.3, Text-based use cases

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (235 messages🔥🔥):

Amazon Nova Models, OpenAI Updates, Llama 3.3 Launch, Anthropic Model Expectations, InternVL Models

Links mentioned:


OpenRouter (Alex Atallah) ▷ #beta-feedback (5 messages):

Custom Beta Keys, Integration Beta Feature


Eleuther ▷ #general (26 messages🔥):

Meetup in San Francisco, OpenAI API terminology, Introduction of new members, Collaboration on solving literary puzzles, Discussion on model performance

Links mentioned:


Eleuther ▷ #research (183 messages🔥🔥):

MoE-lite motif, Goldfinch architecture, Layerwise token value embeddings, KV cache optimization, Dynamic weight adjustments

Links mentioned:


Eleuther ▷ #interpretability-general (8 messages🔥):

Updated Mechanistic Interpretability Resources, Community Feedback on Neuronpedia and SAELens, Neel's Annotated Paper List, Outdated Mechanistic Interpretation Materials

Links mentioned:


Eleuther ▷ #gpt-neox-dev (1 messages):

karatsubabutslower: CC <@367104793292046338> Any hints for this?


aider (Paul Gauthier) ▷ #announcements (1 messages):

Aider v0.67.0, Amazon Bedrock Nova models, Command enhancements, Process suspension support, Exception capture analytics


aider (Paul Gauthier) ▷ #general (148 messages🔥🔥):

Aider Pro Features, New AI Models Benchmarking, Gemini 1206 Release, DeepSeek Performance, User Expectations for APIs

Links mentioned:


aider (Paul Gauthier) ▷ #questions-and-tips (46 messages🔥):

Feeding Documentation to Aider, Setting Up API Key for Gemini, Using GCP VertexAI, Aider Caching Issues, Aider Test Command Bug


Interconnects (Nathan Lambert) ▷ #events (5 messages):

Networking opportunities for Engineers, Interconnects merchandise


Interconnects (Nathan Lambert) ▷ #news (144 messages🔥🔥):

Gemini-exp-1206, Llama 3.3, Qwen2-VL-72B, Reinforcement Fine-Tuning, AI2 All Hands

Links mentioned:


Interconnects (Nathan Lambert) ▷ #random (28 messages🔥):

AI2 Demos, o1 Usage, Codeium Pricing, OpenAI's o1 Access Limits, Tulu in Chatbotarena

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (18 messages🔥):

OpenAI o1 model regression, Competition among AI models, Meta's silence on AI developments, Performance of Deepseek and Qwen, Challenges with LLM reasoning

Links mentioned:


Bolt.new / Stackblitz ▷ #prompting (17 messages🔥):

Feature Requests Management, Token Savings on Edits, Web Container Development, Community Assistance, Motivation in Projects

Link mentioned: Bolters.IO | Community Supported knowledge base


Bolt.new / Stackblitz ▷ #discussions (166 messages🔥🔥):

GitHub Repo Integration, Local Storage vs Backend Integration, Token Management, Feature Requests and Improvements, Open Source Bolt Enhancements

Links mentioned:


Stability.ai (Stable Diffusion) ▷ #general-chat (182 messages🔥🔥):

Reactor for Face Swap, Discord for AI Discussions, Cloud GPU Providers, Using Lora and ControlNet, Stable Diffusion Models for Realism

Links mentioned:


OpenAI ▷ #annnouncements (1 messages):

Reinforcement Fine-Tuning, 12 Days of OpenAI

Link mentioned: 12 Days of OpenAI: Day 2: Begins at 10am PT. Join Mark Chen, SVP of OpenAI Research, Justin Reese, Computational Researcher in Environmental Genomics and Systems Biology, Berkeley Lab, ...


OpenAI ▷ #ai-discussions (116 messages🔥🔥):

O1 Expectations, Gemini Experimental Model, Advanced Voice Mode, ChatGPT-4o Performance, Pricing and Value Discussion


OpenAI ▷ #gpt-4-discussions (13 messages🔥):

GPT Editing Collaboration, ChatGPT App Integrations, Custom GPT Deletion Impacts


OpenAI ▷ #prompt-engineering (11 messages🔥):

Self-correcting models, Using OCR for financial data, Challenges with LLMs in data extraction, Open source OCR libraries, Improving PDF workflows


OpenAI ▷ #api-discussions (11 messages🔥):

Self-Correcting Models, Financial Data Extraction Techniques, OCR Libraries for PDFs, Agentic Frameworks, Integrating Data Sources


Modular (Mojo 🔥) ▷ #general (1 messages):

VSCode Extension Issues, Test Configuration


Modular (Mojo 🔥) ▷ #mojo (147 messages🔥🔥):

Mojo Syntax and Functionality, Learning Paths for Programming, Compiler Design and Metaprogramming, Blockchain and Programming Languages, Education Experiences in Computer Science

Links mentioned:


Perplexity AI ▷ #general (89 messages🔥🔥):

Perplexity AI's code interpreter, Fake Perplexity app on Windows, Llama 3.3 model update, Grok and Groq for API usage, OpenAI API integration concerns

Links mentioned:


Perplexity AI ▷ #sharing (8 messages🔥):

Writing Prompts, Web Design, Oldest Alphabetic Writing, Meaning Exploration, Longevity Research


Perplexity AI ▷ #pplx-api (2 messages):

RAG feature in Perplexity API, Perplexity Trends App


LM Studio ▷ #general (63 messages🔥🔥):

LM Studio Uninstall Behavior, Paligemma 2 Release, RAG File Limitations, RAM Upgrade Discussion, LM Studio Compatibility with Whisper Models

Link mentioned: Tweet from Prince Canuma (@Prince_Canuma): mlx-vlm v0.1.4 is here 🎉 New models: @GoogleDeepMind Paligemma 2. Up next 🚧: Refactoring. Get started: pip install -U mlx-vlm. Please leave us a star and send a PR :)


LM Studio ▷ #hardware-discussion (8 messages🔥):

GPU Control in Apps, Benchmarks for Llama 3.1, 4090 Pricing Surge, Chinese Modding for 4090

Link mentioned: How do I select which GPU to run a job on?: In a multi-GPU computer, how do I designate which GPU a CUDA job should run on? As an example, when installing CUDA, I opted to install the NVIDIA_CUDA-<#.#>_Samples then ran ...
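The GPU-selection question linked above usually comes down to the `CUDA_VISIBLE_DEVICES` environment variable, which must be set before any CUDA context is created. A minimal sketch (the `torch.device` mapping in the comment is illustrative, not from the thread):

```python
import os

# Restrict this process to physical GPU 1. This must happen before any CUDA
# context is created (i.e., before initializing a CUDA framework).
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

# The CUDA runtime now only sees GPU 1, renumbered as device 0, so e.g.
# torch.device("cuda:0") in PyTorch would map onto physical GPU 1.
print(os.environ["CUDA_VISIBLE_DEVICES"])  # → 1
```

The same variable works at the shell level, e.g. `CUDA_VISIBLE_DEVICES=1 python train.py`.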


Cohere ▷ #discussions (28 messages🔥):

Rerank 3.5 Model, AI Cost Concerns, Reinforcement Fine Tuning

Link mentioned: Introducing Rerank 3.5: Precise AI Search: Rerank 3.5 delivers improved reasoning and multilingual capabilities to search complex enterprise data with greater accuracy. 


Cohere ▷ #announcements (1 messages):

Structured Outputs for Tool Use, Command models, Chat API V2 compatibility

Link mentioned: Structured Outputs — Cohere: This page describes how to get Cohere models to create outputs in a certain format, such as JSON.


Cohere ▷ #questions (7 messages):

Connector Access without Public URL, Recent Updates on Command R Model, Cohere IP Allowlisting, Document Error in Cohere API, Specifying Multilingual in Fine-Tuning


Cohere ▷ #api-discussions (32 messages🔥):

Cohere vs OpenAI, Rate Limit Concerns, Image Embedding Errors, Support Experience, Retry Mechanism for API Calls


Cohere ▷ #projects (1 messages):

vnc-lm, LiteLLM integration, API connections, Threaded conversations, Model switching feature

Link mentioned: GitHub - jake83741/vnc-lm: Message with Claude 3.5 Sonnet, Llama 3.3, GPT-4o, and other LLMs through Discord.


Cohere ▷ #cohere-toolkit (2 messages):

Introduction, Community Welcome


Latent Space ▷ #ai-general-chat (65 messages🔥🔥):

Writer's Built-in RAG Tool, ShellSage Project, Reinforcement Fine-Tuning API, Gemini Exp 1206 Update, AI Essays and Industry Insights

Links mentioned:


Latent Space ▷ #ai-in-action-club (1 messages):

kbal11: AI in Action


Nous Research AI ▷ #general (45 messages🔥):

Nous Distro, Llama 3.3 Model Release, Evaluation Metrics on Models, Continuous Learning Experiments, Safety Concerns in AI Outputs

Link mentioned: Tweet from Ahmad Al-Dahle (@Ahmad_Al_Dahle): Introducing Llama 3.3 – a new 70B model that delivers the performance of our 405B model but is easier & more cost-efficient to run. By leveraging the latest advancements in post-training techniques in...


Nous Research AI ▷ #ask-about-llms (18 messages🔥):

Chronic Kidney Disease Detection, Fine-Tuning Mistral Models, Using Unsloth for Classification, Data Formatting for Model Training, LightGBM for Tabular Data

Links mentioned:


GPU MODE ▷ #general (9 messages🔥):

Popcorn Project, Timeline for Launch, Benchmarking GPUs, FP8 vs INT8 Performance


GPU MODE ▷ #triton (2 messages):

Nvidia Nsight, Triton release plans, TMA descriptors, Nightly builds issues


GPU MODE ▷ #cuda (6 messages):

SASS code extraction, nvdisasm utility, ncu tool features, Compiler Explorer


GPU MODE ▷ #cool-links (1 messages):

mobicham: https://x.com/Ahmad_Al_Dahle/status/1865071436630778109 Llama 3.3 is out


GPU MODE ▷ #beginner (7 messages):

CUDA kernel compilation, Optimizing Pybind usage, Ninja build system, Using raw types with CUDA


GPU MODE ▷ #pmpp-book (1 messages):

Lecture 37 on SASS, YouTube clips, Triton and CUDA

Link mentioned: Lecture 37: Introduction to SASS & GPU Microarchitecture: Speaker: Arun Demeure. Slides: https://github.com/gpu-mode/lectures/tree/main/lecture_037


GPU MODE ▷ #torchao (1 messages):

Quantization in TorchAO, Implementation Details, Recommended Files for Starting


GPU MODE ▷ #off-topic (6 messages):

Meta intern team matching, Ultralytics package compromise, Discord thread visibility timing

Link mentioned: Discrepancy between what's in GitHub and what's been published to PyPI for v8.3.41 · Issue #18027 · ultralytics/ultralytics: Bug: Code in the published wheel 8.3.41 is not what's in GitHub and appears to invoke mining. Users of ultralytics who install 8.3.41 will unknowingly execute an xmrig miner. Examining the file uti...


GPU MODE ▷ #triton-puzzles (2 messages):

MID clarification, Tensor shapes


GPU MODE ▷ #self-promotion (2 messages):

LTX Video Model Implementation, Performance on RTX 4060 and RTX 4090

Links mentioned:


GPU MODE ▷ #🍿 (6 messages):

Security concerns in competitions, Common attack vectors, Impact of trolling in niche communities


Torchtune ▷ #announcements (1 messages):

Llama 3.3 release, Torchtune finetuning support

Link mentioned: torchtune/recipes/configs/llama3_3 at main · pytorch/torchtune: PyTorch native finetuning library. Contribute to pytorch/torchtune development by creating an account on GitHub.


Torchtune ▷ #general (19 messages🔥):

LoRA training changes, Alpaca training defaults, European access to the platform

Links mentioned:


Torchtune ▷ #papers (1 messages):

Crypto Lottery, LLM Agreements


LlamaIndex ▷ #blog (3 messages):

LlamaParse, Hybrid Search with MongoDB, Multimodal Parsing


LlamaIndex ▷ #general (10 messages🔥):

WorkflowTimeoutError, Using ReAct agent, Tool description length limitation, Accessing output JSON in Python

Links mentioned:


OpenInterpreter ▷ #general (6 messages):

1.0 preview performance, Access to the app, MacOS availability, Supported models for interpreter tool


OpenInterpreter ▷ #O1 (5 messages):

API availability, Reinforcement fine tuning, Upcoming AI features


OpenInterpreter ▷ #ai-content (2 messages):

Reinforcement Fine-Tuning, Llama 3.3 Release

Links mentioned:


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (5 messages):

Spring Term 2025 MOOC, Grading Lab Assignments, OpenAI Credit Card Issues


LLM Agents (Berkeley MOOC) ▷ #mooc-lecture-discussion (5 messages):

Lecture Slides Delay, Recordings for Captioning, Course Website Updates


Axolotl AI ▷ #general (10 messages🔥):

Llama 3.3 Release, Model Request Issues, Quality Bounds in SFT vs RL


DSPy ▷ #general (7 messages):

DSPy Module Optimization, RAG System Context Issue


tinygrad (George Hotz) ▷ #general (4 messages):

tinygrad stats, VPS billing, Hetzner infrastructure


LAION ▷ #general (1 messages):

Personification of Cells, Osmosis Jones






{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}