Frozen AI News archive

not much happened today

The **Qwen team** launched **QVQ**, a vision-enabled version of their experimental **QwQ o1 clone**, benchmarking comparably to **Claude 3.5 Sonnet**. Discussions include **Bret Taylor's** insights on autonomous software development as distinct from the Copilot era. The **Latent Space LIVE!** talks cover highlights of **2024 AI startups, vision, open models, post-transformers, synthetic data, smol models, and agents**. Twitter recaps by **Claude 3.5 Sonnet** highlight proposals for benchmarks measuring LLM calibration and falsehood confidence, with **QVQ** outperforming **GPT-4o** and **Claude 3.5 Sonnet**. AI alignment debates focus on intentionality and critiques of alignment faking in models like **Claude**. Updates from **OpenAI** include new **o3 and o3-mini models** and a deliberative alignment strategy. The **ASAL project** is a collaboration between **MIT**, **OpenAI**, and **Swiss AI Lab IDSIA** to automate artificial life discovery. Personal stories reveal frustrations with **USCIS** green card denials despite high qualifications. New tools like **GeminiCoder** enable rapid app creation, and a **contract review agent** using **Reflex** and **LlamaIndex** checks GDPR compliance. Holiday greetings and memes were also shared.


AI News for 12/23/2024-12/24/2024. We checked 7 subreddits, 433 Twitters and 32 Discords (215 channels and 2265 messages) for you. Estimated reading time saved (at 200wpm): 257 minutes. You can now tag @smol_ai for AINews discussions!

The Qwen team launched QVQ, a vision version of their experimental QwQ o1 clone, but the benchmarks mostly bring it up to par with Claude 3.5 Sonnet. There's also some discussion about Bret Taylor's latest post on autonomous software dev (as distinct from the Copilot era).

The individual talks from Latent Space LIVE! are being released to tide you through the holidays and recap the Best of 2024 in AI Startups, Vision, Open Models, Post-Transformers, Synthetic Data, Smol Models, Agents, and more.





{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

AI Models and Benchmarking

AI Alignment and Ethics

Company News and Collaborations

Immigration and Personal Discussions

Technical Tools and Projects

Memes/Humor and Holiday Greetings


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Qwen/QVQ-72B Achieves 70.3 on MMMU Evaluation

Theme 2. Inter-3B Model Comparisons: Llama vs Granite vs Hermes

Theme 3. GGUF Models Now Usable Privately via Hugging Face in Ollama

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT

Theme 1. Criticism of "Gotcha" tests to determine LLM intelligence

Theme 2. 76K robodogs now $1600, and AI is practically free


AI Discord Recap

A summary of Summaries of Summaries by o1-2024-12-17

Theme 1. Next-Gen Model Rivalries

Theme 2. Tools & Dev Integrations

Theme 3. GPU & Speed Scenes

Theme 4. AI in Real-World Applications

Theme 5. RL, Summaries, and Fine-Tuning Hustle


PART 1: High level Discord summaries

Codeium (Windsurf) Discord


Cursor IDE Discord


Nous Research AI Discord


Unsloth AI (Daniel Han) Discord


Stability.ai (Stable Diffusion) Discord


Stackblitz (Bolt.new) Discord


OpenRouter (Alex Atallah) Discord


Perplexity AI Discord


aider (Paul Gauthier) Discord


OpenAI Discord


Modular (Mojo 🔥) Discord


Notebook LM Discord


Latent Space Discord


Interconnects (Nathan Lambert) Discord


LM Studio Discord


GPU MODE Discord


Eleuther Discord


LlamaIndex Discord


Nomic.ai (GPT4All) Discord


LLM Agents (Berkeley MOOC) Discord


tinygrad (George Hotz) Discord


Cohere Discord


OpenInterpreter Discord


DSPy Discord


MLOps @Chipro Discord


LAION Discord


The Axolotl AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Torchtune Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The HuggingFace Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Codeium (Windsurf) ▷ #discussion (54 messages🔥):

Codeium Plugin Issues, Christmas Support Response, Upgrade Pro Status Problem, Dashboard Connection Errors, General User Questions


Codeium (Windsurf) ▷ #windsurf (440 messages🔥🔥🔥):

Windsurf Pricing, Model Performance Comparison, Windsurf Support Issues, AI Model Utilization, User Experiences with Different Models


Cursor IDE ▷ #general (406 messages🔥🔥🔥):

Cursor IDE Updates, Performance Comparisons of AI Models, AppImage Installation on Linux, Local API & Database Setup with Cursor, Community Experiences and Advice


Nous Research AI ▷ #general (207 messages🔥🔥):

Phi-4 Model Performance, Qwen Coder Models, Hugging Face Contributions, Unpaid Internships in Startups, Quantization Methods for LLMs


Nous Research AI ▷ #ask-about-llms (1 messages):

renegado0000: <@&1214801236323467284>


Nous Research AI ▷ #interesting-links (1 messages):

carsonpoole: https://freeaibooksummaries.com


Unsloth AI (Daniel Han) ▷ #general (125 messages🔥🔥):

Quantization and Model Conversion, Using Unsloth for Fine-tuning, Efficiency and VRAM Usage of Unsloth, Introduction of QVQ Model, User Feedback on Unsloth

Links mentioned:

PyTorch: no description found

Charlie Day GIF - Charlie Day - Discover & Share GIFs: Click to view the GIF

Hugging Face – The AI community building the future.: no description found

huihui-ai/Llama-3.2-11B-Vision-Instruct-abliterated · Hugging Face: no description found

Uncensor any LLM with abliteration: no description found

Make Error, fatal error: Python.h: No such file or directory compilation terminated. · Issue #1038 · CMU-Perceptual-Computing-Lab/openpose: In file included from /home/sclab/Downloads/openpose/3rdparty/pybind11/include/pybind11/pytypes.h:12:0, from /home/sclab/Downloads/openpose/3rdparty/pybind11/include/pybind11/cast.h:13, from /home/...


Unsloth AI (Daniel Han) ▷ #off-topic (5 messages):

fine-tuning llama 3.2:3B, QLoRa method, sprint mode


Unsloth AI (Daniel Han) ▷ #help (11 messages🔥):

Unsloth vs Ollama, Fine-tuning multimodal LLMs, Translation evaluation support, Issues with model saving, Inference speed and memory


Unsloth AI (Daniel Han) ▷ #showcase (7 messages):

Unsloth Pro Availability, Multi-GPU Testing, Contacting Support


Unsloth AI (Daniel Han) ▷ #research (6 messages):

Open Source Embedding Models, Mixed Bread Embedding

Link mentioned: mixedbread-ai/mxbai-embed-large-v1 · Hugging Face: no description found


Stability.ai (Stable Diffusion) ▷ #general-chat (139 messages🔥🔥):

Scammers in Discord servers, Renting GPUs, Inpainting with AI, Video generation models, Using Stable Diffusion in offline mode


Stackblitz (Bolt.new) ▷ #prompting (7 messages):

AI Efficiency Studies, Data Learning Patterns, Project Access Limitations

Link mentioned: TikTok - Make Your Day: no description found


Stackblitz (Bolt.new) ▷ #discussions (128 messages🔥🔥):

MongoDB connection issues, Bolt project experiences, Team pricing structure, Mobile usability of Bolt, Using AI tools for development


OpenRouter (Alex Atallah) ▷ #announcements (1 messages):

Web Search for LLMs, Price cuts for various models, New Endpoints API

Link mentioned: Tweet from OpenRouter (@OpenRouterAI): Holiday 🎁 experiment: Web Search, but for any LLM! Here's Sonnet with & without grounding:


OpenRouter (Alex Atallah) ▷ #general (128 messages🔥🔥):

SambaNova Model Parameters, Tier 5 API Key Requests, Web Search Feature in Chat, Qwen Model Performances, Claude 3.5 Comparison


Perplexity AI ▷ #general (84 messages🔥🔥):

Perplexity performance concerns, AGI development discussions, Subscription issues and cancellations, Upcoming AI model expectations, Community holiday wishes


Perplexity AI ▷ #sharing (10 messages🔥):

O3 Model Debut by OpenAI, FDA's New Healthy Food Label, NASA touches the sun, Apple's nearing $4T valuation, LLMAAS discussions

Link mentioned: YouTube: no description found


Perplexity AI ▷ #pplx-api (2 messages):

Credit card management, Llama 3


aider (Paul Gauthier) ▷ #general (64 messages🔥🔥):

Aider usage and features, Real-time voice interaction UI, Qwen and QVQ models, Benchmarking AI models, Holidays and community engagement


aider (Paul Gauthier) ▷ #questions-and-tips (13 messages🔥):

.gitignore functionality, Use of CONVENTIONS.md, Specifying architect model in YAML, Aider's /help commands, Cursor IDE features


aider (Paul Gauthier) ▷ #links (1 messages):

epicureus: https://bigcode-bench.github.io


OpenAI ▷ #ai-discussions (63 messages🔥🔥):

Meta's Concept Models, AI Self-Improvement, AI Coaches, Programming with AI, Claude vs. ChatGPT


OpenAI ▷ #gpt-4-discussions (4 messages):

O1 model capabilities, Setting up ESLint, Config file handling


OpenAI ▷ #prompt-engineering (4 messages):

Memory in Personalization, Recipe Generation, Protein Calculation Issues


OpenAI ▷ #api-discussions (4 messages):

Memory in Personalization, Complexity in Recipe Creation, GPT-4o Protein Calculation Issues


Modular (Mojo 🔥) ▷ #general (11 messages🔥):

Standard Library Bug Fix, High-Frequency Trading with Mojo, Mojo Networking Limitations, Algorithmic Trading Insights


Modular (Mojo 🔥) ▷ #mojo (21 messages🔥):

Mojo GPU support, CPU performance benchmarks, Standard library bug fix, Mojo vs Julia for scientific computing


Modular (Mojo 🔥) ▷ #max (27 messages🔥):

Mojo Kernels, JAX Compilation Times, Mandelbrot Implementation in MAX, Comparison of MAX and JAX, Python Graph Construction for Mojo

Link mentioned: Issues · modularml/max: A collection of sample programs, notebooks, and tools which highlight the power of the MAX Platform - Issues · modularml/max


Notebook LM Discord ▷ #use-cases (9 messages🔥):

Akas App for Podcast Sharing, Using NotebookLM for Book Series, RSS Feed Discussion, Podcast Generation with Google News

Link mentioned: Akas: Voice to your personal thoughts: Akas is the ultimate platform for sharing AI-generated podcasts and your own voice. With more and more podcasts being created by AI, like those from NotebookLM and other platforms, Akas provides a sea...


Notebook LM Discord ▷ #general (41 messages🔥):

NotebookLM Bug Reports, AI Generated Podcasts Sharing, Google Project Mariner, User Interface Feedback, Annual Review with LLM Tools


Latent Space ▷ #ai-general-chat (40 messages🔥):

Large Concept Models, OCTAVE Speech-Language Model, xAI's Series C Funding, AI Engineer Summit, Autonomous Development


Latent Space ▷ #ai-announcements (2 messages):

Post-Transformers, Synthetic Data, Smol Models, Long Context vs RAG, Model Collapse


Interconnects (Nathan Lambert) ▷ #news (6 messages):

QvQ 72B Model Release, Holiday Shipping Strategy, Cultural Perspectives on Christmas


Interconnects (Nathan Lambert) ▷ #random (12 messages🔥):

O1/O3 trajectory generation, RM-guided decoding vs. RM-weighted decoding, Majority voting in LLMs, Length control in reasoning streams, Meta paper on self-awareness in language models


Interconnects (Nathan Lambert) ▷ #nlp (5 messages):

LM Reasoning Papers, Chain-of-Thought Research, Self-Training Techniques, Decoding Approaches


Interconnects (Nathan Lambert) ▷ #rl (2 messages):

OpenAI O3 Speculation, Recruitment Opportunities, Reinforcement Learning Techniques, Chain-of-Thought Generation

Link mentioned: Tweet from leloy! (@leloykun): we're all probably overthinking what OpenAI did with O3 here's my best guess, stitched together from tweets from OpenAI employees, their blog posts, and rumors: 1. They collected a ton of proces...


Interconnects (Nathan Lambert) ▷ #cv (4 messages):

QVQ Visual Reasoning Model, Product Rule in Calculus, Llava-Critic for Reasoning, LLM as Judge in Exploration, Model Exploration Techniques

Link mentioned: QVQ: To See the World with Wisdom: Language and vision intertwine in the human mind, shaping how we perceive and understand the world around us. Our ability to reason is deeply rooted i...


Interconnects (Nathan Lambert) ▷ #reads (11 messages🔥):

Dylan's Rant on AMD Software, Subscription for Information, Investment Tips on Nvidia


Interconnects (Nathan Lambert) ▷ #lectures-and-projects (2 messages):

Tay-loo, YouTube link


LM Studio ▷ #general (24 messages🔥):

LM Studio CPU compatibility, Loading models in LM Studio, Granite models performance, Merry Christmas wishes, OpenAI API access from Russia


LM Studio ▷ #hardware-discussion (15 messages🔥):

CPU inference with EPYC processors, PCIe risers, VRAM utilization with multiple GPUs, Text2Video in ComfyUI, Model performance comparisons


GPU MODE ▷ #general (4 messages):

Symbolic Integers in PyTorch, Float Handling in PyTorch, Torch Compilation Behavior

Link mentioned: torch — PyTorch 2.5 documentation: no description found


GPU MODE ▷ #triton (8 messages🔥):

Type Hints in Triton, Async Operations and Warp Specialization, Building Triton from Source, boundary_check Usage, Recent Changes in Triton Functions


GPU MODE ▷ #cuda (4 messages):

Infrastructure choices, AWS Pricing, Bare Metal Solutions, Single Warp


GPU MODE ▷ #torch (12 messages🔥):

PyTorch profiler visualizations, CUDA memory usage debugging, GPU benchmarking methods in Triton and Torch, JAX equivalents for benchmarking, Kernel timing in GPU operations


GPU MODE ▷ #triton-puzzles (1 messages):

pycairo installation, Python.h error, Bash and Fish shell export, Searching for Python.h path


GPU MODE ▷ #bitnet (1 messages):

BitNet Training with Ternary Weights, Noise Step Paper, Efficient Model Formats


GPU MODE ▷ #arc-agi-2 (6 messages):

OREO Method for Offline Reinforcement Learning, Fine-tuning LLMs with RL, GitHub Resources for RL, Hugging Face TRL Library, Collaboration on ARC-AGI-2 Projects


Eleuther ▷ #general (3 messages):

Pythia Model Pretraining, AI Hallucinations


Eleuther ▷ #research (16 messages🔥):

Automated Search for Artificial Life, LLMs with Offline Coprocessor, Linear Attention in Diffusion Transformers


LlamaIndex ▷ #blog (3 messages):

Document agent for SKU matching, Contract review agent with GDPR compliance


LlamaIndex ▷ #general (11 messages🔥):

Ollama LLM Context Window Issue, VectorIndexRetriever Serialization Problem, Chroma Vector DB Usage, Recursive Retriever Implementation, Message Batching API Access

Link mentioned: [Question]: TypeError: Object of type VectorIndexRetriever is not JSON serializable in Structured Hierarchical Retrieval · Issue #11478 · run-llama/llama_index: Question Validation I have searched both the documentation and discord for an answer. Question I have been playing around llama-index and try to learn rag implementations using it. The structured h...


Nomic.ai (GPT4All) ▷ #general (13 messages🔥):

Azure AI cost-effectiveness, Vision model functionality, Using o1 in GPT4All, Proxying GPT4All to Ollama, LocalFiles document querying


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (12 messages🔥):

Certificate Declaration Form Confirmation, Quizzes and Assignments Feedback, Upcoming MOOC Sessions, Article Review Confirmation


tinygrad (George Hotz) ▷ #general (4 messages):

AMD software situation, SemiAnalysis critiques, Monopoly concerns, Lean proof bounty


tinygrad (George Hotz) ▷ #learn-tinygrad (6 messages):

Discord Rules, Torch to Tinygrad Conversion, Python Library Usability


Cohere ▷ #discussions (5 messages):

Christmas Cheer, Snoopy as Santa, X-mas Planning

Link mentioned: Its Christmas Eve GIF - Christmas Eve Snoopy Santa Claus - Discover & Share GIFs: Click to view the GIF


Cohere ▷ #cmd-r-bot (1 messages):

donny_52_61107: GM


OpenInterpreter ▷ #general (4 messages):

User Frustration, Hours Spent on Computer, Technical Issues


OpenInterpreter ▷ #ai-content (1 messages):

singular5547: https://computer.tldraw.com/


DSPy ▷ #show-and-tell (1 messages):

pyn8n v4, Dynamic Workflow Generation, Conversational CLI, Ash Framework Integration, n8n API Wrapper

Link mentioned: pyn8n: N8N client and AI tools.


MLOps @Chipro ▷ #general-ml (1 messages):

breezy.badger: pretty cool thanks for sharing!


LAION ▷ #general (1 messages):

GPT-4o Image Generation

Link mentioned: Tweet from Greg Brockman (@gdb): A GPT-4o generated image — so much to explore with GPT-4o's image generation capabilities alone. Team is working hard to bring those to the world.







{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AINews, please share with a friend! Thanks in advance!

{% endif %}