Frozen AI News archive

not much happened today

**OpenAI** announced their "12 Days of OpenAI" event with daily livestreams and potential releases including the **O1 full model**, **Sora video model**, and **GPT-4.5**. **Google DeepMind** released the **GenCast weather model** capable of **15-day forecasts in 8 minutes** using TPU chips, and launched **Genie 2**, a model generating playable 3D worlds from single images. Leading vision researchers **Lucas Beyer**, **Alexander Kolesnikov**, and **Xiaohua Zhai** moved from DeepMind to OpenAI, which is opening a Zürich office. Criticism arose over OpenAI's strategy and model quality compared to **Anthropic** and **Claude 3.5 Sonnet**. On Reddit, a modified **llama.cpp** supports **Nvidia's Llama-3_1-Nemotron-51B**, which matches the performance of larger 70B models via NAS optimization.

AI News for 12/3/2024-12/4/2024. We checked 7 subreddits, 433 Twitters and 29 Discords (198 channels, and 2915 messages) for you. Estimated reading time saved (at 200wpm): 317 minutes. You can now tag @smol_ai for AINews discussions!

Smol.ai update: Smol Talk now has vision! Previously, when it encountered an image, it would hallucinate; now we do the necessary prompting. See today's Reddit Recaps for an example. Your personalized recaps now get image support as well.

If you are interested in NeurIPS next week, there are 50 more tickets left for our end of year recap event (livestream available, NeurIPS ticket not required). Most speakers have been announced.

Genie 2 has topped HN all day, and we previously covered SIMA, but given that this continues to be (impressive) cherrypickware, we aren't giving it title story status.

o1-full is expected during OpenAI's new advent calendar, just as they poach a bunch of DeepMind researchers. Perhaps it is true that OpenAI is so back.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

Here are the key themes and discussions from the Twitter data, organized by topic:

OpenAI's "12 Days of OpenAI" Launch Announcement

DeepMind's Major Research Releases

High-Profile Talent Moves

Criticism of AI Model Quality

Memes & Humor


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Nemotron-51B Released: Nvidia's NAS Optimized Model Matches 70B Performance

Theme 2. Dynamic 4-bit Quantization: Selective Layer Precision for Better Performance

Theme 3. FishSpeech v1.5: Multilingual Zero-Shot Voice Cloning Breakthrough

Theme 4. ByteDance Intern Drama: ¥8M Lawsuit Winner Gets NeurIPS Best Paper

Other AI Subreddit Recap

r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, /r/LLMDevs, /r/Singularity

Theme 1. OpenAI '12 Days of Shipmas' to Include Sora and O1 Model Releases

Theme 2. New Open Source AI Video Models: Tencent Hunyuan vs LTX Comparison

Theme 3. OpenAI Reaches 300M Weekly Users, Signs Defense Contract

Theme 4. Claude 3.5 vs ChatGPT: User Migration and Comparison Trends


AI Discord Recap

A summary of Summaries of Summaries by O1-preview

Theme 1: Amazon Unveils Nova AI Models, Shakes Up AI Landscape

Theme 2: OpenAI's 12 Days of Announcements Ignite Anticipation

Theme 3: Cursor IDE Outages Push Users Toward Alternatives

Theme 4: NVIDIA's SANA Model Slammed for Draconian License

Theme 5: Pydantic AI Supercharges Development with New Integrations


PART 1: High level Discord summaries

Cursor IDE Discord


Eleuther Discord


OpenAI Discord


aider (Paul Gauthier) Discord


Modular (Mojo 🔥) Discord


Unsloth AI (Daniel Han) Discord


Perplexity AI Discord


OpenRouter (Alex Atallah) Discord


Nous Research AI Discord


Notebook LM Discord


Interconnects (Nathan Lambert) Discord


GPU MODE Discord


Stability.ai (Stable Diffusion) Discord


Latent Space Discord


LM Studio Discord


LlamaIndex Discord


Cohere Discord


DSPy Discord


LLM Agents (Berkeley MOOC) Discord


OpenInterpreter Discord


Torchtune Discord


LAION Discord


tinygrad (George Hotz) Discord


Axolotl AI Discord


Mozilla AI Discord


Gorilla LLM (Berkeley Function Calling) Discord


AI21 Labs (Jamba) Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The HuggingFace Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Cursor IDE ▷ #general (476 messages🔥🔥🔥):

Cursor outages, Changes to Cursor features, Windsurf vs. Cursor performance, OpenAI 12 Days of Announcements, Issues with Cursor's performance


Eleuther ▷ #general (198 messages🔥🔥):

JAX vs PyTorch Performance, Apple's use of AWS AI chips, Training methods and frameworks, Schedule-free optimizers, Embedding techniques for images with coordinates


Eleuther ▷ #research (114 messages🔥🔥):

Gradient Synchronization in Large Models, Performance of Second Order Optimizers, Random Number Generators, Flow Matching vs Diffusion Training, Machine Unlearning Literature


Eleuther ▷ #scaling-laws (1 message):

Scaling Law Codebases, Examples of Scaling Experiments


Eleuther ▷ #interpretability-general (1 message):

deku7041: https://transformer-circuits.pub/


Eleuther ▷ #lm-thunderdome (7 messages):

External Loadable Evals, lm-eval-harness, Dataset Visibility and Versioning, Reproducibility Concerns

Link mentioned: lm-evaluation-harness/lm_eval/main.py at f49b0377bf559f5558e8cd9ebd1190218c7df2a4 · EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of language models. - EleutherAI/lm-evaluation-harness


Eleuther ▷ #multimodal-general (1 message):

Mira Virtual AI tools, Multimodal conversions, Consumer-level GPU frameworks

Link mentioned: GitHub - Mirror-Prismals/Mira-Virtual-Ai: Ai Frameworks for Consumer Level GPU's: Ai Frameworks for Consumer Level GPU's. Contribute to Mirror-Prismals/Mira-Virtual-Ai development by creating an account on GitHub.


Eleuther ▷ #gpt-neox-dev (2 messages):

Logging Configuration, Optimizer Performance Metrics


OpenAI ▷ #annnouncements (1 message):

OpenAI: -# @everyone 12 Days of OpenAI


OpenAI ▷ #ai-discussions (242 messages🔥🔥):

AI Translation Tools, Quantum Computing in Voting, Cohere AI Features, OpenAI File Processing Issues, Hungarian Translation Accuracy

Link mentioned: Mark Johns (@Doomlaser) on Artificial Intelligence, Symbolic Logic, Corporate Fairness & More ∰$❤️🏤.: Follow https://x.com/DoomlaserCORP on Twitter.


OpenAI ▷ #gpt-4-discussions (4 messages):

GPT image reading limitations, LLMs and translation issues, Advanced Voice Mode for Custom GPTs


OpenAI ▷ #prompt-engineering (11 messages🔥):

Improving prompt engineering, Baiting models to think deeper, Using markdown for prompts, Research on GPT response time, Model comparison


OpenAI ▷ #api-discussions (11 messages🔥):

Prompt Engineering, Baiting for Deeper Responses, YAML Prompt Structuring, Model Thinking Time, API Automation Test Cases


aider (Paul Gauthier) ▷ #general (175 messages🔥🔥):

Amazon Bedrock Models, Aider New Features, QwQ Model Performance, User Experience with Aider, Benchmark Results


aider (Paul Gauthier) ▷ #questions-and-tips (67 messages🔥🔥):

Aider Docker Setup, Timeout Issues with Aider, Using Aider with Local Models, Using MCP with Aider, Function Refactoring with Aider


aider (Paul Gauthier) ▷ #links (3 messages):

MCP adoption, OpenAI's development strategy


Modular (Mojo 🔥) ▷ #general (119 messages🔥🔥):

Mojo Networking Features, SIMD in Mojo, High-Performance File Server, Extensible Sockets Development, Async Programming Considerations


Modular (Mojo 🔥) ▷ #mojo (112 messages🔥🔥):

Inline Reference Concept, Memory Optimization Techniques, Compiler Support for Reference Traits, Bounds Checking for Mojo Lists, Auto-tuning in Compilation Phases


Unsloth AI (Daniel Han) ▷ #general (124 messages🔥🔥):

Dynamic 4-bit Quantization, Training Qwen Models, Using Colab for Fine-tuning, Model Performance Issues, SGLang Opinions


Unsloth AI (Daniel Han) ▷ #off-topic (23 messages🔥):

Citation Formats, Continued Pretraining, Model Comparisons, Reddit Communities


Unsloth AI (Daniel Han) ▷ #help (35 messages🔥):

Fine-tuning Llama 3, GGUF conversion issues, ReadTimeout error in Google Colab, Using multiple GPUs with Unsloth, Adapter configuration errors in training


Unsloth AI (Daniel Han) ▷ #showcase (1 message):

Fimbulvntr's article

Link mentioned: Tweet from Fimbul (@fimbulvntr): http://x.com/i/article/1864344035466637312


Unsloth AI (Daniel Han) ▷ #research (1 message):

edd0302: https://x.com/ruliad_ai/status/1864394941029322890


Perplexity AI ▷ #general (156 messages🔥🔥):

Amazon Nova Models, User Interface Issues, Perplexity AI Performance, Pro Subscription Concerns, Model Availability and Extensions


Perplexity AI ▷ #sharing (3 messages):

Heisenberg Heat, Software Optimization Tools, Perplexity API Functionality


Perplexity AI ▷ #pplx-api (8 messages🔥):

API Payment Issues, Enterprise Waitlist, API Quality Complaints, Support Communication, GitHub Discussion Forum


OpenRouter (Alex Atallah) ▷ #announcements (1 message):

alexatallah: 20% price cut for Claude 3.5 Haiku!


OpenRouter (Alex Atallah) ▷ #general (148 messages🔥🔥):

Hermes 405B Free Service Status, Gemini Ultra Access, Amazon Nova Model Discussion, Model Memory Functionality, Custom Provider Keys Beta


OpenRouter (Alex Atallah) ▷ #beta-feedback (6 messages):

Custom Key Beta Access


Nous Research AI ▷ #general (115 messages🔥🔥):

Distributed Training Run, Forge Reasoning API Beta, Live Memory in LLMs, Genesis of AI World Models, Nous Research Art and Design


Nous Research AI ▷ #ask-about-llms (5 messages):

Nous Research Interest, Linux from Scratch as a Benchmark, Precision in Voice Agents, Momentum Concept in Residual Stream


Nous Research AI ▷ #interesting-links (1 message):

jellyberg: https://theaidigest.org/agent


Nous Research AI ▷ #reasoning-tasks (5 messages):

DisTro issues, Logical Consistency, DeLorean Reference


Notebook LM Discord ▷ #announcements (1 message):

NotebookLM+Spotify, Spotify Wrapped AI Podcast

Link mentioned: Listen to your first-ever 2024 Spotify Wrapped AI podcast, built with Google's NotebookLM: NotebookLM is partnering with Spotify to create a personalized Wrapped AI podcast.


Notebook LM Discord ▷ #use-cases (23 messages🔥):

AI audio generation, NotebookLM for sports journalism, Legal content simplification, Multilingual AI discussions, Creative projects using AI


Notebook LM Discord ▷ #general (92 messages🔥🔥):

Notebook LM Language Settings, Notebook LM PDF Capabilities, Notebook LM Features Requests, Google Job Listings, Notebook LM Podcast Integration


Interconnects (Nathan Lambert) ▷ #events (3 messages):

NeurIPS Meetup, Interconnects Open Hangouts


Interconnects (Nathan Lambert) ▷ #news (26 messages🔥):

Amazon Foundation Models, Genie 2, 12 Days of OpenAI, ChatGPT interface updates, Anduril and OpenAI partnership


Interconnects (Nathan Lambert) ▷ #random (18 messages🔥):

Mistral Large performance, OpenAI office in Zürich, Giffmana ethics debate


Interconnects (Nathan Lambert) ▷ #memes (13 messages🔥):

Amazon's New Foundation Models, Concerns about NVIDIA's SANA Licensing, IFEval Benchmark Saturation


Interconnects (Nathan Lambert) ▷ #rl (32 messages🔥):

Reward Function Design, Challenges with Stabilization, Experimentation Procedures


Interconnects (Nathan Lambert) ▷ #posts (17 messages🔥):

OLMo1 Naming Controversy, Discussion on Naming Trends, Nerdsniped Reactions


GPU MODE ▷ #general (17 messages🔥):

Efficient Gram Matrix Computation, Triton for Upper Triangle, cuBLAS and cutlass for Gram matrices, HPC Interview Expectations


GPU MODE ▷ #triton (8 messages🔥):

Triton MLIR Dialects Documentation, Grouped GEMM with TMA, Support Channels for Triton, Kernel Crashes Related to Stages, Triton Gist Issues


GPU MODE ▷ #cuda (5 messages):

Hiring for Code Work, GEMM Kernel Performance, Cache Behavior in GPU Computing


GPU MODE ▷ #beginner (30 messages🔥):

Efficient ML courses, Stanford's CS 229S course, CUDA vs Triton, MIT Han Lab course, Washington's CSE 599K course


GPU MODE ▷ #pmpp-book (7 messages):

CUDA Prerequisites, Warps Scheduling Confusion, Core Definition in GPU, Mixed Execution Units

Link mentioned: NVIDIA Ampere Architecture In-Depth | NVIDIA Technical Blog: Today, during the 2020 NVIDIA GTC keynote address, NVIDIA founder and CEO Jensen Huang introduced the new NVIDIA A100 GPU based on the new NVIDIA Ampere GPU architecture. This post gives you a look...


GPU MODE ▷ #off-topic (33 messages🔥):

Mastodon overview, Nuclear power and GPUs, Environmental impact of GPUs, Efficient training frameworks, AI funding news


GPU MODE ▷ #liger-kernel (1 message):

0x000ff4: okay I have updated my PR about the kto loss


GPU MODE ▷ #🍿 (3 messages):

KernelBench, GPU kernels evaluation, Leaderboard issues

Link mentioned: GitHub - ScalingIntelligence/KernelBench: Contribute to ScalingIntelligence/KernelBench development by creating an account on GitHub.


GPU MODE ▷ #thunderkittens (1 message):

Race Condition in TK's WGMMA+tma, Custom Kernel Implementation, Masking Technique for Matrix, Shared Memory Utilization, CUDA Version Compatibility


Stability.ai (Stable Diffusion) ▷ #general-chat (88 messages🔥🔥):

Scams and Bots in Discord, Starting with Stable Diffusion, Using ComfyUI for AI Art Generation, Troubleshooting Stable Diffusion and LoRA, Performance Analysis Tools for SD


Latent Space ▷ #ai-general-chat (72 messages🔥🔥):

Amazon Nova Models, AWS announcements, PydanticAI, OpenAI's 12 Days, Genie 2 by Google


Latent Space ▷ #ai-announcements (1 message):

swyxio: announced next week's monster paper club https://x.com/swyx/status/1864423257266639166


LM Studio ▷ #general (51 messages🔥):

LM Studio Download Issues, Performance Issues with Windows, RPG Experiment with LLM, Chat API Functionality, Local Network GPU Usage


LM Studio ▷ #hardware-discussion (13 messages🔥):

Arc Battlemage Cards, Running LMS on iGPU, Choosing Models for Writing Assistant, PCIe Configuration with 3090s


LlamaIndex ▷ #blog (9 messages🔥):

Building AI apps on Vercel, Intelligent legal document navigation, Amazon Nova foundation models, AI agents with Google Cloud connections, Super-fast RAG with LlamaIndex


LlamaIndex ▷ #general (53 messages🔥):

Summary Index Performance, Using Workflows for Chat History, AI Community Collaboration, Prompt Optimization for LLMs, Error Handling in BM25Retriever


Cohere ▷ #discussions (15 messages🔥):

Rerank 3.5 Multilingual Support, Google Gemini Functionality, Cohere Toolkit Errors, R+ Word Usage Observations, General AI Preferences


Cohere ▷ #announcements (1 message):

Rerank 3.5, Model deprecations, Multilingual performance, Enhanced reasoning capabilities


Cohere ▷ #questions (6 messages):

API Key Types, ReRanker Performance Issues, Cohere Team Access, Model Sharing


Cohere ▷ #api-discussions (2 messages):

V3.5 Launch, Fine-Tuning API, Base Model Updates


Cohere ▷ #projects (1 message):

Harmony project, LLM matching competition, Data harmonisation, Natural Language Processing, Discord Community


DSPy ▷ #show-and-tell (3 messages):

Pydantic AI, DSLModel development, AI Development Live Demo


DSPy ▷ #general (19 messages🔥):

DSPy Optimizations on AWS Lambda, ProgramOfThought Deprecation, Precision Evaluation in Multi-Class Classification


LLM Agents (Berkeley MOOC) ▷ #hackathon-announcements (2 messages):

Sierra AI Info Session, Hackathon Submission Form, Submission Requirements Guide, Google Forms for submissions, Judging panel and timeline

Link mentioned: LLM Agents MOOC Hackathon - Sierra Information Session: no description found


LLM Agents (Berkeley MOOC) ▷ #mooc-announcements (1 message):

Certificate Declaration Form, Course Completion Tiers, Submission Checklist, Important Due Dates


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (14 messages🔥):

Project Submission Requirements, Quizzes and Certificate Deadlines, Certification Declaration Categories, Feedback on MOOC Experience

Link mentioned: Hackathon Track Submission Requirements: General Submission Requirements for All Tracks Video Presentation: Provide a link to a YouTube video (maximum 3 minutes; please upload to YT) presenting an overview of your project and demonstrating k...


LLM Agents (Berkeley MOOC) ▷ #mooc-lecture-discussion (4 messages):

GPT-4 leaks, Automated closed captioning


OpenInterpreter ▷ #general (15 messages🔥):

Anthropic Development Branch, Open Interpreter Installation Issues, Linux Compatibility, Feedback on OpenAI SDKs

Link mentioned: So Close GIF - So Close This - Discover & Share GIFs: Click to view the GIF


OpenInterpreter ▷ #O1 (1 message):

LiveKit usage, Remote Control via O1, Computer as a Tool, CLI capabilities of OI


Torchtune ▷ #general (1 message):

pjbontrager: I don’t know what you’re talking about 😗😅


Torchtune ▷ #dev (2 messages):

Genie 2 Foundation Model, Generalist Agents Team

Link mentioned: Genie 2: A large-scale foundation world model: Generating unlimited diverse training environments for future general agents


Torchtune ▷ #papers (12 messages🔥):

Federated learning approaches, Community-led GPU contributions, MMLU performance validation, Training timelines, Meta's technology comparison


LAION ▷ #general (5 messages):

Mechanistic Interpretability, Cellular Behavior, Epistemic Advantage


LAION ▷ #research (3 messages):

Non-commercial license concerns, EDM2 framework diffusion models, Class conditioning in diffusion models


tinygrad (George Hotz) ▷ #general (1 message):

Web models, SAM from Meta, Tinygrad showcase

Link mentioned: Segment Anything: Meta AI Computer Vision Research


tinygrad (George Hotz) ▷ #learn-tinygrad (6 messages):

Threadgroup/Grid Sizes, BEAM Search Explanation, Shared Output Buffers in JIT, Manual Upcasting for Loops


Axolotl AI ▷ #announcements (1 message):

Office Hours, Axolotl Survey, Axolotl Swag

Link mentioned: Notion – The all-in-one workspace for your notes, tasks, wikis, and databases.: A new tool that blends your everyday work apps into one. It's the all-in-one workspace for you and your team


Axolotl AI ▷ #general (3 messages):

ADOPT optimizer, Axolotl updates

Link mentioned: Check torch version for ADOPT optimizer + integrating new ADOPT updates by bursteratom · Pull Request #2104 · axolotl-ai-cloud/axolotl: Description: Make sure the torch version is compatible when ADOPT optimizer is used. Incorporated latest changes to ADOPT optimizer made by original author. https://github.com/iShohei220/adopt Motiv...


Mozilla AI ▷ #announcements (1 message):

Open Source Engineer Roles, Unternet Hiring


Gorilla LLM (Berkeley Function Calling) ▷ #discussion (1 message):

Gorilla Model Issue, Protobuf Dependency Error


AI21 Labs (Jamba) ▷ #general-chat (1 message):

Ticket Messaging



{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}