
Summer of Code AI: $1.6b raised, 1 usable product

**Code + AI** is emphasized as a key modality in AI engineering, highlighting its productivity and verifiability benefits. Recent major funding rounds include **Cognition AI raising $175M**, **Poolside raising $400M**, **Codeium raising $150M**, and **Magic raising $320M**. Magic announced its **LTM-2** model with a **100 million token context window**, claiming its sequence-dimension algorithm is roughly **1000x cheaper** per decoded token than the attention mechanism in **Llama 3.1 405B**, with drastically lower memory requirements. Magic's stack is written from scratch with custom CUDA and no open-source foundations; the company has partnered with **Google Cloud** and runs on **NVIDIA H100** and **GB200 GPUs**, aiming to scale to tens of thousands of GPUs over time. Google DeepMind revealed updates to **Gemini Advanced** with customizable expert "Gems." Neural game engines like **GameNGen** can run DOOM inside a diffusion model trained on **0.9B frames**. The content also references **LLM quantization** research shared by Rohan Paul.


AI News for 8/28/2024-8/29/2024. We checked 7 subreddits, 400 Twitters and 30 Discords (213 channels and 2980 messages) for you. Estimated reading time saved (at 200wpm): 338 minutes. You can now tag @smol_ai for AINews discussions!

One of the core theses in the Rise of the AI Engineer is that code is first among equals of the many modalities that will emerge. Beyond the obvious virtuous cycle (code faster -> train faster -> code faster), it also has the nice properties of being 1) internal facing (so a lower, but nonzero, liability for errors), 2) a boost to developer productivity (one of the most costly headcounts), and 3) verifiable/self-correcting (in the Let's Verify Step by Step sense).
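That third property is mechanical: generated code can be executed against its tests, and failures can simply be resampled. A minimal sketch of that verify-and-retry loop, where `generate()` is a hypothetical stand-in for any code LLM call:

```python
import os
import subprocess
import sys
import tempfile

def passes_tests(candidate: str, tests: str) -> bool:
    """Run candidate code plus its tests in a subprocess; exit code 0 = verified."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(candidate + "\n\n" + tests)
        path = f.name
    try:
        result = subprocess.run([sys.executable, path], capture_output=True, timeout=30)
        return result.returncode == 0
    finally:
        os.remove(path)

def generate_until_verified(generate, prompt: str, tests: str, max_tries: int = 4):
    """Sample, verify, and resample on failure: the self-correcting loop."""
    for _ in range(max_tries):
        candidate = generate(prompt)  # generate() is a hypothetical LLM call
        if passes_tests(candidate, tests):
            return candidate
        prompt += "\n# previous attempt failed its tests; try again"
    return None
```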

This Summer of Code kicked off with:

- Cognition AI raising $175M
- Poolside raising $400M

Today, we have:

- Codeium raising $150M
- Magic raising $320M

While Codeium is the only product of the 4 you can actually use today, Magic's announcement is the more notable one, because of its promising long-context utilization (evaluated via their new HashHop eval; see the sketch below) and the efficiency details teased by Nat Friedman at the previous raise:

For each decoded token, LTM-2-mini’s sequence-dimension algorithm is roughly 1000x cheaper than the attention mechanism in Llama 3.1 405B for a 100M token context window. The contrast in memory requirements is even larger – running Llama 3.1 405B with a 100M token context requires 638 H100s per user just to store a single 100M token KV cache. In contrast, LTM requires a **small fraction of a single H100’s HBM per user** for the same context.
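The 638-H100 figure roughly checks out on the back of an envelope, assuming Llama 3.1 405B's published shape (126 layers, 8 grouped-query KV heads, head dimension 128), 16-bit KV entries, and 80GB of HBM per H100:

```python
# Back-of-envelope KV cache sizing for Llama 3.1 405B at 100M tokens of context.
layers, kv_heads, head_dim = 126, 8, 128  # published Llama 3.1 405B config
bytes_per_entry = 2                        # fp16/bf16 KV entries (assumed)

kv_bytes_per_token = 2 * layers * kv_heads * head_dim * bytes_per_entry  # K and V
cache_bytes = kv_bytes_per_token * 100_000_000  # ~51.6 TB for one user's cache

h100_hbm = 80e9                            # 80 GB of HBM per H100 (assumed)
print(cache_bytes / h100_hbm)              # ~645 H100s, near the quoted 638
```

Exact accounting (HBM reserved for weights and activations, GiB vs GB) would explain the small gap with Magic's 638.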

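On HashHop itself: Magic describes it as filling the context with shuffled pairs of random hashes and asking the model to complete multi-hop chains, so no semantic cues survive. A minimal sketch of generating one eval instance under that description (the details of Magic's actual harness are assumptions here):

```python
import random
import secrets

def make_hashhop(n_distractor_pairs: int = 500, hops: int = 3):
    """Build a HashHop-style prompt: shuffled hash -> hash pairs plus a
    query whose answer requires chaining `hops` lookups."""
    chain = [secrets.token_hex(8) for _ in range(hops + 1)]
    pairs = list(zip(chain, chain[1:]))  # the hidden multi-hop chain
    pairs += [(secrets.token_hex(8), secrets.token_hex(8))
              for _ in range(n_distractor_pairs)]  # unrelated distractor pairs
    random.shuffle(pairs)  # shuffling kills positional and semantic locality
    prompt = "\n".join(f"{a} = {b}" for a, b in pairs)
    return prompt, chain[0], chain[-1]  # context, query hash, expected answer

prompt, query, answer = make_hashhop()
# The model sees `prompt`, is asked what `query` resolves to after 3 hops,
# and is scored on emitting `answer`.
```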

This was done with a completely-written-from-scratch stack:

To train and serve 100M token context models, we needed to write an entire training and inference stack from scratch (no torch autograd, lots of custom CUDA, no open-source foundations) and run experiment after experiment on how to stably train our models.
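For a sense of what "no torch autograd" entails: every operation's backward pass must be derived and coded by hand rather than traced by a framework. A toy NumPy illustration of a hand-written linear-layer gradient (nothing here is from Magic's stack, which would fuse such math into custom CUDA kernels):

```python
import numpy as np

# Hand-derived forward/backward for y = x @ W with MSE loss.
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 64))        # batch of activations
W = rng.normal(size=(64, 16)) * 0.1  # weights
target = rng.normal(size=(32, 16))

y = x @ W                            # forward
loss = ((y - target) ** 2).mean()

dL_dy = 2 * (y - target) / y.size    # d(mean squared error)/dy, by hand
dL_dW = x.T @ dL_dy                  # gradient w.r.t. weights
dL_dx = dL_dy @ W.T                  # gradient passed to the previous layer

W -= 0.1 * dL_dW                     # one SGD step, no autograd anywhere
```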

They also announced a Google Cloud partnership:

Magic-G4, powered by NVIDIA H100 Tensor Core GPUs, and Magic-G5, powered by NVIDIA GB200 NVL72, with the ability to scale to tens of thousands of Blackwell GPUs over time.

They mention 8,000 H100s now, but say that "over time, we will scale up to tens of thousands of GB200s", with former OpenAI Supercomputing Lead Ben Chess leading the effort.

Their next frontier is inference-time compute:

Imagine if you could spend $100 and 10 minutes on an issue and reliably get a great pull request for an entire feature. That’s our goal.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

AI Model Developments and Applications

AI Infrastructure and Performance

AI Applications and Research

AI Development Practices and Tools

AI Ethics and Regulation


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Innovative Local LLM User Interfaces

Theme 2. Advancements in Large Language Model Capabilities

Theme 3. Challenges in Evaluating AI Intelligence and Reasoning

All AI Reddit Recap

/r/machinelearning, /r/openai, /r/stablediffusion, /r/ArtificialInteligence, /r/LLMDevs, /r/Singularity

AI Research and Techniques

AI Model Releases and Improvements

AI Impact on Industry and Employment

Technical Details and Discussions


AI Discord Recap

A summary of Summaries of Summaries by GPT4O (gpt-4o-2024-05-13)

1. LLM Advancements

2. Model Performance Optimization

3. Fine-tuning Strategies

4. Open Source AI Developments

5. AI Community and Events


PART 1: High level Discord summaries

LM Studio Discord


OpenAI Discord


Stability.ai (Stable Diffusion) Discord


Unsloth AI (Daniel Han) Discord


Perplexity AI Discord


Cohere Discord


LlamaIndex Discord


Latent Space Discord


OpenInterpreter Discord


OpenAccess AI Collective (axolotl) Discord


LangChain AI Discord


Torchtune Discord


DSPy Discord


AI21 Labs (Jamba) Discord


tinygrad (George Hotz) Discord


Gorilla LLM (Berkeley Function Calling) Discord


LAION Discord


Alignment Lab AI Discord


Mozilla AI Discord


The LLM Finetuning (Hamel + Dan) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The DiscoResearch Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

LM Studio ▷ #general (161 messages🔥🔥):

  • LLM image comparison
  • LLM vision tasks
  • LLM speed
  • LLM custom instructions
  • LLM RAG

Links mentioned:


LM Studio ▷ #hardware-discussion (67 messages🔥🔥):

  • PCIE 5.0
  • llama.cpp
  • NPU support
  • Llama 70b
  • Multi-GPU setup

Links mentioned:


OpenAI ▷ #ai-discussions (215 messages🔥🔥):

  • Gemini's capabilities
  • LLM personalization
  • Memory and context in LLMs
  • Fine-tuning vs. prompt engineering
  • OpenAI API usage and cost

Link mentioned: Tweet from Ignacio de Gregorio (@TheTechOasis1): http://x.com/i/article/1827379585861709824


OpenAI ▷ #gpt-4-discussions (7 messages):

  • LLM Model Performance
  • OpenAI Model Limitations
  • GPT-4 vs GPT-4o
  • Llama 3 vs OpenAI Models

OpenAI ▷ #prompt-engineering (2 messages):

  • ChatGPT Persona

OpenAI ▷ #api-discussions (2 messages):

  • ChatGPT persona

Stability.ai (Stable Diffusion) ▷ #general-chat (184 messages🔥🔥):

  • SDXL Background Issues
  • Lora Creation
  • Model Merging
  • ComfyUI
  • Regularization

Links mentioned:


Unsloth AI (Daniel Han) ▷ #general (93 messages🔥🔥):

  • Unsloth vs OpenRLHF
  • Unsloth finetuning
  • Unsloth multi-GPU
  • Unsloth inference
  • Unsloth on AWS

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (1 messages):

  • ML model deployment challenges
  • LLM limitations
  • Survey for ML Professionals

Link mentioned: LLM Problems research: Hey, I’m working on an ML project and I need help from people who are building models and deploying them to production.


Unsloth AI (Daniel Han) ▷ #help (29 messages🔥):

  • Gemma2:2b Fine-tuning
  • Unsloth for Fine-tuning
  • Function Calling Datasets
  • APIGen
  • Mistral Fine-tuning

Links mentioned:


Unsloth AI (Daniel Han) ▷ #community-collaboration (1 messages):

hamchezz: I want to finetune a llm on some undefined goal just because 😄


Unsloth AI (Daniel Han) ▷ #research (1 messages):

  • Runpod pricing
  • LLaMa 4 MoE
  • Flexattention
  • Unsloth Pro training

Perplexity AI ▷ #announcements (1 messages):

  • Discord Community Growth

Perplexity AI ▷ #general (97 messages🔥🔥):

  • Perplexity Pro issues
  • Perplexity AI Issues
  • AI model limitations
  • Perplexity AI model selection
  • Perplexity AI usability

Links mentioned:


Perplexity AI ▷ #sharing (9 messages🔥):

  • MrBeast
  • Perplexity AI Discord
  • Anthropic's Claude
  • Kustom.tech
  • OpenAI's Threads

Link mentioned: YouTube: no description found


Perplexity AI ▷ #pplx-api (14 messages🔥):

  • Perplexity API
  • Beta Application
  • Telegram Chatbot
  • Temu Promo Bots
  • Free API Credits

Link mentioned: API Beta access: Hi, How long does it take to get accepted for the Beta use? I want to test the citation return feature, we applied for it multiple times in the last months and didn't hear a word. We have users who ...


Cohere ▷ #discussions (38 messages🔥):

  • LLM Tokenization
  • Sycophancy Behavior in Models
  • MMLU Issues
  • COT & Scratch Pad Evaluation

Link mentioned: joey234/mmlu-human_sexuality-original-neg · Datasets at Hugging Face: no description found


Cohere ▷ #questions (28 messages🔥):

  • Cohere for AI Scholars Program
  • Cohere for AI Community
  • Cohere API
  • CrewAI
  • Aya-23-8b Inference Time

Links mentioned:


Cohere ▷ #projects (1 messages):


LlamaIndex ▷ #blog (2 messages):

  • LlamaIndex Workflows
  • GymNation Case Study

LlamaIndex ▷ #general (37 messages🔥):

  • Function Calling LLMs
  • Workflows
  • Image & Text Retrieval
  • LlamaIndex Integration
  • Pinecone Vector Store

Links mentioned:


LlamaIndex ▷ #ai-discussion (1 messages):

  • GenAI Ops
  • GenAI Ops Community
  • GenAI Ops Book

Link mentioned: Exploring GenAIOps: Empowering Leaders and Innovators: Operationalising Generative AI: Amazon.co.uk: Kirby, Harrison: 9798334554955: Books: no description found


Latent Space ▷ #ai-general-chat (33 messages🔥):

  • Agency Fundraise
  • AI Engineer Meetup & Summit
  • AI for Individual Use
  • Midjourney Hardware
  • Llama 3 Open Source Adoption

Links mentioned:


Latent Space ▷ #ai-announcements (1 messages):

  • Latent Space Podcast
  • LLM Benchmarks
  • Nicholas Carlini
  • Google DeepMind
  • Training Data Extraction

Link mentioned: Tweet from Latent.Space (@latentspacepod): 🆕 Why you should write your own LLM benchmarks w/ Nicholas Carlini of @GoogleDeepMind Covering his greatest hits: - How I Use AI - My benchmark for large language models - Extracting Training Data...


OpenInterpreter ▷ #general (9 messages🔥):

  • OpenInterpreter development
  • Auto-run safety
  • Backups
  • House Party
  • Terminal app recommendations

Link mentioned: Commits · OpenInterpreter/open-interpreter: A natural language interface for computers. Contribute to OpenInterpreter/open-interpreter development by creating an account on GitHub.


OpenInterpreter ▷ #ai-content (17 messages🔥):

  • Daily Bots
  • Bland
  • AI Phone Agents
  • Frame
  • Diffusion Models

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general (16 messages🔥):

  • Macbook pro training
  • GPU vs. CPU
  • Training speed
  • Model size
  • Training cost

Link mentioned: Replete-AI (Replete-AI): no description found


OpenAccess AI Collective (axolotl) ▷ #general-help (2 messages):

  • Fine-tuning LLMs for dialogue
  • Data Streamlining

OpenAccess AI Collective (axolotl) ▷ #datasets (1 messages):

teknium: https://x.com/nousresearch/status/1829143753036366325?s=46


LangChain AI ▷ #general (15 messages🔥):

  • SQLDatabaseChain
  • Vector Stores
  • SQL Record Manager
  • RAG (Retrieval Augmented Generation)
  • Knowledge Graphs

Links mentioned:


Torchtune ▷ #general (7 messages):

  • Torchtune Contributing
  • QLoRA + Llama 3.1 Memory Issues
  • Torchtune Github Issues

Link mentioned: Issues · pytorch/torchtune: A Native-PyTorch Library for LLM Fine-tuning. Contribute to pytorch/torchtune development by creating an account on GitHub.


Torchtune ▷ #dev (7 messages):

  • Fusion Models RFC
  • Batched Inference
  • Decoder-only Max Seq Len
  • Flamingo Model
  • Cache Position Tracking

Links mentioned:


DSPy ▷ #show-and-tell (7 messages):

  • LinkedIn Job Applier
  • Agent Zero
  • GitHub repo
  • AIHawk
  • Pipelines

Links mentioned:


DSPy ▷ #papers (2 messages):

  • Generative Reward Models (GenRM)
  • DSPy Optimizers

Link mentioned: Generative Verifiers: Reward Modeling as Next-Token Prediction: Verifiers or reward models are often used to enhance the reasoning performance of large language models (LLMs). A common approach is the Best-of-N method, where N candidate solutions generated by the ...


DSPy ▷ #general (4 messages):

  • DSPY
  • Optimizers
  • KL Divergence
  • Synthetic Data
  • Human Responses

AI21 Labs (Jamba) ▷ #jamba (11 messages🔥):

  • Jamba 1.5 dependency issues
  • transformers version bug

Link mentioned: transformers 4.44.2 on PyPI: new release v4.44.2 of transformers.


tinygrad (George Hotz) ▷ #general (1 messages):

  • Tinygrad Performance
  • Static Scheduling
  • Sparse Operations

tinygrad (George Hotz) ▷ #learn-tinygrad (7 messages):

  • ReduceOp Merging in Tinygrad
  • tinygrad's FUSE_CONV_BW Flag
  • Tinygrad Documentation for Beginners

Link mentioned: tinygrad/tinygrad/engine/schedule.py at cb61cfce2492e53dac4691e92774e2704351b3ed · tinygrad/tinygrad: You like pytorch? You like micrograd? You love tinygrad! ❤️ - tinygrad/tinygrad


Gorilla LLM (Berkeley Function Calling) ▷ #discussion (3 messages):

  • Groq Leaderboard

LAION ▷ #general (1 messages):

spirit_from_germany: https://youtu.be/DP454c1K_vQ?si=qYWw6oU0sQC9FPv4


LAION ▷ #research (1 messages):

  • CLIP-AGIQA
  • AI-Generated Image Quality Assessment
  • CLIP for image quality assessment
  • Generative technologies
  • AIGIs

Link mentioned: CLIP-AGIQA: Boosting the Performance of AI-Generated Image Quality Assessment with CLIP: With the rapid development of generative technologies, AI-Generated Images (AIGIs) have been widely applied in various aspects of daily life. However, due to the immaturity of the technology, the qual...


Alignment Lab AI ▷ #general (1 messages):

teknium: https://x.com/nousresearch/status/1829143753036366325?s=46


Mozilla AI ▷ #announcements (1 messages):

  • Common Voice
  • Speech Data

Link mentioned: Element: no description found




{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AINews, please share with a friend! Thanks in advance!

{% endif %}