Frozen AI News archive

Contextual Document Embeddings: `cde-small-v1`

**Meta** announced a new text-to-video model, **Movie Gen**, with a paper claiming that adapting **Llama 3** to video generation works better than OpenAI Sora's Diffusion Transformers, though no release is available yet. Researchers Jack Morris and Sasha Rush introduced the **cde-small-v1** model, which pairs a novel **contextual batching** training technique with **contextual embeddings**, achieving strong performance with only **143M parameters**. **OpenAI** launched Canvas, a collaborative interface for ChatGPT trained with synthetic data. **Google DeepMind** welcomed Tim Brooks to work on video generation and world simulators. Google released **Gemini 1.5 Flash-8B**, improving cost and rate limits through algorithmic efficiency.

Canonical issue URL

AI News for 10/3/2024-10/4/2024. We checked 7 subreddits, 433 Twitters and 31 Discords (226 channels, and 1896 messages) for you. Estimated reading time saved (at 200wpm): 210 minutes. You can now tag @smol_ai for AINews discussions!

We often give the top story on AINews to movements of the big model labs, and today Meta's new text-to-video model, Movie Gen, is sweeping the news, with a paper that notably claims they were able to adapt Llama 3 to video generation far better than OpenAI Sora's Diffusion Transformers. However, there is no actual release, just cherry-picked marketing videos, and we try to focus on news you can use here.

So we are happy to highlight Jack Morris and Sasha Rush's new paper and cde-small-v1 model on Contextual Document Embeddings, "the best BERT-sized text embedding model in the world".


Jack puts it best:

"Typical text embedding models have two main problems:

  1. training them is complicated and requires many tricks: giant batches, distillation, hard negatives...
  2. the embeddings don't "know" what corpus they will be used in; consequently, all text spans are encoded the same way"

To fix (1) we develop a new training technique: contextual batching. All batches share a lot of context – one batch might be about horse races in Kentucky, the next batch about differential equations, etc.
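The batching idea above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `topic_of` is a hypothetical labeling function standing in for the embedding-clustering step a real pipeline would use to group documents by shared context.

```python
import random
from collections import defaultdict

def contextual_batches(docs, topic_of, batch_size, seed=0):
    """Group documents by topic, then emit batches drawn from a single
    group, so every training batch shares context (the core idea of
    contextual batching)."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for doc in docs:
        groups[topic_of(doc)].append(doc)
    batches = []
    for members in groups.values():
        rng.shuffle(members)
        for i in range(0, len(members), batch_size):
            batches.append(members[i:i + batch_size])
    rng.shuffle(batches)  # still randomize batch order across topics
    return batches

docs = [("kentucky horse race results", "sports"),
        ("derby betting odds", "sports"),
        ("solving ODEs numerically", "math"),
        ("the Laplace transform", "math")]
batches = contextual_batches(docs, topic_of=lambda d: d[1], batch_size=2)
# every batch is topically homogeneous
assert all(len({topic for _, topic in b}) == 1 for b in batches)
```

In-batch negatives from the same topic are automatically "hard", which is why this replaces the usual bag of tricks (hard-negative mining, distillation, giant batches).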

And for (2), we propose a new contextual embedding architecture. This requires changes to both the training and evaluation pipelines to incorporate contextual tokens – essentially, the model sees extra text from the surrounding context and can update the embedding accordingly.

This seems to make sense: priming the embedding model on context tokens first, before computing the actual embeddings.
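A toy sketch of that two-stage idea: stage one summarizes a sample of the target corpus into a context vector, stage two conditions the document embedding on it, so the same span embeds differently in different corpora. Everything here is a stand-in (the `embed` function below is a hypothetical pseudo-encoder, not the trained BERT-sized model), but it shows the shape of the interface change.

```python
import numpy as np

def embed(text, dim=8):
    # Stand-in encoder: deterministic pseudo-embedding (hypothetical;
    # the real model is a trained transformer encoder).
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

def contextual_embed(doc, corpus_sample, alpha=0.5):
    """Stage 1: pool a sample of the surrounding corpus into a context
    vector. Stage 2: embed the document conditioned on that context,
    so the embedding 'knows' which corpus it will be used in."""
    ctx = np.mean([embed(d) for d in corpus_sample], axis=0)
    v = embed(doc) + alpha * ctx
    return v / np.linalg.norm(v)

medical = ["patient intake notes", "dosage guidelines"]
legal = ["contract clauses", "case law summaries"]
a = contextual_embed("records retention policy", medical)
b = contextual_embed("records retention policy", legal)
assert not np.allclose(a, b)  # same text, corpus-dependent embedding
```

The practical cost is visible even in the toy: evaluation now needs corpus text available at embedding time, which is the pipeline change the authors flag.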

While most leaderboard-topping embedding models are >7B parameters (scoring ~72 on MTEB), the 143M-parameter cde-small-v1 scores a respectable 65, sitting comfortably among models 50x its size. A nice efficiency win.


While you're trying out new embedding models, you might also want to explore advanced RAG techniques from today's sponsor!


Brought to you by RAG++: Query refinement for RAG is like giving your system X-ray vision; with it, the system can “see” user intentions more clearly, leading to more accurate chunk retrieval and more relevant LLM responses.
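As a rough illustration of what query refinement does before retrieval, here is a toy rule-based refiner. Production systems typically ask an LLM to rewrite or expand the query; the `expansions` table below is a hypothetical stand-in for that step.

```python
def refine_query(query, expansions):
    """Toy query refinement: expand the raw query with related terms
    before retrieval, so lexically mismatched chunks can still match.
    (In practice this rewrite is usually done by an LLM call.)"""
    terms = query.lower().split()
    refined = list(terms)
    for t in terms:
        refined.extend(expansions.get(t, []))
    # dedupe while preserving order
    return " ".join(dict.fromkeys(refined))

expansions = {"laptop": ["notebook"], "won't": ["not", "fails"]}
print(refine_query("Laptop won't boot", expansions))
# → "laptop won't boot notebook not fails"
```

The refined query now overlaps with chunks that say "notebook fails to boot", which the raw query would have missed.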


Learn about improving your RAG query refinement in this YouTube excerpt from Weights & Biases’ new course RAG++: From POC to Production, and sign up for free LLM API credits to get you started!


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

AI Model and Company Updates

AI Research and Techniques

Industry Trends and Applications


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Whisper Turbo: Significant Speed Improvements in Speech Recognition

Theme 2. Qwen 2.5: Controversy Over Chinese AI Models in Conservative Industries

Theme 3. XTC Sampler: New Technique to Reduce GPTisms in LLM Outputs

Theme 4. Tool Calling in Open-Source LLMs: Building Agentic AI Systems

Other AI Subreddit Recap

r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, /r/LLMDevs, /r/Singularity

AI Research and Techniques

AI Model Releases and Improvements

AI Industry Developments

AI Ethics and Societal Impact

AI Capabilities and Milestones


AI Discord Recap

A summary of Summaries of Summaries by O1-preview

Theme 1: Meta Unveils Movie Gen, Revolutionizes Video Generation

Theme 2: New AI Models and Benchmarks Lead the Charge

Theme 3: Advances in Model Optimization and Training Techniques

Theme 4: OpenAI's Canvas Tool and Models Stir Mixed Reactions

Theme 5: Recurrent Neural Networks Make a Comeback


PART 1: High level Discord summaries

Nous Research AI Discord


aider (Paul Gauthier) Discord


HuggingFace Discord


OpenAI Discord


Unsloth AI (Daniel Han) Discord


Eleuther Discord


OpenRouter (Alex Atallah) Discord


LM Studio Discord


Latent Space Discord


GPU MODE Discord


Perplexity AI Discord


Cohere Discord


Stability.ai (Stable Diffusion) Discord


LAION Discord


LLM Agents (Berkeley MOOC) Discord


LlamaIndex Discord


DSPy Discord


OpenInterpreter Discord


LangChain AI Discord


Interconnects (Nathan Lambert) Discord


tinygrad (George Hotz) Discord


Torchtune Discord


Modular (Mojo 🔥) Discord


OpenAccess AI Collective (axolotl) Discord


The Alignment Lab AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The LLM Finetuning (Hamel + Dan) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The DiscoResearch Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Nous Research AI ▷ #general (254 messages🔥🔥):

  • torchao library by PyTorch
  • OpenAI's Canvas tool
  • Meta's Movie Gen models
  • Cultural biases in AI training
  • Nous Forge Framework

Links mentioned:


Nous Research AI ▷ #ask-about-llms (1 message):

lukfbi: Guys, please help me, what is the best temperature for RPG and RP on the Hermes 70b?


Nous Research AI ▷ #research-papers (2 messages):

  • VinePPO algorithm
  • Pluralistic alignment
  • Model steerability benchmarks

Links mentioned:



Nous Research AI ▷ #reasoning-tasks (2 messages):

  • OpenAI's model outputs
  • Open-source reasoning models

aider (Paul Gauthier) ▷ #general (140 messages🔥🔥):

  • Telemetry in Aider
  • OpenRouter Free Models Limitations
  • Benchmarking Aider with Various Models
  • Transition of Aider Repo Ownership
  • User Experiences with Aider Performance

Links mentioned:


aider (Paul Gauthier) ▷ #questions-and-tips (69 messages🔥🔥):

  • Using Aider with Ollama
  • File addition in Aider
  • Aider performance and models
  • Repo map functionality
  • Aider modes for querying

Links mentioned:


aider (Paul Gauthier) ▷ #links (2 messages):

  • Aider mentions
  • Hybrid search with SQLite
  • Reciprocal Rank Fusion

Link mentioned: Hybrid full-text search and vector search with SQLite: As part of Alex’s work on his sqlite-vec SQLite extension - adding fast vector lookups to SQLite - he’s been investigating hybrid search, where search results f...
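The Reciprocal Rank Fusion mentioned above is simple enough to sketch: each document is scored by the sum of 1/(k + rank) over the ranked lists it appears in, a standard way to merge full-text and vector results. The document IDs below are hypothetical.

```python
def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: merge several ranked lists by scoring
    each doc with sum(1 / (k + rank)); k=60 is the conventional default."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fts = ["d1", "d2", "d3"]   # full-text search results
vec = ["d3", "d1", "d4"]   # vector search results
print(rrf([fts, vec]))
# → ['d1', 'd3', 'd2', 'd4']
```

Documents appearing high in both lists (d1, d3) float to the top without any score normalization between the two retrievers, which is the main appeal of RRF.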


HuggingFace ▷ #announcements (1 message):

  • Salamandra on-device demo
  • OpenAI models update
  • Nemo-Mistral-Minitron improvements
  • Realtime Whisper Turbo
  • MusicGen iOS app progress

Links mentioned:


HuggingFace ▷ #general (151 messages🔥🔥):

  • Meta Movie Gen
  • Hugging Face Chat support
  • Gradio Chatbot UI
  • Model usage
  • InstantMesh

Links mentioned:


HuggingFace ▷ #today-im-learning (5 messages):

  • Sanity 70B FP8
  • CUDA
  • μP
  • HuggingFace model upload
  • Outdated tutorials

HuggingFace ▷ #cool-finds (6 messages):

  • New AI Model from Nvidia
  • Music Composer on HuggingFace
  • Text to Singing Model

Link mentioned: Midi Music Generator - a Hugging Face Space by skytnt: no description found


HuggingFace ▷ #i-made-this (10 messages🔥):

  • Salamandra-2B-Instruct Release
  • Fastai Convolution Explanation
  • Nvidia Model Updates
  • New Labeling Tool for LLMs
  • Llava Video Understanding Model

Links mentioned:


HuggingFace ▷ #reading-group (4 messages):

  • AI Sentience Prediction
  • Original Research Sharing
  • Weekly Reading Group

Link mentioned: The Sentience Prediction Equation: When Will AI Achieve Sentience? (And Should We Be Worried?): You’ve heard the buzz: AI is getting smarter. It’s writing novels, making memes, diagnosing diseases, and even, well, generating this very…


HuggingFace ▷ #computer-vision (1 message):

  • Model Training Explained
  • Conceptual Learning vs Instructional Learning
  • Catastrophic Forgetting

HuggingFace ▷ #NLP (8 messages🔥):

  • Spacy's online training module
  • Fine-tuning with custom datasets
  • Using SFTTrainer for language models
  • ONNX model conversion issues
  • Transformers.js integration

Links mentioned:


HuggingFace ▷ #diffusion-discussions (3 messages):

  • Flux model restrictions
  • Hacktoberfest contributions

OpenAI ▷ #ai-discussions (134 messages🔥🔥):

  • Canvas Model
  • OpenAI Tools
  • Advanced Voice Mode
  • Discord Bots
  • AI Programming

Links mentioned:


OpenAI ▷ #gpt-4-discussions (11 messages🔥):

  • Custom GPTs with Google API
  • Custom GPTs Model Queries
  • Canvas Issues
  • ChatGPT Counting and Math Concerns

OpenAI ▷ #prompt-engineering (8 messages🔥):

  • Inconsistencies in ChatGPT evaluations
  • Embedding images in New Canvas
  • Efficient parsing of snippets to JSON
  • Model scoring techniques

OpenAI ▷ #api-discussions (8 messages🔥):

  • ChatGPT evaluations consistency
  • Using images in New Canvas
  • JSON parsing with GPT-4o
  • Grading rubric for evaluations
  • Chain-of-Thought in evaluations

Unsloth AI (Daniel Han) ▷ #general (77 messages🔥🔥):

  • Unsloth AI Projects
  • Lora Configuration in PEFT
  • Fine-tuning Models
  • ZLUDA Project Update
  • Movie Gen AI Model

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (43 messages🔥):

  • Generational Identity
  • Gen Z Preferences
  • Lego vs. Modded Minecraft

Unsloth AI (Daniel Han) ▷ #help (31 messages🔥):

  • Local inference with llama-cpp
  • Multi-GPU support
  • Fine-tuning models with plain text
  • Preparing datasets for training
  • Running LLM on mobile with Flutter

Link mentioned: Unsloth Notebooks | Unsloth Documentation: See the list below for all our notebooks:


Unsloth AI (Daniel Han) ▷ #research (6 messages):

  • Nanoflow framework
  • Recurrent Neural Networks revival
  • SageAttention quantization
  • Code replacement suggestion

Links mentioned:


Eleuther ▷ #general (94 messages🔥🔥):

  • IREE adoption and compilation
  • RWKV and parallelization
  • Chain of Thought (CoT) output limitations
  • Gated Linear Attention and models expressible as RNNs
  • MATS Program and mentorship opportunities

Links mentioned:


Eleuther ▷ #research (49 messages🔥):

  • VinePPO Challenges
  • minLSTMs and minGRUs
  • Transfer Learning in Math
  • Softmax Function Limitations
  • Test Time Training (TTT)

Links mentioned:


Eleuther ▷ #gpt-neox-dev (1 message):

  • lm-evaluation-harness
  • GPT-NeoX improvements

Links mentioned:


OpenRouter (Alex Atallah) ▷ #announcements (2 messages):

  • SambaNova AI on OpenRouter
  • Gemini 1.5 Flash-8B Release

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (140 messages🔥🔥):

  • Gemini 1.5 Flash
  • o1 Mini performance
  • Anthropic's model development
  • Model alignment techniques
  • OpenRouter infrastructure updates

Links mentioned:


LM Studio ▷ #general (127 messages🔥🔥):

  • LM Studio Updates
  • Memory Leak Issues
  • Model Downloading and Integration
  • Chat Cache Location
  • AI Model Recommendations

Links mentioned:


Latent Space ▷ #ai-general-chat (18 messages🔥):

  • LangChain Voice ReAct Agent
  • GPT-4o Dialogue
  • Meta Movie Gen Breakthrough
  • New LLM Leaderboard for Finance
  • Contextual Information Embedding Model

Links mentioned:


Latent Space ▷ #ai-in-action-club (98 messages🔥🔥):

  • Discord audio issues
  • Luma AI applications
  • Gaussian splatting
  • 3D modeling in gaming
  • Virtual meetings

Links mentioned:


GPU MODE ▷ #torch (1 message):

  • Performance benchmarks
  • Fio tools
  • Data access methods

GPU MODE ▷ #cool-links (2 messages):

  • SageAttention
  • Meta Movie Gen

Links mentioned:


GPU MODE ▷ #pmpp-book (2 messages):

  • Book Updates
  • Chapter Upgrades

GPU MODE ▷ #youtube-recordings (4 messages):

  • Event Planning
  • Colocation with Conferences
  • Planning Timelines

GPU MODE ▷ #torchao (4 messages):

  • Noncontiguous inputs in Torchao
  • OptimState8bit dispatch error
  • AdamW8bit compatibility with Accelerate

GPU MODE ▷ #off-topic (22 messages🔥):

  • OpenAI's Financial Success
  • Potential New Products from OpenAI
  • Resume Review Channel Proposal
  • Grad School Application Discussions

GPU MODE ▷ #triton-puzzles (1 message):

  • Triton kernel performance
  • Tensor operations
  • Debugging Triton functions

GPU MODE ▷ #bitnet (1 message):

  • BF16 stochastic rounding
  • Grad norm analysis
  • Data shuffling concerns

GPU MODE ▷ #liger-kernel (11 messages🔥):

  • Conv2d Triton Kernel Performance
  • Scaled Int8 Conv2d Exploration
  • Liger vs. PyTorch Performance
  • Fused KL/JSD Requirement Clarification

GPU MODE ▷ #self-promotion (5 messages):

  • Hyperparameter Scaling Guide
  • Open Source Project Maintenance
  • Embedding Geometries Paper Acceptance
  • Contrastive Language-Image Pre-Training
  • Euclidean vs Hyperbolic Geometry

Links mentioned:


GPU MODE ▷ #avx (48 messages🔥):

  • AVX2 Emulation
  • Matrix Multiplication Implementation
  • Performance Testing
  • Parallel Programming Resources
  • Tinygrad with AVX Intrinsics

Links mentioned:


Perplexity AI ▷ #general (65 messages🔥🔥):

  • Perplexity AI Collections UI
  • Boeing 777-300ER Specifications
  • TradingView Premium Package
  • Llama 3.2 Release
  • Claude 3.5 vs Other Models

Links mentioned:


Perplexity AI ▷ #sharing (5 messages):

  • U2V
  • Kreutzer's Etudes
  • Four-legged Robot
  • Quantum Clocks
  • Enum Values

Cohere ▷ #discussions (2 messages):

  • Command R 08-2024 Update
  • Integration with Weights & Biases

Link mentioned: Updates to Command R Fine-tuning: Fine-tune the updated Command R 08-2024 with support for newer options giving you more control and visibility including a seamless integration with Weights & Biases.


Cohere ▷ #questions (39 messages🔥):

  • Metrics Visibility Issues
  • Fine-Tuning Challenges
  • Tool Use in Next.js
  • RAG with Embedding Datasets
  • UI Feedback on Colabs

Links mentioned:


Cohere ▷ #api-discussions (5 messages):

  • Pricing Discrepancy
  • Finetuning Commands
  • Documentation Updates

Cohere ▷ #projects (1 message):

kittykills: Hello!


Stability.ai (Stable Diffusion) ▷ #general-chat (44 messages🔥):

  • OpenPose Alternatives
  • ComfyUI Image Quality
  • SDXL Models
  • Reference Image Generation
  • AI Tools for Object Placement

LAION ▷ #general (4 messages):

  • Translation of Technical Language
  • Language Barriers in Tech

LAION ▷ #research (14 messages🔥):

  • MinGRU Architecture
  • Training Bark-like Models
  • Scam Alert

Links mentioned:


LAION ▷ #resources (1 message):

  • Earning Opportunities
  • Telegram Contact

Link mentioned: Hugo Larsson: The secret of getting ahead is getting started 🤝


LAION ▷ #learning-ml (2 messages):

  • Training BARK Model
  • Earn Money Quickly

Link mentioned: Hugo Larsson: The secret of getting ahead is getting started 🤝


LAION ▷ #paper-discussion (1 message):

  • Earn $50k in 72 hours
  • Telegram outreach

Link mentioned: Hugo Larsson: The secret of getting ahead is getting started 🤝


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (19 messages🔥):

  • Article Score Inquiry
  • Real-time Streaming of Responses
  • Chainlit Integration
  • Github Autogen Pull Requests
  • Course Location on Campus

Link mentioned: process message before send by sonichi · Pull Request #1783 · microsoft/autogen: Why are these changes needed? Add a hookable method for processing a message before sending. Example application: customized frontend to display messages . Renamed other hookable methods for clari...


LlamaIndex ▷ #blog (5 messages):

  • Building AI agents with LlamaCloud
  • Security in RAG
  • Real-time audio APIs from OpenAI
  • Avoiding hallucination in RAG
  • Hackathon announcement

LlamaIndex ▷ #general (11 messages🔥):

  • Agent Class with Streaming
  • Integrating LLM with BigQuery
  • Error Handling in Code
  • OpenAIAgent for Streaming
  • Custom Agent Development

Link mentioned: Google Colab: no description found


DSPy ▷ #show-and-tell (7 messages):

  • dslmodel live demos
  • Sentiment Analysis
  • Document Summarization
  • Arxiv Paper Structure
  • New Features in DSLModel

Links mentioned:


DSPy ▷ #general (4 messages):

  • DSPy full form
  • Backronym for DSPy

DSPy ▷ #examples (4 messages):

  • Text Classification Tasks
  • DSPy Signatures
  • LM Behavior Specification

Link mentioned: Signatures | DSPy: When we assign tasks to LMs in DSPy, we specify the behavior we need as a Signature.


OpenInterpreter ▷ #general (7 messages):

  • Event Participation Limit
  • Human Devices Event
  • Obelisk GitHub Tool

Link mentioned: GitHub - go-shiori/obelisk: Go package and CLI tool for saving web page as single HTML file: Go package and CLI tool for saving web page as single HTML file - go-shiori/obelisk


OpenInterpreter ▷ #O1 (1 message):

ellsies_: no logs at all


OpenInterpreter ▷ #ai-content (5 messages):

  • Meta Movie Gen
  • Open Source Discussion

Link mentioned: Tweet from AI at Meta (@AIatMeta): 🎥 Today we’re premiering Meta Movie Gen: the most advanced media foundation models to-date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capa...


LangChain AI ▷ #general (12 messages🔥):

  • FAANG SDLC certifications
  • LangChain API updates
  • LangChain support for GPT real-time API
  • Evaluating RAG pipelines
  • Creating a chatbot with LangChain

Interconnects (Nathan Lambert) ▷ #news (3 messages):

  • NeurIPS 2024 Conference Date Change
  • Elon Musk's xAI Recruiting Event
  • OpenAI's Dev Day
  • Funding Rumors

Links mentioned:


Interconnects (Nathan Lambert) ▷ #random (8 messages🔥):

  • Meta Movie Gen
  • Model Optimization Techniques
  • LLMs and Code Synthesis Reinforcement Learning
  • OpenAI's Model Distillation
  • Canvas Development

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (1 message):

natolambert: Should I make this a real poster at a conference?


tinygrad (George Hotz) ▷ #general (4 messages):

  • Permuting vs Reshaping Tensors
  • Stable Diffusion Model Training
  • Tinygrad CI Warnings
  • Analysis of CI Test Failures

Link mentioned: node cleanup + local metal test speed [pr] · tinygrad/tinygrad@2a8b305: You like pytorch? You like micrograd? You love tinygrad! ❤️ - node cleanup + local metal test speed [pr] · tinygrad/tinygrad@2a8b305


tinygrad (George Hotz) ▷ #learn-tinygrad (2 messages):

  • bfloat16 tests
  • Triton talks

Torchtune ▷ #general (1 message):

leoandlibe: Hey guys, does torchtune support KTO training?~


Torchtune ▷ #dev (5 messages):

  • VinePPO
  • Flex Attention
  • Batch Size Optimization
  • Distributed Data Parallel (DDP)

Link mentioned: Tweet from Amirhossein Kazemnejad (@a_kazemnejad): VinePPO, a straightforward modification to PPO, unlocks RL’s true potential for LLM Reasoning. It beats RL-free methods (DPO and RestEM) and PPO, surpassing it in less steps(up to 9x), less time(up t...


Modular (Mojo 🔥) ▷ #mojo (4 messages):

  • Network Speed Improvements
  • Software Limitations
  • 100 Gbps Technology
  • Latency vs Throughput
  • AI Contributions to Networking

OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (1 message):

  • axolotl packaging
  • dependency management







{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}