Frozen AI News archive

The AI Search Wars Have Begun — SearchGPT, Gemini Grounding, and more

**ChatGPT** launched its search functionality across all platforms using a fine-tuned version of **GPT-4o** with synthetic data generation and distillation from **o1-preview**. This feature includes a Chrome extension promoted by **Sam Altman** but has issues with hallucinations. The launch coincides with **Gemini** introducing Search Grounding after delays. Notably, **The New York Times** is not a partner due to a lawsuit against **OpenAI**. The AI search competition intensifies with consumer and B2B players like **Perplexity** and **Glean**. Additionally, **Claude 3.5 Sonnet** achieved a new benchmark record on SWE-bench Verified, and a new hallucination evaluation benchmark, SimpleQA, was introduced. Other highlights include the **Universal-2** speech-to-text model with 660M parameters and **HOVER**, a neural whole-body controller for humanoid robots trained in NVIDIA Isaac simulation. AI hedge fund teams using **LangChain** and **LangGraph** were also showcased. The news is sponsored by the RAG++ course featuring experts from **Weights & Biases**, **Cohere**, and **Weaviate**.


AI News for 10/30/2024-10/31/2024. We checked 7 subreddits, 433 Twitters and 32 Discords (231 channels, and 2468 messages) for you. Estimated reading time saved (at 200wpm): 264 minutes. You can now tag @smol_ai for AINews discussions!
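As a rough sanity check on that reading-time figure, the sketch below multiplies the two numbers quoted above (264 minutes at 200wpm) to get the implied volume of source text. The function name `implied_word_count` is made up for illustration, and the resulting word count is derived from those two numbers, not reported by AINews.

```python
# Back-of-the-envelope check of the reading-time figure quoted above.
WORDS_PER_MINUTE = 200   # reading speed stated in the issue
MINUTES_SAVED = 264      # estimate stated in the issue


def implied_word_count(minutes: int, wpm: int) -> int:
    """Words a reader would cover in `minutes` at `wpm` words per minute."""
    return minutes * wpm


words = implied_word_count(MINUTES_SAVED, WORDS_PER_MINUTE)
print(f"{words:,} words, about {MINUTES_SAVED / 60:.1f} hours of reading")
```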

Teased as SearchGPT in July, ChatGPT finally rolled out its search functionality today across all platforms, completely coincidentally on the same day that Gemini launched Search Grounding after an unfortunate delay. The launch includes a simple Chrome Extension that @sama is personally promoting on Twitter and on their Reddit AMA (don't bother) today, along with a raft of weather, stocks, sports, news, and maps partners.

Notably, you will never get a New York Times article via ChatGPT because the NYT chose to sue OpenAI instead of partnering with them. Partners are presumably happy about the feature, but the citations come with a catch: you have to expend an additional click to see them at all, and most users will not.


ChatGPT search uses a "fine-tuned version of GPT-4o, post-trained using novel synthetic data generation techniques, including distilling outputs from OpenAI o1-preview"; however, it has already been found to hallucinate.

This latest salvo in consumer AI plays challenging their search leader (Perplexity) mirrors a broader trend in B2B AI plays (Dropbox Dash) challenging their search leader (Glean).

Sounds like a good time to bone up on AI search techniques, with today's AINews sponsor!


Brought to you by the RAG++ course: Go beyond basic RAG implementations and explore advanced strategies like hybrid search and advanced prompting to optimize performance, evaluation, and deployment. Learn from industry experts at Weights & Biases, Cohere, and Weaviate how to overcome common RAG challenges and build robust AI solutions, leveraging Cohere's platform with provided credits for participants.



{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

AI Model Developments and Benchmarks

AI Tools and Applications

AI Research and Trends

AI Industry News and Announcements


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Apple Showcases LMStudio in MacBook Pro Ad: Local LLMs Go Mainstream

Theme 2. Meta's Llama 4: Training on 100k+ H100 GPUs for 2025 Release

Theme 3. Local AI Alternatives Challenge Cloud APIs: Cortex and Whisper-Zero

Theme 4. Optimizing LLM Inference: KV Cache Compression and New Models

Other AI Subreddit Recap

r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, /r/LLMDevs, /r/Singularity

AI Model Developments and Capabilities

AI Tools and Interfaces

AI Ethics and Societal Impact


AI Discord Recap

A summary of Summaries of Summaries by O1-mini

Theme 1. Turbocharge Your AI: Models Get a Speed Boost

Theme 2. Fresh AI Models Hit the Scene

Theme 3. Build Smart: Advanced AI Tooling and Frameworks

Theme 4. Deployment Dilemmas: Navigating AI Infrastructure

Theme 5. Search Smarter: AI Enhancements in Information Retrieval


PART 1: High level Discord summaries

HuggingFace Discord


Nous Research AI Discord


Unsloth AI (Daniel Han) Discord


Perplexity AI Discord


OpenAI Discord


OpenRouter (Alex Atallah) Discord


aider (Paul Gauthier) Discord


Eleuther Discord


Latent Space Discord


LM Studio Discord


GPU MODE Discord


Cohere Discord


Interconnects (Nathan Lambert) Discord


Stability.ai (Stable Diffusion) Discord


Modular (Mojo 🔥) Discord


DSPy Discord


OpenInterpreter Discord


tinygrad (George Hotz) Discord


LlamaIndex Discord


LAION Discord


Gorilla LLM (Berkeley Function Calling) Discord


OpenAccess AI Collective (axolotl) Discord


LangChain AI Discord


Alignment Lab AI Discord


LLM Agents (Berkeley MOOC) Discord


The LLM Finetuning (Hamel + Dan) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Torchtune Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The DiscoResearch Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

HuggingFace ▷ #announcements (1 messages):

  • Llama 3.2
  • Aya Expanse
  • Open Source Libraries
  • Model Security
  • Universal Assisted Generation



HuggingFace ▷ #general (884 messages🔥🔥🔥):

  • Hugging Face Discord Moderation
  • Llama Model Optimization
  • Text-to-Video Models
  • Experimental AI Models
  • User Behavior on Discord



HuggingFace ▷ #today-im-learning (5 messages):

  • Profiling Techniques
  • Tokenization Optimization
  • Attention Model Types
  • Seq2Seq Model Structure
  • Course Resources

HuggingFace ▷ #cool-finds (18 messages🔥):

  • AI Podcast Creation
  • OpenAI ChatGPT Search System
  • Blockchain Development
  • HuggingChat & Meta-Llama Model



HuggingFace ▷ #i-made-this (2 messages):

  • AI bug patching agent
  • Automated code reviews
  • 1-Click patch application
  • Open-source project support

Link mentioned: Standard Input - AI Software Engineer for Code Reviews: Save time and improve code quality with AI-augmented reviews and one-click patches for your pull requests.


HuggingFace ▷ #core-announcements (1 messages):

  • Native Quantization Support
  • 8-bit and 4-bit Quantization
  • Using bitsandbytes Library
  • QLoRA for Finetuning



HuggingFace ▷ #computer-vision (3 messages):

  • MolMo VLM Fine-Tuning
  • Ultralytics Installation Issues

Link mentioned: deep-learning-pytorch-huggingface/training/fine-tune-multimodal-llms-with-trl.ipynb at main · philschmid/deep-learning-pytorch-huggingface: Contribute to philschmid/deep-learning-pytorch-huggingface development by creating an account on GitHub.


HuggingFace ▷ #NLP (11 messages🔥):

  • Research Paper Objectives
  • Reading Strategies for Papers
  • Low-Rank Adapters
  • Curated Paper Lists
  • Conference Proceedings

HuggingFace ▷ #diffusion-discussions (2 messages):

  • SD3Transformer2DModel import issue
  • Diffusers 0.31 installation
  • VSCode settings

Nous Research AI ▷ #general (192 messages🔥🔥):

  • Flash-Attn Compatibility
  • Perplexity Competes with Nous
  • AI Assistants and Ecosystems
  • Apple vs PC for AI
  • Networking Hardware for AI



Nous Research AI ▷ #ask-about-llms (16 messages🔥):

  • Comparing Hermes 3 and Llama 3
  • Simulating ChatGPT's Browsing Process
  • Search behavior of LLMs
  • Alternatives to Langchain and Ollama

Link mentioned: SuperPrompt/tm_prompt.md at main · NeoVertex1/SuperPrompt: SuperPrompt is an attempt to engineer prompts that might help us understand AI agents. - NeoVertex1/SuperPrompt


Nous Research AI ▷ #interesting-links (4 messages):

  • SmolLM2 models
  • Training on 11 trillion tokens
  • 135M variant capabilities

Link mentioned: HuggingFaceTB/SmolLM2-1.7B-Instruct · Hugging Face


Unsloth AI (Daniel Han) ▷ #general (121 messages🔥🔥):

  • Multi-GPU Fine Tuning
  • Quantization Techniques
  • Unsloth Framework Features
  • Fine-Tuning Stability Issues
  • New Model Releases



Unsloth AI (Daniel Han) ▷ #off-topic (3 messages):

  • Hackerrank Achievements
  • Memes about Learning
  • Funny Dog Reactions

Link mentioned: Brain Dog Brian Dog GIF - Brain dog Brian dog Cooked - Discover & Share GIFs: Click to view the GIF


Unsloth AI (Daniel Han) ▷ #help (80 messages🔥🔥):

  • Unsloth Fine-Tuning
  • Inference Memory Issues
  • Flash Attention 2 and Xformers
  • CUDA Version Compatibility
  • Trainer Deprecation Notice



Unsloth AI (Daniel Han) ▷ #community-collaboration (1 messages):

  • Unsloth Docker Image



Unsloth AI (Daniel Han) ▷ #research (1 messages):

edd0302: https://arxiv.org/pdf/2410.20305

Wow! Cool implementation of flexattention!


Perplexity AI ▷ #general (125 messages🔥🔥):

  • Grok 2 Model
  • Perplexity Pro Subscription Issues
  • Image Uploads in Perplexity
  • Confusion Around Search Functions
  • Comparing Perplexity to ChatGPT

Link mentioned: Tweet from Aravind Srinivas (@AravSrinivas): Been enjoying using the Grok 2 model. Now on Perplexity iOS app too for Pro users. (Restart app if you don’t see it on “Settings->AI Model”)


Perplexity AI ▷ #sharing (9 messages🔥):

  • Quantum Computing
  • Detroit: Become Human
  • People Regulation
  • Research Papers Overview
  • AI-Written Code

Perplexity AI ▷ #pplx-api (1 messages):

  • API Citations
  • Feature Availability

OpenAI ▷ #annnouncements (2 messages):

  • Reddit AMA with OpenAI Executives
  • ChatGPT search enhancement

Link mentioned: Reddit - Dive into anything


OpenAI ▷ #ai-discussions (108 messages🔥🔥):

  • GPT-4 Training Updates
  • AI Art Debate
  • OpenAI's ChatGPT Search
  • AI in Business Consulting
  • Text-To-Image Generation



OpenAI ▷ #gpt-4-discussions (2 messages):

  • GPTs file handling
  • File conflict management

OpenAI ▷ #prompt-engineering (4 messages):

  • D&D DM GPT
  • AI generation limitations

OpenAI ▷ #api-discussions (4 messages):

  • DND DM GPT
  • AI generation limitations
  • Model context expansion

OpenRouter (Alex Atallah) ▷ #announcements (1 messages):

  • Request timeout issues
  • Network connection improvements

OpenRouter (Alex Atallah) ▷ #general (107 messages🔥🔥):

  • OpenAI Speech-to-Speech API
  • Claude 3.5 Debates
  • OpenRouter Credits and Models
  • Google Search Grounding in Gemini API
  • Llama 3.2 Usage Limits



OpenRouter (Alex Atallah) ▷ #beta-feedback (7 messages):

  • Integration Feature Request

aider (Paul Gauthier) ▷ #general (100 messages🔥🔥):

  • Aider features
  • Haiku 3.5 release
  • Continue as an AI coding assistant
  • Analytics feature in Aider
  • Challenges using Aider with Ollama



aider (Paul Gauthier) ▷ #questions-and-tips (5 messages):

  • Aider API
  • Aider Self-Scripting
  • Sonnet Performance Issues
  • State-Machine Parsing

Link mentioned: Scripting aider: You can script aider via the command line or python.


aider (Paul Gauthier) ▷ #links (7 messages):

  • Claude Desktop App
  • Anthropic Models
  • Electron Apps

Link mentioned: Tweet from Alex Albert (@alexalbert__): We built a Claude desktop app! Now available on Mac and Windows.


Eleuther ▷ #general (1 messages):

  • Open-sourced value heads

Eleuther ▷ #research (100 messages🔥🔥):

  • Universal Transformers (UTs)
  • Deep Equilibrium Networks (DEQs)
  • Timestep Shifting in Diffusion Models
  • Gradient Descent and Fixed Points
  • Parameter Efficiency in Model Designs



Latent Space ▷ #ai-general-chat (99 messages🔥🔥):

  • Jasper AI's Growth
  • OpenAI's Search Functionality
  • ChatGPT vs. Perplexity
  • New AI Tools and Models
  • Regulatory Approaches to AI



LM Studio ▷ #announcements (1 messages):

  • venvstacks
  • Apple MLX support
  • Python dependencies



LM Studio ▷ #general (57 messages🔥🔥):

  • LM Studio Features
  • User Experiences with LM Studio
  • System Prompts in API Requests
  • Quantization in LM Studio
  • Model Performance in Storytelling

LM Studio ▷ #hardware-discussion (16 messages🔥):

  • M2 Ultra performance
  • Mistral Large usage
  • AI chip in CoPilot PCs
  • Multi-Mac processing with Llama
  • LM Studio installation on Intel Macs

GPU MODE ▷ #general (4 messages):

  • Data type conversion in tensors
  • SYCL vs. CUDA discussion
  • First CUDA project recommendations
  • Matrix multiplication optimization in CUDA

Link mentioned: How to Optimize a CUDA Matmul Kernel for cuBLAS-like Performance: a Worklog: In this post, I’ll iteratively optimize an implementation of matrix multiplication written in CUDA. My goal is not to build a cuBLAS replacement, but to deepl...


GPU MODE ▷ #triton (8 messages🔥):

  • Triton Debug Barrier Behavior
  • Synchronization Across Blocks
  • Triton Casting Strategies
  • Kernel Implementation for Rescaling
  • vLLM FP8 Quantization Comparison

Link mentioned: vllm/csrc/quantization/fp8/common.cu at 55650c83a0c386526ed04912a0c60eccca202f3e · vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs - vllm-project/vllm


GPU MODE ▷ #torch (10 messages🔥):

  • CUDACXX Environment Variable
  • Momentum SR Testing
  • BitsAndBytes Stochastic Variants
  • Deprecated Python APIs
  • CUDA Allocator Familiarity



GPU MODE ▷ #cool-links (5 messages):

  • Efficiency in Deep Learning
  • Blog Feedback
  • Stable Efficient Algorithms

Link mentioned: Alex L. Zhang | A Meticulous Guide to Advances in Deep Learning Efficiency over the Years: A very long and thorough guide how deep learning algorithms, hardware, libraries, compilers, and more have become more efficient.


GPU MODE ▷ #beginner (16 messages🔥):

  • Quantization techniques
  • Flash Attention implementation
  • Use of torchao
  • GPU resource challenges
  • Accuracy benchmarks among quantization approaches

GPU MODE ▷ #off-topic (3 messages):

  • Asking Questions Culture
  • Question Clarity and Research
  • Server Vibes and Community
  • Advanced Topics Discussion

GPU MODE ▷ #triton-puzzles (1 messages):

  • Triton learning
  • Triton puzzle visualization
  • Patch updates

GPU MODE ▷ #liger-kernel (1 messages):

  • Speech Processing
  • Liger Kernel Issues
  • RoPE Implementation

Link mentioned: Issues · linkedin/Liger-Kernel: Efficient Triton Kernels for LLM Training. Contribute to linkedin/Liger-Kernel development by creating an account on GitHub.


GPU MODE ▷ #thunderkittens (9 messages🔥):

  • ThunderKittens Library
  • Mamba-2 Kernel
  • Livestream Announcement

Cohere ▷ #discussions (9 messages🔥):

  • Cohere API frontend options
  • Cohere Toolkit
  • Future of chatbots in web browsing

Link mentioned: GitHub - cohere-ai/cohere-toolkit: Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.: Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. - cohere-ai/cohere-toolkit


Cohere ▷ #questions (27 messages🔥):

  • Response Time Inquiry
  • Chatbot Browsing Simulation
  • Paper Writing Assistance
  • Aya Expanse Performance
  • Embedding Storage in ChromaDB



Cohere ▷ #api-discussions (7 messages):

  • Fine-tuning issues
  • ChatGPT browsing capabilities
  • R&D for ChatGPT alternatives

Cohere ▷ #projects (1 messages):

  • Application Review Process
  • Building Agents Experience

Cohere ▷ #cohere-toolkit (4 messages):

  • poetry installation issues
  • cohere-python package

Interconnects (Nathan Lambert) ▷ #news (5 messages):

  • Creative Writing Arena
  • SmolLM2 Launch
  • Model Evaluations on ARC



Interconnects (Nathan Lambert) ▷ #ml-questions (5 messages):

  • Midjourney Image Generation
  • Style Transfer Techniques
  • SemEval Task Scaling

Interconnects (Nathan Lambert) ▷ #ml-drama (7 messages):

  • Reproducing Issues
  • Bing Search Problems
  • GitHub Account Sketchiness

Link mentioned: Tweet from Paul Calcraft (@paul_cal): @sahir2k Bing also finds it, so it's a Bing problem if anything. Fwiw none of my private repos come up on Bing


Interconnects (Nathan Lambert) ▷ #random (18 messages🔥):

  • Llama 4 Training
  • Meta's Recruitment
  • US Elections Discussion



Interconnects (Nathan Lambert) ▷ #posts (6 messages):

  • Scarf profile pic guest
  • OG Discord friends
  • Podcast excitement

Stability.ai (Stable Diffusion) ▷ #general-chat (41 messages🔥):

  • Inpaint Tool Utility
  • Stable Diffusion Benchmarks
  • VAE Issues with Image Generation
  • Seeking Stable Diffusion Help
  • Workflow Preferences in Image Processing



Modular (Mojo 🔥) ▷ #general (5 messages):

  • Community Meeting on November 12th
  • Evan's LLVM Developers' Meeting talk
  • GPU developments in projects
  • Project collaboration efforts

Link mentioned: Modular Community Q&A


Modular (Mojo 🔥) ▷ #mojo (31 messages🔥):

  • C-style macros vs decorators
  • SQL query validation
  • Custom string interpolators
  • Static MLIR reflection
  • Algebraic types

DSPy ▷ #show-and-tell (2 messages):

  • Masters Thesis Graphic
  • CodeIt Implementation

Link mentioned: CodeIt Implementation: Self-Improving Language Models with Prioritized Hindsight Replay: CodeIt Implementation: Self-Improving Language Models with Prioritized Hindsight Replay - Codeit.md


DSPy ▷ #papers (3 messages):

  • WeKnow-RAG
  • XMC with In-Context Learning



DSPy ▷ #general (13 messages🔥):

  • DSPy Initiative Story
  • Running DSPy with Ollama
  • Chain of Thought vs Predict



OpenInterpreter ▷ #general (15 messages🔥):

  • Creating new profiles in Open Interpreter
  • Desktop client updates
  • Issues with --server command
  • OS mode limitations
  • Concerns about Anthropic API integration

Link mentioned: Profiles - Open Interpreter


OpenInterpreter ▷ #O1 (1 messages):

mikebirdtech: Did you get it working <@476060434818924544> ?


OpenInterpreter ▷ #ai-content (2 messages):

  • ChatGPT Search
  • Meta FAIR Robotics
  • Meta Sparsh
  • Meta Digit 360
  • Meta Digit Plexus



tinygrad (George Hotz) ▷ #general (3 messages):

  • NPU performance in Microsoft laptops
  • Qualcomm and Rockchip discussions
  • Open source excitement for NPU
  • TOSA as a compiler target
  • Discord community rules

tinygrad (George Hotz) ▷ #learn-tinygrad (12 messages🔥):

  • Tinygrad Model Exporting
  • Hailo Chip Reverse Engineering
  • Tensor Assignment in Lazy.py
  • ONNX Interfacing
  • BufferCopy vs CompiledRunner Issues



LlamaIndex ▷ #blog (2 messages):

  • automated research paper report generation
  • Open Telemetry integration

LlamaIndex ▷ #general (9 messages🔥):

  • Llamaparse challenges
  • Milvus database field standardization
  • Custom retriever with additional metadata
  • QueryFusionRetriever
  • Named Entity Recognition (NER) integration

LAION ▷ #general (5 messages):

  • Food Detection Models
  • Autoregressive Image Generation
  • Patch Artifacts
  • Variational Autoencoders

LAION ▷ #research (1 messages):

mkaic: https://arxiv.org/abs/2410.23168


Gorilla LLM (Berkeley Function Calling) ▷ #leaderboard (6 messages):

  • Parameter type errors
  • Evaluating custom models
  • Model Response Generation

OpenAccess AI Collective (axolotl) ▷ #general (2 messages):

  • SageAttention quantization
  • Axolotl Docker image release strategy

Link mentioned: SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration: The transformer architecture predominates across various models. As the heart of the transformer, attention has a computational complexity of O(N^2), compared to O(N) for linear transformations. When ...


OpenAccess AI Collective (axolotl) ▷ #general-help (1 messages):

  • H100 compatibility
  • bitsandbytes updates

LangChain AI ▷ #general (1 messages):

  • Hugging Face Docs
  • Custom Models

LangChain AI ▷ #share-your-work (1 messages):

  • Chat Applications
  • Ollama

Alignment Lab AI ▷ #ai-and-ml-discussion (1 messages):

tpojd: steam gift 50$ - steamcommunity.com/gift-card/pay/50
@everyone


Alignment Lab AI ▷ #general (1 messages):

tpojd: steam gift 50$ - steamcommunity.com/gift-card/pay/50
@everyone


LLM Agents (Berkeley MOOC) ▷ #mooc-lecture-discussion (1 messages):

evilspartan98: Interested







{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AINews, please share with a friend! Thanks in advance!

{% endif %}