Frozen AI News archive

Grok 2! and ChatGPT-4o-latest confuses everybody

**OpenAI** quietly released a new **GPT-4o** model in ChatGPT, distinct from the API version, reclaiming the #1 spot on Lmsys arena benchmarks across multiple categories including math, coding, and instruction-following. Meanwhile, **X.ai** launched **Grok 2**, outperforming **Claude 3.5 Sonnet** and previous GPT-4o versions, with plans for enterprise API release. Grok 2 integrates **Black Forest Labs' Flux.1**, an open-source text-to-image model surpassing **Stable Diffusion 3**. **Google DeepMind** announced **Gemini Advanced** with enhanced conversational features and Pixel device integration. AI researcher **ylecun** highlighted LLM limitations in learning and creativity, while **rohanpaul_ai** discussed an AI Scientist system generating publishable ML research at low cost. **karpathy** warned of security risks in LLM tokenizers akin to SQL injection.


AI News for 8/13/2024-8/14/2024. We checked 7 subreddits, 384 Twitters and 29 Discords (253 channels, and 2414 messages) for you. Estimated reading time saved (at 200wpm): 294 minutes. You can now tag @smol_ai for AINews discussions!

The easier development to discuss is the ratification of the new GPT-4o model that was quietly released in ChatGPT last week. To be clear, this is DIFFERENT from the OTHER gpt-4o model released last week in the API (the one we covered with structured outputs).


Approximately nobody is exactly happy about this, from the new naming structure, to the ever more creatively lowkey release, and even to the model performance, which is impressive: it reclaims the #1 spot on the Lmsys arena from Gemini 1.5 Pro August.

New ChatGPT-4o Category Rankings:

The much cleaner story to tell is X.ai's Grok 2, which was released at 11pm PT last night and was revealed to be sus-column-r, which was NOT Cohere as many had previously suspected. Grok 2 beats Claude 3.5 Sonnet as well as GPT-4o May and Mini.

While the main feature of Grok 1 (our coverage here) was its open weights, Grok 2 is being released to premium subscribers on X, though the blog post teases that both Grok-2 and Grok-2 mini will be released on X's new Enterprise API platform "later this month".

Grok 2 in X also integrates Black Forest Labs' comparatively uncensored Flux.1 (our coverage here) model, which has already superseded Stable Diffusion 3 in the open source text-to-image community (while Google's Imagen 3 edges toward more openness with its new paper release).


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

AI Model Updates and Capabilities

AI Development and Tools

Industry and Research Trends


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. New Open-Source LLM Releases: InternLM2.5

Theme 2. Advanced AI Agents with Desktop Control

Theme 3. Grok 2.0 Mini Surprises in LMSYS Arena

All AI Reddit Recap

r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, /r/LLMDevs, /r/Singularity

AI Model Releases and Improvements

AI Development and Industry News

Community Moderation


AI Discord Recap

A summary of Summaries of Summaries by GPT4O (gpt-4o-2024-05-13)

1. LLM Model Advancements

2. Prompt Engineering Techniques

3. API Performance and Optimization

4. Open-Source AI Tools

5. Model Deployment and Integration

GPT4OMini (gpt-4o-mini-2024-07-18)

1. Grok-2 and Model Performance

2. Quantization Techniques and Model Merging

3. Open Source Tools and Community Contributions

4. AI Model Limitations and Improvements

5. AI Security and Ethical Considerations


PART 1: High level Discord summaries

Unsloth AI (Daniel Han) Discord


Nous Research AI Discord


Stability.ai (Stable Diffusion) Discord


OpenRouter (Alex Atallah) Discord


Modular (Mojo 🔥) Discord


OpenAI Discord


Perplexity AI Discord


LM Studio Discord


Latent Space Discord


Interconnects (Nathan Lambert) Discord


LlamaIndex Discord


Cohere Discord


LangChain AI Discord


Torchtune Discord


OpenAccess AI Collective (axolotl) Discord


LAION Discord


tinygrad (George Hotz) Discord


OpenInterpreter Discord


MLOps @Chipro Discord


DiscoResearch Discord


LLM Finetuning (Hamel + Dan) Discord


The Alignment Lab AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Unsloth AI (Daniel Han) ▷ #general (213 messages🔥🔥):

  • Hermes 2
  • Mistral struggles
  • Model Merging
  • Open Empathic
  • HQQ+

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (6 messages):

  • Self-Promotion on Discord
  • Mosquitoes in the Wild

Link mentioned: латы Armor GIF - Латы Armor - Discover & Share GIFs: Click to view the GIF


Unsloth AI (Daniel Han) ▷ #help (75 messages🔥🔥):

  • Unsloth Inference
  • GPU vs CPU
  • VRAM
  • Custom Datasets
  • Alpaca-Cleaned Dataset

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (20 messages🔥):

  • Model Card Typos
  • Model's Capabilities
  • Multi-Lingual LLM
  • Dataset-tool for RP
  • TheatreLM-v2.1-Characters

Links mentioned:


Unsloth AI (Daniel Han) ▷ #research (3 messages):

  • Llama 3.1
  • Causal Mask
  • Causal Masking

Nous Research AI ▷ #datasets (1 message):

zhukov_80921: https://huggingface.co/datasets/bigcode/the-stack-v2 60tb of code


Nous Research AI ▷ #off-topic (4 messages):

  • Grok-2
  • ComfyUI
  • Open Source AI

Links mentioned:


Nous Research AI ▷ #interesting-links (2 messages):

  • Semantic Chunking
  • Regex Tokenization
  • Tokenizer API
  • Tiktoken Free Usage

Links mentioned:


Nous Research AI ▷ #general (155 messages🔥🔥):

  • Dataset Filtering and Scoring Tool
  • LMSYS Leaderboard
  • Grok-2
  • OpenAI ChatGPT-4o
  • HQQ+

Links mentioned:


Nous Research AI ▷ #ask-about-llms (22 messages🔥):

  • FP8 training
  • Nemotron
  • FP8 vs BF16 performance
  • Mosaic AI
  • Character.AI

Link mentioned: Turbocharged Training: Optimizing the Databricks Mosaic AI Stack With FP8: Benchmarking for training (dense) models at scale. We demonstrate great performance (very high MFU) and highlight our use of NVIDIA's Transformer Engine, along with PyTorch FSDP and DTensor.


Nous Research AI ▷ #rag-dataset (1 message):

.bexboy: Yep


Stability.ai (Stable Diffusion) ▷ #general-chat (134 messages🔥🔥):

  • AMD GPU
  • ControlNet
  • ComfyUI
  • SD3
  • Flux

Links mentioned:


OpenRouter (Alex Atallah) ▷ #announcements (1 message):

louisgv: ChatGPT-4o-latest is now available: https://openrouter.ai/models/openai/chatgpt-4o-latest


OpenRouter (Alex Atallah) ▷ #general (127 messages🔥🔥):

  • AgentQ
  • Infer
  • OpenRouter Pricing
  • ChatGPT-4o-Latest
  • Codeium

Links mentioned:


Modular (Mojo 🔥) ▷ #general (80 messages🔥🔥):

  • Mojo performance
  • Mojo benchmark
  • Rust vs C/C++
  • Mojo vs Go
  • Mojo threading

Links mentioned:


Modular (Mojo 🔥) ▷ #mojo (32 messages🔥):

  • Mojo RPM Build
  • Mojo on RHEL machines
  • Magic CLI
  • Mojo as a Conda Package
  • Mojo Language Version Management

Link mentioned: MAX + Mojo Community Meetings #6: This is a video about MAX & Mojo Community Meetings #6. 00:00 Introduction, 00:27 Small buffer and string optimizations, 13:04 DuckDB bindings in Mojo, 23:15 MAX an...


OpenAI ▷ #ai-discussions (88 messages🔥🔥):

  • Gemini Advanced
  • Gemini Live Talk
  • GPT-4o Advanced Voice Mode
  • Model Limitations
  • Prompt Engineering

Link mentioned: 124+ Most OVERUSED Words By ChatGPT In 2024: no description found


OpenAI ▷ #gpt-4-discussions (4 messages):

  • Vision's performance
  • Image Deformations
  • Vision's limitations

OpenAI ▷ #prompt-engineering (8 messages🔥):

  • Critical Thinking Techniques
  • GPTs and Web Searching
  • Reclaiming Business Assets
  • Developer Mode Prompts

OpenAI ▷ #api-discussions (8 messages🔥):

  • Critical Thinking Techniques
  • Prompt Engineering
  • GPTs and Web Search
  • Business Asset Reclamation
  • Developer Mode

Perplexity AI ▷ #general (62 messages🔥🔥):

  • Perplexity Performance Issues
  • Perplexity Pro Lag
  • Sonnet vs Opus vs Perplexity
  • Perplexity's Website Update
  • Perplexity Support Team

Links mentioned:


Perplexity AI ▷ #sharing (7 messages):

  • Shareable Threads
  • Radioactive Shoe Fitting
  • Perplexity Pro

Links mentioned:

  • Barter...: Certainly! Here's a professional and minimalist version of the text: Barter transactions, involving the exchange of goods or services without money, are...
  • Articles about ia and climate change: Recent articles highlight the dual role of artificial intelligence (AI) in addressing climate change, emphasizing both its potential benefits and...
  • Perplexity: Perplexity is a free AI-powered answer engine that provides accurate, trusted, and real-time answers to any question.


Perplexity AI ▷ #pplx-api (38 messages🔥):

  • Perplexity API response formatting
  • Function calling
  • JSONSchema7 validation
  • Model prompt engineering
  • Markdown to HTML conversion

LM Studio ▷ #general (71 messages🔥🔥):

  • Gemma-2
  • LM Studio on Android
  • LM Studio on external hard drive
  • LM Studio on Ubuntu
  • Multimodal LLMs

Links mentioned:


LM Studio ▷ #hardware-discussion (28 messages🔥):

  • GPU Copper Mod
  • GPU Bios Flashing
  • Text Classification Model Compatibility
  • GPU Offloading

Latent Space ▷ #ai-general-chat (79 messages🔥🔥):

  • MultiOn System Prompt Leak
  • AnswerAI ColBERT
  • Gemini Live Demo
  • GPT-4o Improvements
  • Grok 2

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (60 messages🔥🔥):

  • Grok-2
  • Anthropic API
  • Anthropic's Turnaround
  • DeepSeek
  • Llama

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-drama (7 messages):

  • GPT-4o improvements
  • ChatGPT API

Links mentioned:


Interconnects (Nathan Lambert) ▷ #random (1 message):

  • AI Copyright Discourse
  • Oligopoly
  • ACL2024NLP

Link mentioned: Tweet from Asad Sayeed (@asayeed): this is the end result of AI copyright discourse: oligopoly #ACL2024NLP


Interconnects (Nathan Lambert) ▷ #posts (1 message):

SnailBot News: <@&1216534966205284433>


LlamaIndex ▷ #blog (3 messages):

  • LlamaIndex Box Reader
  • Relik Knowledge Graph
  • Azure AI Search RAG System

LlamaIndex ▷ #general (64 messages🔥🔥):

  • Inconsistent OpenAI Responses
  • Prompt Engineering
  • LlamaIndex
  • Chatbot Memory
  • GraphRAG

Links mentioned:


Cohere ▷ #discussions (17 messages🔥):

  • Grok 2
  • xAI
  • Cohere
  • OpenAI
  • Model Performance

Link mentioned: Tweet from Xuechen Li (@lxuechen): Have been post-training Grok2 for a while and am excited to share that it’s officially out!! We’ve been testing early versions of Grok2 on LMSYS chatbot arena under the names of sus-column-r and colu...


Cohere ▷ #questions (23 messages🔥):

  • Reranking Overview Document
  • Rerank API
  • Code Sample

Links mentioned:


Cohere ▷ #cohere-toolkit (10 messages🔥):

  • Cohere Toolkit Installation
  • Custom Deployment Issue
  • OpenAI Integration
  • Enterprise Search Chatbot
  • Fellowship.ai Cohort

LangChain AI ▷ #general (27 messages🔥):

  • LangChain support
  • LangSmith evaluation
  • LangGraph Cloud Access
  • LangChain Postgres Library
  • LLM Caching

Link mentioned: Model caches | 🦜️🔗 LangChain: This notebook covers how to cache results of individual LLM calls using different caches.


LangChain AI ▷ #share-your-work (3 messages):

  • Rubik's AI
  • AI security
  • RedOps platform
  • Chatbot security
  • Voicebot security

Links mentioned:


LangChain AI ▷ #tutorials (1 message):

  • LangGraph
  • AI Agents
  • Email Management
  • Meeting Scheduling

Torchtune ▷ #dev (19 messages🔥):

  • Torchtune Compile Model+Loss
  • Torchtune CPU Offload Optimizer
  • Torchtune Model Size & Configuration

OpenAccess AI Collective (axolotl) ▷ #general (14 messages🔥):

  • Grok 2
  • Grok 2 mini
  • LMSYS
  • Claude
  • GPT-4

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (1 message):

  • axolotl model loading conditions
  • axolotl model loading

OpenAccess AI Collective (axolotl) ▷ #general-help (3 messages):

  • OpenAI Chat Endpoint Limitations
  • Assistant Response Continuation

LAION ▷ #general (13 messages🔥):

  • Open-source image annotation GUIs
  • Elon Musk and weight licenses
  • Schnelle

LAION ▷ #research (4 messages):

  • Grok-2 release
  • Grok-2 mini
  • Grok-2 performance
  • Grok-2 API
  • Grok-2 multimodality

Link mentioned: Grok-2 Beta Release: no description found


tinygrad (George Hotz) ▷ #general (10 messages🔥):

  • ConvTranspose2D
  • 3D data
  • kernel_size

tinygrad (George Hotz) ▷ #learn-tinygrad (5 messages):

  • Tinygrad Error: wait_result: 10000 ms TIMEOUT!
  • Lazycache Issues
  • CLANG=1 issue

OpenInterpreter ▷ #general (4 messages):

  • OpenInterpreter release
  • Local LLMs
  • RealtimeSTT
  • Faster-Whisper

OpenInterpreter ▷ #O1 (3 messages):

  • Hardware Channel

OpenInterpreter ▷ #ai-content (6 messages):

  • Open Interpreter
  • Tool Use Tuesday
  • Obsidian Plugin
  • Video Production
  • Manim

Links mentioned:


MLOps @Chipro ▷ #events (8 messages🔥):

  • Poe Hackathon
  • Modal labs
  • LLM fine-tuning

Links mentioned:


MLOps @Chipro ▷ #general-ml (3 messages):

  • Image Feature Extraction
  • Preprocessing Time Reduction

DiscoResearch ▷ #general (2 messages):


LLM Finetuning (Hamel + Dan) ▷ #general (1 message):

  • Agentic AI Pipelines
  • Jupyter Notebook Automation
  • Devin-like System



{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}