Frozen AI News archive

Claude 3 just destroyed GPT-4 (see for yourself)

**Claude 3** from **Anthropic** launches in three sizes: Haiku (small, unreleased at launch), Sonnet (medium, the default on claude.ai, AWS, and GCP), and Opus (large, available with Claude Pro). Opus outperforms **GPT-4** on key benchmarks like GPQA, impressing the benchmark's authors. All models support **multimodality** with strong vision capabilities, including a demo converting a 2-hour video into a blog post. Claude 3 offers improved alignment, fewer refusals, and a 200K-token context window at launch, with inputs over **1 million tokens** available to select customers and near-perfect recall on long-context tests. Haiku is noted for speed and cost-efficiency, processing a dense research paper in under three seconds. The models excel at following complex instructions and producing structured outputs like JSON. Safety improvements reduce refusal rates, though some expert criticism remains. Claude 3 was trained in part on synthetic data and posts strong domain-specific evaluation results in finance, medicine, and philosophy.
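For readers who want to kick the tires themselves, here is a minimal sketch of a Claude 3 request via Anthropic's Python SDK (Messages API; `claude-3-opus-20240229` was the Opus model id at launch). The payload builder is separated from the network call so it can be inspected without an API key; the prompt text is purely illustrative.

```python
# Minimal sketch of a Claude 3 Messages API request (Anthropic Python SDK).
# Building the payload is separated from sending so it can be inspected offline.

def build_request(prompt: str, model: str = "claude-3-opus-20240229") -> dict:
    """Assemble the keyword arguments for client.messages.create()."""
    return {
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }

# Sending requires the `anthropic` package and an ANTHROPIC_API_KEY:
#   import anthropic
#   client = anthropic.Anthropic()
#   reply = client.messages.create(**build_request("Summarize this thread: ..."))
#   print(reply.content[0].text)
```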


Claude 3 is here! Nothing else from the weekend matters in comparison, which is awfully nice for weekday newsletter writers.


TLDR:

Our full notes below:

As a bonus, Noah did two runs of Claude 3 (Sonnet) vs GPT-4 on the same Twitter data scrapes you see below. We think Claude 3's summarization capabilities are way, way better.


Table of Contents

[TOC]


PART X: AI Twitter

Compare Claude 3 vs GPT-4T

AI Progress and Capabilities

AI Investments and Business

AI Safety and Regulation

Memes and Humor

Other Relevant Tweets for AI Engineers


PART 0: Summary of Summaries of Summaries

This is now also driven by Claude 3, whose summaries are noticeably better than OpenAI's output.


PART 1: High level Discord summaries

TheBloke Discord Summary


OpenAI Discord Summary


Perplexity AI Discord Summary


Mistral Discord Summary

Hermes 2.5 Takes the Lead: Discussions in the guild revealed that Hermes 2.5 has unexpectedly outperformed Hermes 2 in various benchmarks, with specific reference to MMLU performance, a significant point for those considering upgrades or new deployments.

Mistral Deployment and Configuration Insights: Engineers seeking optimal configurations for Mistral deployment gathered valuable advice, with best practices discussed for a dual NVIDIA 3090 setup, VRAM requirements at fp16 precision (~90GB), and quantization strategies. Curious eyes were also pointed towards TheBloke's Discord for additional community support.
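The VRAM numbers above can be sanity-checked with simple arithmetic: weights alone need parameters × bytes-per-parameter, plus headroom for the KV cache and activations. A small calculator as a sketch; the ~90GB fp16 figure quoted above is consistent with a roughly 45B-parameter model at 2 bytes per weight, and the 20% default overhead factor here is an assumption, not a measured value:

```python
def estimate_vram_gb(n_params: float, bits_per_param: int, overhead: float = 1.2) -> float:
    """Rough VRAM estimate: weight bytes times an overhead factor for
    KV cache / activations (the overhead factor is a guess, not measured)."""
    weight_bytes = n_params * bits_per_param / 8
    return weight_bytes * overhead / 1e9  # decimal GB

# Weights-only (overhead=1.0) for a 46.7B-parameter model:
print(estimate_vram_gb(46.7e9, 16, overhead=1.0))  # ~93.4 GB in fp16
print(estimate_vram_gb(46.7e9, 4, overhead=1.0))   # ~23.4 GB at 4-bit
```

This is why a dual-3090 box (48GB total) needs quantization to host a model of that size at all.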

Benchmarks Resonating With Personal Experience: A significant number of posts revolved around performance benchmarks and personal experiences with different models. Particularly intriguing was the reported superiority of Mistral Large over GPT-4 for coding tasks, contradicting official benchmark results and signaling the need for user-specific benchmarks.

Discussions Hover Around Model Limitations: Technical dialogues converged on the inherent limitations of models such as Mistral and Mixtral, specifically discussing the context size constraints with a 32k token limit for Mistral-7B-Instruct-v0.2, and sliding window functionality issues leading to possible performance degradation.
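The sliding-window issues mentioned above come down to the attention mask: with a window of size w, position i attends only to the last w positions, so older information must propagate through stacked layers rather than direct attention. A minimal numpy sketch of the two masks (window size 4096 is the value Mistral-7B-v0.1 used; v0.2 reportedly dropped the sliding window in favor of full 32k attention):

```python
import numpy as np

def causal_mask(n: int) -> np.ndarray:
    """True where position i may attend to position j (full causal)."""
    i, j = np.indices((n, n))
    return j <= i

def sliding_window_mask(n: int, window: int) -> np.ndarray:
    """Causal mask restricted to the last `window` positions."""
    i, j = np.indices((n, n))
    return (j <= i) & (i - j < window)

# With a toy window of 4, position 10 cannot see position 0 directly:
m = sliding_window_mask(16, 4)
print(m[10, 0], m[10, 7], m[10, 10])  # False True True
```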

Fine Tuning and Usage Nuances Explored: Users shared insights on successfully leveraging models for specific tasks, such as sentiment analysis and scientific reasoning. However, concerns about Mixtral's training implementation and requests for a minimalistic guide suggest a demand for clearer documentation within the community.

Emerging AI Tools and Competitive Landscape: Enthusiasts and practitioners alike have turned their attention to emerging AI tools, including Kubernetes AI tooling and Anthropic's release of Claude-3, sparking discussions on competitive offerings and the importance of open weights for AI models.


Nous Research AI Discord Summary


Eleuther Discord Summary


LM Studio Discord Summary


HuggingFace Discord Summary


LAION Discord Summary


CUDA MODE Discord Summary

Swap Space on Speed Dial: Discussion centered on using GPU VRAM as swap space on Linux, with potential speed advantages over traditional disk paging, although possible demand conflicts were noted. Resources like vramfs on GitHub and ArchLinux documentation were shared.

Rapid Verification and Chat Retrievals: Users sought assistance accessing the previous day's live chat discussions and queried about Gmail verification times on lightning.ai, highlighting quick resolution times and the ease of accessing recorded sessions.

CUDA Conundrums and Triton Tweaks: Engineers shared insights into CUDA programming difficulties, examining Triton's relationship to NVCC and asynchronous matrix multiplication in Hopper architecture. Resources such as the unsloth repository and the Triton GitHub page were highlighted.

GPU-Powered Databases: The idea of running databases on GPUs gained traction, with mentions of the cuDF library and reference to a ZDNet article on GPU databases.
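Part of cuDF's appeal is that it mirrors the pandas API, so analytical queries like group-by aggregations move to the GPU with minimal code changes. A hedged sketch: cuDF needs an NVIDIA GPU and a RAPIDS install, so the fallback import keeps the example runnable on CPU; the calls used below exist in both libraries.

```python
try:
    import cudf as xdf   # GPU path: requires NVIDIA GPU + RAPIDS install
except ImportError:
    import pandas as xdf  # CPU fallback; the API used below is shared

# A toy analytical query: total sales per region.
df = xdf.DataFrame({
    "region": ["eu", "us", "eu", "us"],
    "sales": [10, 20, 30, 40],
})
totals = df.groupby("region")["sales"].sum()
print(totals)
```

With cuDF installed, the same two calls execute on the GPU, which is the trick the GPU-database discussion was pointing at.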

Mistral's Computation Contemplations: Debates arose over Mistral's computing capabilities, questioning the adequacy of 1.5k H100 GPUs for large-scale model training and discussing asynchronous operations. Links included NVIDIA's cuBLASDx documentation and a tweet from Arthur Mensch.
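Whether 1.5k H100s is "enough" can be reasoned about with the standard ~6·N·D rule for dense-transformer training FLOPs. A back-of-envelope sketch; the model size, token count, per-GPU throughput, and utilization below are illustrative assumptions, not figures from the discussion:

```python
def training_days(n_params: float, n_tokens: float, n_gpus: int,
                  flops_per_gpu: float, mfu: float) -> float:
    """Days to train a dense transformer, using total FLOPs ~ 6 * N * D."""
    total_flops = 6 * n_params * n_tokens
    sustained_flops_per_s = n_gpus * flops_per_gpu * mfu
    return total_flops / sustained_flops_per_s / 86_400

# Hypothetical run: 70B params, 2T tokens, 1500 H100s,
# ~1e15 peak BF16 FLOP/s each, 40% utilization.
print(round(training_days(70e9, 2e12, 1500, 1e15, 0.40), 1))  # ~16.2 days
```

Under those assumptions the cluster is plausible for large-scale training; the debate is really about which assumptions hold.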

PyTorch Developer Podcast Drops New Episode: The podcast's episode discussing AoTInductor was shared, echoing community enthusiasm for the series.

Ring Attention Rings a Bell: Ring and Striped Attention were hot topics, with references to discussions on the YK Discord and a Together.ai blog post. Various code bases like ring-flash-attention and flash-attention provided implementation insights.
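Ring Attention's core observation is that softmax attention can be computed blockwise: each device holds one KV block, keeps a running max and normalizer, and passes blocks around the ring. A single-process numpy sketch of just the blockwise accumulation (the ring communication and causal masking are elided; this only demonstrates that chunked accumulation matches full attention):

```python
import numpy as np

def full_attention(q, k, v):
    """Reference softmax attention over the whole KV at once."""
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    return (p / p.sum(axis=-1, keepdims=True)) @ v

def blockwise_attention(q, k, v, n_blocks=4):
    """Process KV in chunks with online-softmax accumulation,
    as each ring step would; never materializes the full score matrix."""
    scale = 1.0 / np.sqrt(q.shape[-1])
    m = np.full((q.shape[0], 1), -np.inf)  # running row max
    l = np.zeros((q.shape[0], 1))          # running normalizer
    o = np.zeros_like(q)                   # running (unnormalized) output
    for kb, vb in zip(np.array_split(k, n_blocks), np.array_split(v, n_blocks)):
        s = q @ kb.T * scale
        m_new = np.maximum(m, s.max(axis=-1, keepdims=True))
        correction = np.exp(m - m_new)     # rescale previous accumulators
        p = np.exp(s - m_new)
        l = l * correction + p.sum(axis=-1, keepdims=True)
        o = o * correction + p @ vb
        m = m_new
    return o / l
```

The same accumulation underlies flash-attention kernels; the ring variant just distributes the KV blocks across devices.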

CUDA-MODE Lecture Loaded: Announcement of Lecture 8 on CUDA performance gotchas with a promise of tricks for maximizing occupancy and minimizing issues, set to start promptly for eager learners.

Career Cornerstones: Job postings by Lamini AI and Quadrature aimed at HPC and GPU Optimization Engineers, highlighting opportunities to work on exciting projects such as optimizing LLMs on AMD GPUs and AI workloads in global financial markets. Details can be found on Lamini AI Careers and Quadrature Careers.

Lecture 8 Redux on YouTube: After technical issues with a prior recording, Lecture 8, titled CUDA Performance Checklist, was re-recorded and shared along with corresponding code samples and slides.


LlamaIndex Discord Summary


OpenRouter (Alex Atallah) Discord Summary

Links mentioned:


LLM Perf Enthusiasts AI Discord Summary


Interconnects (Nathan Lambert) Discord Summary


LangChain AI Discord Summary


Latent Space Discord Summary


OpenAccess AI Collective (axolotl) Discord Summary


DiscoResearch Discord Summary


Datasette - LLM (@SimonW) Discord Summary


Alignment Lab AI Discord Summary


Skunkworks AI Discord Summary


AI Engineer Foundation Discord Summary


PART 2: Detailed by-Channel summaries and links

TheBloke ▷ #general (994 messages🔥🔥🔥):

Links mentioned:


TheBloke ▷ #characters-roleplay-stories (379 messages🔥🔥):

Links mentioned:


TheBloke ▷ #training-and-fine-tuning (39 messages🔥):

Links mentioned:


TheBloke ▷ #model-merging (1 messages):


TheBloke ▷ #coding (11 messages🔥):

Links mentioned:


OpenAI ▷ #ai-discussions (128 messages🔥🔥):

Links mentioned:


OpenAI ▷ #gpt-4-discussions (38 messages🔥):


OpenAI ▷ #prompt-engineering (506 messages🔥🔥🔥):

Links mentioned:


OpenAI ▷ #api-discussions (506 messages🔥🔥🔥):

Links mentioned:


Perplexity AI ▷ #general (618 messages🔥🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (20 messages🔥):


Perplexity AI ▷ #pplx-api (27 messages🔥):

Links mentioned:

Perplexity Blog: Explore Perplexity's blog for articles, announcements, product updates, and tips to optimize your experience. Stay informed and make the most of Perplexity.


Mistral ▷ #general (213 messages🔥🔥):

Links mentioned:


Mistral ▷ #models (79 messages🔥🔥):

Links mentioned:

LLM Visualization: no description found


Mistral ▷ #deployment (17 messages🔥):


Mistral ▷ #ref-implem (1 messages):


Mistral ▷ #finetuning (1 messages):


Mistral ▷ #showcase (3 messages):

Links mentioned:


Mistral ▷ #random (13 messages🔥):

Links mentioned:

GitHub - treebeardtech/terraform-helm-kubeflow: Kubeflow Terraform Modules - run Jupyter in Kubernetes 🪐: Kubeflow Terraform Modules - run Jupyter in Kubernetes 🪐 - treebeardtech/terraform-helm-kubeflow


Mistral ▷ #la-plateforme (82 messages🔥🔥):

Links mentioned:


Mistral ▷ #le-chat (126 messages🔥🔥):

Links mentioned:


Mistral ▷ #failed-prompts (13 messages🔥):


Nous Research AI ▷ #ctx-length-research (5 messages):

Links mentioned:


Nous Research AI ▷ #off-topic (31 messages🔥):

Links mentioned:


Nous Research AI ▷ #interesting-links (49 messages🔥):

Links mentioned:


Nous Research AI ▷ #general (328 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (32 messages🔥):

Links mentioned:


Nous Research AI ▷ #project-obsidian (2 messages):

Links mentioned:

GitHub - vikhyat/moondream: tiny vision language model: tiny vision language model. Contribute to vikhyat/moondream development by creating an account on GitHub.


Eleuther ▷ #general (197 messages🔥🔥):

Links mentioned:


Eleuther ▷ #research (115 messages🔥🔥):

Links mentioned:


Eleuther ▷ #scaling-laws (1 messages):


Eleuther ▷ #interpretability-general (50 messages🔥):

Links mentioned:


Eleuther ▷ #lm-thunderdome (71 messages🔥🔥):

Links mentioned:


Eleuther ▷ #multimodal-general (1 messages):

besiktas: havent really seen anything and have wondered/experimented this as well


Eleuther ▷ #gpt-neox-dev (2 messages):

Links mentioned:

the-pile/processing_scripts at master · EleutherAI/the-pile: Contribute to EleutherAI/the-pile development by creating an account on GitHub.


LM Studio ▷ #💬-general (155 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (49 messages🔥):

Links mentioned:


LM Studio ▷ #🧠-feedback (5 messages):


LM Studio ▷ #🎛-hardware-discussion (114 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🧪-beta-releases-chat (2 messages):

Links mentioned:


LM Studio ▷ #autogen (4 messages):

Links mentioned:


LM Studio ▷ #memgpt (1 messages):

triffed.: <@1211375065191682131> it exists i'm on arch i just used yay to get it


LM Studio ▷ #avx-beta (1 messages):

.tntflo: Can we get this for linux too


LM Studio ▷ #crew-ai (5 messages):


HuggingFace ▷ #general (121 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #today-im-learning (5 messages):


HuggingFace ▷ #cool-finds (7 messages):

Links mentioned:


HuggingFace ▷ #i-made-this (8 messages🔥):

Links mentioned:


HuggingFace ▷ #reading-group (67 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #core-announcements (1 messages):

Links mentioned:

Support EDM-style training in DreamBooth LoRA SDXL script by sayakpaul · Pull Request #7126 · huggingface/diffusers: Command example: CUDA_VISIBLE_DEVICES=1 accelerate launch train_dreambooth_lora_sdxl.py \ --pretrained_model_name_or_path="playgroundai/playground-v2.5-1024px-aesthetic" \ --instance_da...


HuggingFace ▷ #diffusion-discussions (21 messages🔥):

Links mentioned:


HuggingFace ▷ #computer-vision (7 messages):


HuggingFace ▷ #NLP (15 messages🔥):

Links mentioned:


HuggingFace ▷ #diffusion-discussions (21 messages🔥):

Links mentioned:


LAION ▷ #general (238 messages🔥🔥):

Links mentioned:


LAION ▷ #research (11 messages🔥):

Links mentioned:

Reddit - Dive into anything: no description found


LAION ▷ #learning-ml (1 messages):


CUDA MODE ▷ #general (21 messages🔥):

Links mentioned:


CUDA MODE ▷ #triton (11 messages🔥):

Links mentioned:


CUDA MODE ▷ #cuda (113 messages🔥🔥):

Links mentioned:


CUDA MODE ▷ #torch (3 messages):

Links mentioned:



CUDA MODE ▷ #announcements (1 messages):


CUDA MODE ▷ #suggestions (5 messages):

Links mentioned:


CUDA MODE ▷ #jobs (4 messages):

Links mentioned:


CUDA MODE ▷ #beginner (11 messages🔥):

Links mentioned:


CUDA MODE ▷ #youtube-recordings (5 messages):

Links mentioned:

Lecture 8: CUDA Performance Checklist: Code https://github.com/cuda-mode/lectures/tree/main/lecture8 | Slides https://docs.google.com/presentation/d/1cvVpf3ChFFiY4Kf25S4e4sPY6Y5uRUO-X-A4nJ7IhFE/edit


CUDA MODE ▷ #ring-attention (53 messages🔥):

Links mentioned:


LlamaIndex ▷ #blog (5 messages):

Links mentioned:

ADU Planner: Revolutionize the ADU construction process with our GAI-powered ADU planner, a brand new solution to provide effortless design, local compliance, and quick supplier connections in one click.


LlamaIndex ▷ #general (178 messages🔥🔥):

Links mentioned:


LlamaIndex ▷ #ai-discussion (1 messages):

Links mentioned:

Empowering Long Context RAG: The Integration of LlamaIndex with LongContext: Ankush k Singal


OpenRouter (Alex Atallah) ▷ #announcements (1 messages):


OpenRouter (Alex Atallah) ▷ #app-showcase (1 messages):

Links mentioned:

Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.


OpenRouter (Alex Atallah) ▷ #general (96 messages🔥🔥):

Links mentioned:


LLM Perf Enthusiasts AI ▷ #general (1 messages):


LLM Perf Enthusiasts AI ▷ #claude (71 messages🔥🔥):

Links mentioned:


LLM Perf Enthusiasts AI ▷ #embeddings (10 messages🔥):


Interconnects (Nathan Lambert) ▷ #news (43 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-drama (6 messages):


Interconnects (Nathan Lambert) ▷ #random (24 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (1 messages):

natolambert: TBT this was the best meme day


Interconnects (Nathan Lambert) ▷ #rl (5 messages):


LangChain AI ▷ #general (56 messages🔥🔥):

Links mentioned:


LangChain AI ▷ #langserve (5 messages):

Links mentioned:

21 YEARS TOGETHER Get a $50 gift card!: Steam is the ultimate destination for playing, discussing, and creating games.


LangChain AI ▷ #langchain-templates (2 messages):


LangChain AI ▷ #share-your-work (9 messages🔥):

Links mentioned:


LangChain AI ▷ #tutorials (3 messages):

Links mentioned:


Latent Space ▷ #ai-general-chat (51 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general (22 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (12 messages🔥):


OpenAccess AI Collective (axolotl) ▷ #general-help (6 messages):

Links mentioned:

Hyperparameter optimization CLI · Issue #1356 · OpenAccess-AI-Collective/axolotl: ⚠️ Please check that this feature request hasn't been suggested before. I searched previous Ideas in Discussions didn't find any similar feature requests. I searched previous Issues didn't...


OpenAccess AI Collective (axolotl) ▷ #community-showcase (4 messages):


DiscoResearch ▷ #disco_judge (2 messages):

Links mentioned:

GitHub - StonyBrookNLP/tellmewhy: Website for release of TellMeWhy dataset for why question answering: Website for release of TellMeWhy dataset for why question answering - StonyBrookNLP/tellmewhy


DiscoResearch ▷ #general (8 messages🔥):

Links mentioned:

AI in Production - AI strategy and tactics.: no description found


DiscoResearch ▷ #benchmark_dev (3 messages):

Links mentioned:

GitHub - mayflower/FastEval: Fast & more realistic evaluation of chat language models. Includes leaderboard.: Fast & more realistic evaluation of chat language models. Includes leaderboard. - mayflower/FastEval


DiscoResearch ▷ #discolm_german (18 messages🔥):

Links mentioned:


Datasette - LLM (@SimonW) ▷ #ai (1 messages):

dbreunig: This demo of stable diffusion xl lightning is blowing my mind: https://fastsdxl.ai/


Datasette - LLM (@SimonW) ▷ #llm (4 messages):

Links mentioned:

GitHub - simonw/llm-claude-3: LLM plugin for interacting with the Claude 3 family of models: LLM plugin for interacting with the Claude 3 family of models - simonw/llm-claude-3


Alignment Lab AI ▷ #looking-for-collabs (2 messages):


Alignment Lab AI ▷ #general-chat (1 messages):

Links mentioned:

AI in Production - AI strategy and tactics.: no description found


Skunkworks AI ▷ #general (3 messages):

Links mentioned:

AI in Production - AI strategy and tactics.: no description found


AI Engineer Foundation ▷ #general (3 messages):