Frozen AI News archive

not much happened today

**OpenAI** launched **GPT-4o finetuning** with a case study on Cosine. **Anthropic** released **Claude 3.5 Sonnet** with 8k token output. The **Microsoft Phi** team introduced **Phi-3.5** in three variants: Mini (3.8B), MoE (16x3.8B), and Vision (4.2B), noted for sample efficiency. **Meta** released **Llama 3.1 405B**, deployable on Google Cloud Vertex AI and offering GPT-4-level capabilities. **Qwen2-Math-72B** achieved state-of-the-art performance on math benchmarks and has a Gradio demo. Discussions included model comparisons such as ViT vs CNN and the Mamba architecture. Tool updates featured the **DSPy** roadmap, **Flux Schnell** improving diffusion speed on M1 Max, and **LangChain** community events. Research highlights included zero-shot DUP prompting for math reasoning and fine-tuning best practices. AI ethics coverage included California's AI Safety Bill SB 1047 and regulatory concerns from **Yann LeCun**. **Swyx** offered commentary on AI engineer roles. A *"Chat with PDF"* feature is now available for Box Enterprise Plus users.


AI News for 8/19/2024-8/20/2024. We checked 7 subreddits, 384 Twitters and 29 Discords (254 channels, and 2227 messages) for you. Estimated reading time saved (at 200wpm): 258 minutes. You can now tag @smol_ai for AINews discussions!

No main story, just little ones:

Since it's a quiet day, you can support AINews by checking out Box AI, who have kindly supported this week's issues!


[Sponsored by Box] You might have an app. It might have users. Those users might even store docs in Box. But Box AI lets your users query their docs right in the Content Preview UI Element!

Swyx commentary: "Chat with PDF" is now one React component and an API key away! Note it's only available to Box Enterprise Plus customers for now.

(previously with Box AI: Week 1, Week 2)


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

AI Model Developments and Benchmarks

AI Tools and Applications

AI Research and Techniques

AI Ethics and Regulation

AI Engineering Perspectives


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Large Language Model Releases and Deployment

Theme 2. Innovative AI Interfaces: Handwriting and Speech Recognition

All AI Reddit Recap

r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, /r/LLMDevs, /r/Singularity

AI Image Generation Advancements

AI Industry Developments

AI Ethics and Philosophy Discussions

Memes and Humor


AI Discord Recap

A summary of Summaries of Summaries by Claude 3.5 Sonnet

1. LLM Advancements and Benchmarking

2. Model Performance Optimization

3. Open-Source AI Developments

4. Multimodal AI and Vision Models

5. LLM Training and Fine-tuning Techniques


PART 1: High level Discord summaries

Unsloth AI (Daniel Han) Discord


CUDA MODE Discord


Nous Research AI Discord


Cohere Discord


OpenRouter (Alex Atallah) Discord


LM Studio Discord


Eleuther Discord


Perplexity AI Discord


LlamaIndex Discord


Latent Space Discord


Modular (Mojo 🔥) Discord


OpenAI Discord


Torchtune Discord


LangChain AI Discord


OpenInterpreter Discord


DSPy Discord


LAION Discord


DiscoResearch Discord


Interconnects (Nathan Lambert) Discord


The Alignment Lab AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The LLM Finetuning (Hamel + Dan) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Unsloth AI (Daniel Han) ▷ #general (91 messages🔥🔥):

  • Fine-tuning Llama-3.1-405B
  • Unsloth limitations
  • Hugging Face Space GPU
  • Free model access
  • Training Loss

Unsloth AI (Daniel Han) ▷ #off-topic (103 messages🔥🔥):

  • Perplexity (PPL)
  • Model Fine-tuning
  • JavaScript AST Walking

Unsloth AI (Daniel Han) ▷ #help (12 messages🔥):

  • Llama 3.1 Fine-Tuning
  • WSL Anaconda Installation
  • Mini Conda

Unsloth AI (Daniel Han) ▷ #showcase (1 message):

etherl: <@488399737884639242> no self promotions please :slothhug:


Unsloth AI (Daniel Han) ▷ #research (3 messages):

  • Lottery Ticket Adaptation
  • LoRAs
  • Finetuning
  • Catastrophic Forgetting

CUDA MODE ▷ #general (4 messages):

  • Llama Model on Mac

CUDA MODE ▷ #triton (8 messages🔥):

  • Triton Error
  • Triton kernel optimization
  • constexpr type

CUDA MODE ▷ #torch (73 messages🔥🔥):

  • Comfy FP16
  • FP8 Matmul
  • Stable-fast
  • Oneflow/Onediff
  • Flux

Link mentioned: hqq/hqq/kernels/hqq_aten_cuda_kernel.cu at master · mobiusml/hqq: Official implementation of Half-Quadratic Quantization (HQQ) - mobiusml/hqq


CUDA MODE ▷ #cool-links (2 messages):

  • CUDA matrix transpose
  • CUTLASS tutorial
  • 4090D with 48GB
  • FP8 support
  • bf16 testing



CUDA MODE ▷ #jobs (1 message):

  • Krish's Skillset
  • Krish's Job Search
  • Krish's Experience

Link mentioned: Krish_Rewanth_Resume.pdf: no description found


CUDA MODE ▷ #beginner (9 messages🔥):

  • CUDA Setup for VS Code
  • PyTorch CUDA errors
  • C++ CUDA code issues
  • VS Code c_cpp_properties.json
  • PyTorch Cpp Extension

Link mentioned: Nsight Visual Studio Code Edition: CUDA development for NVIDIA platforms integrated into Microsoft Visual Studio Code


CUDA MODE ▷ #youtube-recordings (1 message):

  • Composable Kernel
  • ROCm
  • Tile Programs
  • GPU Computing

Link mentioned: composable_kernel/example/91_tile_program at ck_tile_toy · ROCm/composable_kernel: Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators - ROCm/composable_kernel


CUDA MODE ▷ #torchao (81 messages🔥🔥):

  • Llama2 eval crashing
  • GPT-Fast eval script
  • HF_eval script
  • OOM for Llama2
  • Int4wo vs bf16 performance



CUDA MODE ▷ #llmdotc (3 messages):

  • H100 L2 cache optimization
  • memcpy optimization
  • cuMemAddressReserve
  • deterministic memory allocation

CUDA MODE ▷ #cudamode-irl (1 message):

evil666man: Would love to collaborate here!


Nous Research AI ▷ #off-topic (6 messages):

  • Hermes 3/405
  • Llama 3.1-instruct-405
  • Meta-Llama 405b
  • LLM Arena
  • Hermes 3 Launch

Link mentioned: LMSys Chatbot Arena Leaderboard - a Hugging Face Space by lmsys: no description found


Nous Research AI ▷ #interesting-links (2 messages):

  • pydantic-xml extension
  • Nous Aesthetics

Link mentioned: pydantic-xml: no description found


Nous Research AI ▷ #general (93 messages🔥🔥):

  • Llama 3.1 Minitron 4B Width Base
  • Hermes 3 Image Generation
  • Hermes 3 Amnesia Mode
  • Hermes 3 405B Function Calling
  • Hermes 3 Performance Differences

Link mentioned: nvidia/Llama-3.1-Minitron-4B-Width-Base · Hugging Face: no description found


Nous Research AI ▷ #ask-about-llms (7 messages):

  • Roleplay benchmark
  • Knowledge cutoff updates
  • Model pretraining

Cohere ▷ #discussions (78 messages🔥🔥):

  • OPRO Paper
  • C4AI Discord Invite
  • Cohere API Response Format
  • Cohere Classify Sunset
  • Reranker API on 10k Docs

Link mentioned: Cohere - Research Scholar: Why this role? Cohere For AI is the dedicated research arm of Cohere. The Cohere For AI research lab seeks to solve complex machine learning problems by supporting fundamental research that explores t...


Cohere ▷ #questions (16 messages🔥):

  • Cohere Sponsorship
  • PDF Abstract Extraction
  • Cohere API SSL Verification
  • Cohere API LangChain
  • Freelance Developer Team

OpenRouter (Alex Atallah) ▷ #announcements (1 message):

  • Hermes 3

Link mentioned: Hermes 3 70B Instruct - API, Providers, Stats: Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, rea...


OpenRouter (Alex Atallah) ▷ #general (84 messages🔥🔥):

  • GPT Functions
  • OpenRouter Model Support
  • German Pretraining
  • Mistral
  • Multilingual Models



LM Studio ▷ #general (45 messages🔥):

  • Uncensored models
  • LM Studio server issues
  • Llama 3.1
  • Speech to Text and Text to Speech
  • Vision models



LM Studio ▷ #hardware-discussion (36 messages🔥):

  • M2 Ultra
  • GPU performance
  • Nvidia 4090 vs 4060 Ti
  • Nvidia 48GB Card
  • LLM speed



Eleuther ▷ #general (6 messages):

  • GPT-4 Neuron Explanations
  • BlackboxNLP Paper
  • Language Models Explain Neurons

Eleuther ▷ #research (48 messages🔥):

  • Model Training
  • MiniPile Dataset
  • Frankenmerging
  • Model Merging
  • KANs



Eleuther ▷ #scaling-laws (4 messages):

  • Chinchilla vs Gopher Data Filtering

Eleuther ▷ #lm-thunderdome (21 messages🔥):

  • Llama 3.1 System Prompt
  • Llama Eval Chat Template
  • Hugging Face Chat Template
  • System Prompt in Hugging Face
  • YAML Parameters

Cutting Knowledge Date: December 2023
Today Date: 26 Jul 2024


Perplexity AI ▷ #general (56 messages🔥🔥):

  • Perplexity Pro Discord
  • Perplexity Pro Features
  • Perplexity Pro Search Bugs
  • Perplexity AI Models
  • Perplexity's Future



Perplexity AI ▷ #sharing (5 messages):

  • Perplexity Pro
  • LMSYS Arena
  • G1 Humanoid Robot



Perplexity AI ▷ #pplx-api (4 messages):

  • Camera quality
  • Discord issues

LlamaIndex ▷ #blog (3 messages):

  • LlamaIndex
  • RAG
  • Retrieval-Augmented Generation
  • LLMs in Production
  • Amazon Neptune

LlamaIndex ▷ #general (36 messages🔥):

  • LlamaIndex Hierarchical Node Parser
  • LlamaIndex Retrieval
  • LlamaIndex ChromaDB Vector Store
  • RAG Application with LlamaIndex
  • Connecting LlamaIndex to Private LLMs



Latent Space ▷ #ai-general-chat (22 messages🔥):

  • Latent Space podcast
  • Encoder-Decoder models
  • Fast.html
  • Saving state/Updating state
  • Fine-tuning vs RAG vs KV caching



Modular (Mojo 🔥) ▷ #general (8 messages🔥):

  • Mojo & MAX Update Cadence
  • Siamese Networks with Labels
  • Slice Custom Op Usage

Modular (Mojo 🔥) ▷ #mojo (12 messages🔥):

  • Mojo's List implementation
  • Mojo's ref keyword
  • Mojo's __lifetime_of function
  • AI Chip Performance
  • Network on Chip (NoC)

Modular (Mojo 🔥) ▷ #max (2 messages):

  • Modular installation issues
  • Modular Manifest Error
  • Modular Expiration

OpenAI ▷ #ai-discussions (19 messages🔥):

  • ChatGPT capabilities
  • AI Enthusiasm
  • Grok2
  • Smart Cookbook
  • Strawberry Release

OpenAI ▷ #prompt-engineering (1 message):

  • Structured Output
  • JSON Output
  • Model Performance
  • Prompt Engineering

OpenAI ▷ #api-discussions (1 message):

  • structured output
  • JSON mode

Torchtune ▷ #general (1 message):

wiiiktor.: When do you plan to release it, if I may ask? I see it's 99% ready.


Torchtune ▷ #dev (19 messages🔥):

  • torch.compile recompilations
  • torch.compile optimization
  • kv-cache for generation
  • rng generator object in torch.compile
  • torch.compile and custom masks

LangChain AI ▷ #general (18 messages🔥):

  • Llama 3.1 70B for SQL
  • Mistral 8k Limits
  • Model Merging Tactics
  • Open Empathic Project
  • LangChain SQLDatabaseChain



OpenInterpreter ▷ #general (16 messages🔥):

  • Accessibility Roundtable
  • Deepseek API
  • OI with Ollama
  • Poetry and PyTorch on Mac
  • Ollama on a different machine



OpenInterpreter ▷ #O1 (1 message):

  • OpenInterpreter Update

OpenInterpreter ▷ #ai-content (1 message):

notnaton: Latest episode from Tool Use 🚀 : https://www.youtube.com/watch?v=uAo513GIwoU


DSPy ▷ #show-and-tell (2 messages):

  • dspy-ai installation
  • ADAS and Function Calling
  • pickle5 compatibility
  • Python version

DSPy ▷ #general (9 messages🔥):

  • DSPy Finetuning
  • DSPy vs. LangChain/LlamaIndex
  • Aider v0.51.0 Changelog
  • Providing Feedback to DSPy Documentation
  • Multi-LoRA Setting



DSPy ▷ #colbert (1 message):

  • Late Interaction Models
  • Dense Embedding Models
  • Qdrant 1.10
  • ColBERT

Link mentioned: Any* Embedding Model Can Become a Late Interaction Model - If You Give It a Chance! - Qdrant: We discovered something interesting. Standard dense embedding models can perform surprisingly well in late interaction scenarios.


LAION ▷ #general (1 message):

  • LTXStudio new features

Link mentioned: Tweet from LTX Studio (@LTXStudio): 🎉 The wait is over 🎉 To celebrate we're launching FIVE new features to take your projects to the next level. Try them out yourself now 🔥


LAION ▷ #research (5 messages):

  • JPEG Encoding for Images
  • AR-Based Image Tokenization
  • VQ-VAE
  • Image Compression Limits
  • H.265/AV1 for Training

DiscoResearch ▷ #general (1 message):

  • GPT4All-Community
  • Leo models
  • Hugging Face
  • Model Card

Interconnects (Nathan Lambert) ▷ #memes (1 message):

xeophon.: https://x.com/bilawalsidhu/status/1825548322687574410?s=46






{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AINews, please share with a friend! Thanks in advance!

{% endif %}