Frozen AI News archive

Gemma 2 tops /r/LocalLlama vibe check

**Gemma 2 (9B, 27B)** is highlighted as a top-performing local LLM, praised for its speed, multilingual capabilities, and efficiency on consumer GPUs like the 2080 Ti. Commenters rate it above models like **Llama 3** and **Mistral 7B** on various tasks, including non-English text processing and reasoning. The /r/LocalLlama discussion reflects a strong preference for Gemma 2, with **18 mentions** compared to **10 mentions** for Llama 3 and **9 mentions** for Mistral. Other models like **Phi 3** and **Qwen** also received mentions but are considered surpassed by Gemma 2. Separately, **Andrej Karpathy** announced the launch of **Eureka Labs**, an AI+Education startup aiming to create an AI-native school with AI Teaching Assistants, starting with the **LLM101n** course to teach AI training fundamentals. This initiative is seen as a significant development in AI education.


AI News for 7/16/2024-7/17/2024. We checked 7 subreddits, 384 Twitters and 29 Discords (468 channels, and 2051 messages) for you. Estimated reading time saved (at 200wpm): 232 minutes. You can now tag @smol_ai for AINews discussions!

Every few months, someone asks a vibe check question in /r/LocalLlama that takes off (March 2024, June 2024 and the official Models Megathread are the previous ones).


A recent "best models for their size?" question is a chance to revisit the rankings. Last month's Gemma 2 (our coverage here) won handily, even without the 2B model, with 18 mentions against 10 for Llama 3 and 9 for Mistral.

Other positive mentions: DeepSeek, Cohere Command R, InternLM, Yi 34B (Nous-Capybara version)

Meta note: We are now splitting out /r/LocalLlama in our Reddit recaps because the other subreddits tend to drown out technical discussion. Enjoy!


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

All recaps done by Claude 3.5 Sonnet, best of 4 runs.

Andrej Karpathy's new AI+Education company Eureka Labs

New model releases

Discussions on model architectures and training data

Other notable updates


AI Reddit Recap

/r/LocalLlama

Theme 1. New Model Releases from Mistral AI and Apple

Theme 2. Llama 3 Performance and Limitations

Theme 3. Comparing Model Performance by Size

Theme 4. Debate on AI Hype vs. Long-term Potential

Across r/LocalLlama, r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, /r/LLMDevs, /r/Singularity

Comment crawling works now but has lots to improve!

Theme 1. Llama 3 Performance and Limitations

Theme 3. AI in Image and Video Generation

Theme 4. New AI Model Releases and Architectures

Theme 5. AI Regulation and Public Perception


AI Discord Recap

A summary of Summaries of Summaries

1. Advancements in AI Model Development and Deployment

2. Challenges and Innovations in AI Infrastructure

3. DeepSeek V2 Model Launch

4. New Multimodal Benchmarks


PART 1: High level Discord summaries

HuggingFace Discord


Unsloth AI (Daniel Han) Discord


LM Studio Discord


Modular (Mojo 🔥) Discord


Nous Research AI Discord


Eleuther Discord


Stability.ai (Stable Diffusion) Discord


CUDA MODE Discord


Perplexity AI Discord


OpenRouter (Alex Atallah) Discord


LAION Discord


Interconnects (Nathan Lambert) Discord


Latent Space Discord


LlamaIndex Discord


Cohere Discord


OpenAI Discord


LangChain AI Discord


OpenAccess AI Collective (axolotl) Discord


OpenInterpreter Discord


Torchtune Discord


tinygrad (George Hotz) Discord


LLM Finetuning (Hamel + Dan) Discord


AI Stack Devs (Yoko Li) Discord


MLOps @Chipro Discord


AI21 Labs (Jamba) Discord


The Alignment Lab AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The LLM Perf Enthusiasts AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The DiscoResearch Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

HuggingFace ▷ #general (286 messages🔥🔥):

  • Model Tokenization
  • CUDA Errors
  • VRAM Management for Large Models
  • Math Data Annotation Needs
  • Live Translation with Transformers



HuggingFace ▷ #today-im-learning (3 messages):

  • SciPy tutorial
  • Audio course on Huggingface
  • Real-time kernels and Raspberry Pi

Link mentioned: Intro to Scipy ( by Rauf ): SciPy is another data manipulation and scientific calculation library, similar to NumPy, but with some differences. It's another tool in your toolkit, allowi...


HuggingFace ▷ #cool-finds (4 messages):

  • Nbeats and NBeatsX paper
  • 3D shape generation with deep learning
  • Time series forecasting
  • ML applications in 3D geometry



HuggingFace ▷ #i-made-this (24 messages🔥):

  • AI Vtuber testing
  • Chilean touristic data
  • Phi-3 Vision for Mac
  • ML for 3D model reduction
  • Fast Subtitle Maker



HuggingFace ▷ #reading-group (6 messages):

  • Learning by Implementing Papers
  • Inception Model and ResNet
  • Implicit Representation

HuggingFace ▷ #computer-vision (3 messages):

  • Skin Cancer Classification Model
  • VQModel Pre-trained Weights
  • Attention Extraction from GhostNetV2

Link mentioned: AI-Portfoilio/SkinCancerClassification_CNN/SkinCancerClassification.ipynb at master · Matthew-AI-Dev/AI-Portfoilio: Contribute to Matthew-AI-Dev/AI-Portfoilio development by creating an account on GitHub.


HuggingFace ▷ #NLP (3 messages):

  • System requirements for stability AI model
  • Prompt engineering for video generation
  • Stable Video Diffusion Image-to-Video Model

Link mentioned: stabilityai/stable-video-diffusion-img2vid-xt · Hugging Face


Unsloth AI (Daniel Han) ▷ #general (164 messages🔥🔥):

  • Unsloth AI beta testing with NDA
  • Floating license for multi-GPU support
  • Andrej Karpathy's new LLM101n course
  • LoRA adapter support in llama.cpp
  • Fine-tuning vs. RAG in Llama-3



Unsloth AI (Daniel Han) ▷ #off-topic (10 messages🔥):

  • Codestral Mamba release
  • Mathstral release
  • Llama.cpp support issues
  • Google FLAMe 24B model
  • Llama 3 context detail



Unsloth AI (Daniel Han) ▷ #help (35 messages🔥):

  • Model Pre-training Issues
  • CUDA Compatibility for Unsloth
  • Fine-Tuning Challenges on Kaggle

Unsloth AI (Daniel Han) ▷ #showcase (1 message):

  • Ghost 8B Beta
  • Proprietary models: xAI Grok 1, OpenAI GPT 3.5, Mistral Mixtral 8x7B
  • Model evaluation: zero-shot method
  • Claude 2 and Claude 3
  • Playground with Ghost 8B Beta



Unsloth AI (Daniel Han) ▷ #research (50 messages🔥):

  • Memory Usage Optimization in Neural Networks
  • Discussion on AdamW-Mini Optimizer
  • Training Efficiency
  • Cost of Optimizer State
  • Strategies for Handling Multiple Tables in Excel

LM Studio ▷ #💬-general (83 messages🔥🔥):

  • Llama 3 8B on GPU
  • Mistral mamba code model
  • Troubleshooting Huge Text Files
  • Model Loading Issues
  • Codestral Mamba in LM Studio



LM Studio ▷ #🤖-models-discussion-chat (107 messages🔥🔥):

  • META 3 7B Q8 instruct
  • LLava 3 testing
  • LLM suggestions for micro decisions
  • DeepSeek-Coder V2-Lite issues
  • Fine-tuning models locally



LM Studio ▷ #🧠-feedback (14 messages🔥):

  • Gemma 2 support
  • Phi 3 small support
  • Llama.cpp support
  • Error loading model
  • Smol-lm pre-tokenizer issue

LM Studio ▷ #📝-prompts-discussion-chat (1 message):

pashtett: Any examples of optimal prompts and settings for Gemma 2 for a story-based RP chat?


LM Studio ▷ #🎛-hardware-discussion (4 messages):

  • GPU craftsmanship
  • aesthetics importance
  • GPU power plug

LM Studio ▷ #model-announcements (1 message):

  • Mathstral Announcement
  • STEM Specialization
  • GGUF Quantization

Link mentioned: lmstudio-community/mathstral-7B-v0.1-GGUF · Hugging Face


Modular (Mojo 🔥) ▷ #general (50 messages🔥):

  • Mojo Community Meeting
  • CFFI and C++ Interoperability
  • External Linking with DLOpen
  • Support Ticket System



Modular (Mojo 🔥) ▷ #ai (5 messages):

  • Object detection in videos
  • AWS EC2 instances
  • Mojo data types

Modular (Mojo 🔥) ▷ #mojo (51 messages🔥):

  • Mojo 🔥 Community Meeting
  • Mojo language keywords
  • Installing old Mojo versions
  • SIMD primitives references
  • Looping through Tuple in Mojo



Modular (Mojo 🔥) ▷ #performance-and-benchmarks (3 messages):

  • Difference between parallelize and sync_parallelize
  • Memory management improvement

Modular (Mojo 🔥) ▷ #max (12 messages🔥):

  • Modular installation issues
  • MNIST accuracy discrepancy
  • User experience improvements
  • Verbose reporting for MAX

Modular (Mojo 🔥) ▷ #nightly (4 messages):

  • Inline functions in Mojo
  • SIMD optimization suggestions
  • New Mojo nightly release
  • Mojo nightly changelog updates

Modular (Mojo 🔥) ▷ #mojo-marathons (71 messages🔥🔥):

  • Mojo utilizing cores
  • NumPy performance
  • Benchmarking
  • BLAS backends
  • Intel MKL vs. other BLAS

Nous Research AI ▷ #research-papers (1 message):

  • DataComp for Language Models (DCLM)
  • DCLM-Baseline-7B
  • MMLU Benchmark
  • OpenLM framework
  • Dataset design importance



Nous Research AI ▷ #datasets (2 messages):

  • Need for Math Annotators in AI
  • Replete-AI Multilingual Translation Dataset

Link mentioned: Replete-AI/Multi-lingual_Translation_Instruct · Datasets at Hugging Face


Nous Research AI ▷ #off-topic (4 messages):

  • Oxen.AI Paper Club
  • Representation Finetuning
  • Comparison to repeng/vector steering

Link mentioned: Oxen.ai · Events Calendar: View and subscribe to events from Oxen.ai on Luma. Build World-Class AI Datasets, Together. Track, iterate, collaborate on, & discover data in any format.


Nous Research AI ▷ #interesting-links (9 messages🔥):

  • Lunar Caves
  • Belief State Geometry in Transformers
  • Tool Use Models
  • LLM-driven Digital Agents



Nous Research AI ▷ #general (154 messages🔥🔥):

  • Hermes 2.5 vs Hermes 2 performance
  • Challenges extending Mistral
  • Model experimentation
  • Tool calling implementation
  • Function calling issues



Nous Research AI ▷ #ask-about-llms (19 messages🔥):

  • Tokenization with Tiktoken
  • Beam Search Implementation in Huggingface Pipelines
  • Invertibility of BPE in Tiktoken
  • Custom Sampling in Huggingface Pipelines

Nous Research AI ▷ #world-sim (3 messages):


Eleuther ▷ #general (60 messages🔥🔥):

  • The Pile 2
  • Proof-Pile-2
  • YouTube video scraping controversy
  • Public reaction to YouTube data usage
  • Transparency in AI data usage



Eleuther ▷ #research (89 messages🔥🔥):

  • Efficient Attention mechanisms
  • Transformer optimizations
  • Reformer: The Efficient Transformer
  • LSH attention in practice
  • PLMs for immune escape



Eleuther ▷ #interpretability-general (4 messages):

  • Arrakis library
  • Mechanistic interpretability tools
  • Feedback request



Eleuther ▷ #lm-thunderdome (3 messages):

  • HF leaderboard MuSR score
  • Leaderboard maintainers query

Stability.ai (Stable Diffusion) ▷ #general-chat (143 messages🔥🔥):

  • Model Size & Hardware
  • Training Techniques
  • Prompting Nuances
  • Outpainting Techniques
  • Troubleshooting



CUDA MODE ▷ #general (26 messages🔥):

  • CUDA kernel call errors
  • Template types in CUDA
  • cudaMallocManaged overhead
  • Unified memory usage in CUDA
  • Deep learning specialization opinions



CUDA MODE ▷ #torch (33 messages🔥):

  • PyTorch Profiler Performance
  • Thunder vs Torch Compile
  • Nvfuser vs Triton
  • Kernel Compilation
  • Runtime Optimization

CUDA MODE ▷ #beginner (7 messages):

  • Mixing CUDA kernels with PyTorch
  • Writing custom CUDA kernels
  • Automatically generated Python bindings
  • Compiling custom kernels

CUDA MODE ▷ #torchao (13 messages🔥):

  • unwrap_tensor_subclass issues
  • AQT Tensor instance
  • FakeTensor attribute error
  • PyTorch nightly build
  • GitHub issue

Link mentioned: unwrap_tensor_subclass and nested tensor subclasses issue · Issue #515 · pytorch/ao: I'm noticing strange behavior when trying to create a tensor_subclass which holds another tensor_sub class. Here is a minified repro: (add this to the bottom of torchao/dtypes/affine_quantized_ten...


CUDA MODE ▷ #triton-puzzles (12 messages🔥):

  • Notation in Triton Puzzle 6
  • ImportError in Triton
  • Efficient Softmax Implementation
  • Assignment Operator in Triton

Link mentioned: Demystify OpenAI Triton: Learn how to build mapping from OpenAI Triton to CUDA for high-performance deep learning apps through step-by-step instructions and code examples.


CUDA MODE ▷ #llmdotc (3 messages):

  • RAG-GPT4 TA implementation at UIUC
  • Student interaction challenges
  • Optimizing GPT-2 kernels

CUDA MODE ▷ #huggingface (2 messages):

  • Improving ML Systems in HuggingFace Ecosystem

Perplexity AI ▷ #general (50 messages🔥):

  • Quality Control Issues
  • Captcha Implementation
  • Copying Code Blocks
  • API Rate Limits
  • Paid Subscription Credits

Link mentioned: Discover Typeform, where forms = fun: Create a beautiful, interactive form in minutes with no code. Get started for free.


Perplexity AI ▷ #sharing (11 messages🔥):

  • In-Batch Search
  • Moon's Hidden Refuge
  • Music Streaming Platforms
  • Best TV Shows May 2024
  • Best Resources for 12-year-olds



Perplexity AI ▷ #pplx-api (1 message):

  • search_domain_filter
  • API Beta

OpenRouter (Alex Atallah) ▷ #general (60 messages🔥🔥):

  • Error Code 524
  • Meta 405B Model Pricing
  • DeepSeek Coder Speed Issues
  • Fast and Affordable Models on OpenRouter
  • WordPress Plugin Issues



LAION ▷ #general (54 messages🔥):

  • Hacker attacks on LAION
  • ComfyUI malware
  • Disney hack data leak
  • Fake job candidates
  • Telecom failures after Hurricane Sandy



LAION ▷ #research (5 messages):

  • InternVL2-Llama3-76B
  • Manifold Research Group
  • LLM-based Autonomous Agents
  • Research Log #041
  • MultiNet Dataset



Interconnects (Nathan Lambert) ▷ #events (1 message):

natolambert: Anyone at ICML? A VC friend of mine wants to meet my friends at a fancy dinner


Interconnects (Nathan Lambert) ▷ #news (8 messages🔥):

  • Prover-Verifier Games
  • Project Strawberry
  • Legibility Tax
  • RL and Model Properties

Interconnects (Nathan Lambert) ▷ #ml-questions (6 messages):

  • SmolLM blog post
  • DPO dataset usage
  • Model family sizes

Interconnects (Nathan Lambert) ▷ #ml-drama (10 messages🔥):

  • Lobbying and vested interests
  • AI legislation polling issues
  • Public perception of AI tools

Link mentioned: Tweet from Matt Popovich (@mpopv): feels like if you are heavily lobbying for, and soliciting donations to lobby for, a certain legislative bill, you should probably disclose that you secretly own a company positioned to profit from th...


Interconnects (Nathan Lambert) ▷ #random (17 messages🔥):

  • GPT-4o vs Llama 405 tokenizers
  • GPT-4o tokenizer performance
  • Llama 3 initial release insights
  • Google Gemini 2.0 accident
  • DeepSeek's open-source stance



Interconnects (Nathan Lambert) ▷ #memes (3 messages):

  • Billboard AI in IPA
  • Dell's comeback
  • Mathstral MMLU Breakdown



Interconnects (Nathan Lambert) ▷ #rlhf (2 messages):

  • Policy Loss Function Discussion
  • Degenerate Case in DPO-like Algos

Interconnects (Nathan Lambert) ▷ #reads (3 messages):

  • Sampling Methods in Policy Models
  • Preference Pair Selection
  • Zephyr Paper Criticism
  • Nemotron Paper Insights
  • DPO Objective Challenges



Latent Space ▷ #ai-general-chat (46 messages🔥):

  • Science HumanEval benchmark
  • SmolLM models
  • LangChain pain points
  • SF Compute fundraising
  • Exa AI Lab Series A



Latent Space ▷ #ai-announcements (1 message):

  • AI Agents That Matter
  • Latent Space Meetups
  • Calendar Integration

Link mentioned: LLM Paper Club (AI Agents That Matter) · Zoom · Luma: @shivdinho is leading us through AI Agents That Matter: https://arxiv.org/abs/2407.01502 For future weeks, we need YOU to volunteer to do rapid-fire recaps and…


LlamaIndex ▷ #blog (4 messages):

  • LlamaIndex Introduction
  • LlamaParse Improvements
  • Multi-Agent Tree System
  • AI Consulting Services
  • Scaleport AI Case Study



LlamaIndex ▷ #general (39 messages🔥):

  • Vector search with images
  • Metadata in ToolMetaData
  • Property graph issues in Neo4J
  • Query-time metadata filters
  • Troubleshooting CSV data with VectorStoreIndex



Cohere ▷ #general (35 messages🔥):

  • Amazon order discussion
  • CrunchCup product feedback
  • C4A Community Talks
  • Roger Grosse session
  • Recording of community events



OpenAI ▷ #ai-discussions (16 messages🔥):

  • Custom Chatbots and Fine-tuning
  • Moderation Models for Chatbots
  • Expensive Pricing for Detection Services
  • Voice Extraction from Podcasts

OpenAI ▷ #gpt-4-discussions (11 messages🔥):

  • GPTs Agents
  • Banning Issues
  • PUT Actions for Custom GPTs
  • Vector Store Embedding Issues
  • Exceeded API Quota

OpenAI ▷ #prompt-engineering (3 messages):

  • WebSurferAgent setup issues
  • Role-playing within ChatGPT

OpenAI ▷ #api-discussions (3 messages):

  • WebSurferAgent issues
  • Autogen
  • Internet search guidelines for technologies
  • Character roleplay templates

LangChain AI ▷ #general (23 messages🔥):

  • Viral video creation tools
  • LangChain document conversion
  • LangChain contributions
  • LangChain and Qdrant
  • Hybrid search with MongoDB



LangChain AI ▷ #share-your-work (2 messages):

  • Generative AI Assistants
  • MongoDB Hybrid Search Integration with LangChain

Link mentioned: Hannah: An application that uses generative AI to query your own custom documents.


LangChain AI ▷ #tutorials (1 message):

  • MongoDB as Vector Store
  • Hybrid Search with LangChain

OpenAccess AI Collective (axolotl) ▷ #general (15 messages🔥):

  • mistral mamba
  • Codestral Mamba release
  • Mathstral
  • Galore configs
  • ChatML

Link mentioned: Codestral Mamba: As a tribute to Cleopatra, whose glorious destiny ended in tragic snake circumstances, we are proud to release Codestral Mamba, a Mamba2 language model specialised in code generation, available under ...


OpenAccess AI Collective (axolotl) ▷ #general-help (6 messages):

  • Rank and Overfitting
  • Learning Rate and Overfitting
  • Overfitting Solutions

OpenInterpreter ▷ #general (8 messages🔥):

  • My Friend V1 initial feedback
  • AI Friend's transcriptions privacy with Open Interpreter
  • FRIEND + OI potential collaboration
  • Open Interpreter compatibility with M3 Mac
  • Task allocation and collaboration in roadmap

Link mentioned: Tweet from JediCat (@ParallaxAngle): @kodjima33 First impression of My Friend, V1 :: in voice of Her :: "Nik, it's smaller than I expected." LOVE LOVE LOVE my Friend. Congrats to you and the Based Hardware team. Can'...


OpenInterpreter ▷ #O1 (4 messages):

  • Receiving 01 hardware
  • Usage instructions for 01
  • Relation between Open Interpreter and 01

Torchtune ▷ #announcements (1 message):

  • Torchtune v0.2.0 release
  • New models and recipes
  • Sample packing
  • Community contributions

Link mentioned: Release v0.2.0 · pytorch/torchtune: Overview It’s been awhile since we’ve done a release and we have a ton of cool, new features in the torchtune library including distributed QLoRA support, new models, sample packing, and more! Chec...


Torchtune ▷ #dev (10 messages🔥):

  • LLAMA 3 finetuning issues
  • Torchtune nightly installations
  • Stable Torchtune version



tinygrad (George Hotz) ▷ #general (1 message):

terafo: It's available now


tinygrad (George Hotz) ▷ #learn-tinygrad (2 messages):

  • Notes on removal of linearizer
  • Message format clarification

LLM Finetuning (Hamel + Dan) ▷ #general (2 messages):

  • OpenAI access requirements
  • LLM rule checker for hospital bills
  • Python code for billing rules

LLM Finetuning (Hamel + Dan) ▷ #hugging-face (1 message):

  • Channel Activity
  • User Engagement

AI Stack Devs (Yoko Li) ▷ #team-up (1 message):

  • Developer Opportunities
  • HLS and WebRTC
  • Backend Development
  • TypeScript
  • MongoDB



MLOps @Chipro ▷ #events (1 message):

  • Phoenix 2.0
  • OSS Discussion
  • New Features in Phoenix
  • Arize Product Stack

Link mentioned: Phoenix 2.0 Launch Week Town Hall: July 18th, 2024, 10:00am PST – 11:00am PST, Virtual. Come join us as we cap off the Launch Week of Phoenix 2.0. In this town hall, we’ll cover...


AI21 Labs (Jamba) ▷ #announcements (1 message):

  • Python SDK updates
  • Async client support
  • Jamba-Instruct examples

Link mentioned: GitHub - AI21Labs/ai21-python: AI21 Python SDK: AI21 Python SDK. Contribute to AI21Labs/ai21-python development by creating an account on GitHub.





{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AINews, please share with a friend! Thanks in advance!

{% endif %}