Frozen AI News archive

Not much happened today.

**Meta** introduced **Meta 3D Gen**, a system for end-to-end generation of 3D assets from text in under 1 minute, producing high-quality 3D assets with detailed textures. **Perplexity AI** updated Pro Search to handle deeper research with multi-step reasoning and code execution. **Microsoft** improved **Phi-3 Mini** with better long-context understanding and instruction following. **GPT4All 3.0** launched with support for thousands of models and major OS compatibility, featuring local file chat. **Yi-Large** model launched on Fireworks AI Playground. Research highlights include the evolution of **reinforcement learning from human feedback (RLHF)**, persona-driven data synthesis using a billion diverse personas, meta-tuning for few-shot generalization, and steering vectors for model behavior control. Tools updates include **LangSmith** improving memory retrieval and **Qdrant Engine v1.10** adding universal query API and multivector search.

Canonical issue URL

AI News for 7/2/2024-7/3/2024. We checked 7 subreddits, 384 Twitters and 30 Discords (418 channels, and 2896 messages) for you. Estimated reading time saved (at 200wpm): 341 minutes. You can now tag @smol_ai for AINews discussions!

Arvind Narayanan et al. published a paper arguing that agent papers are mostly not reproducible and ignore cost; Meta published a text-to-3D-assets model; Magic.dev and Poolside are code-model companies seeking unicorn rounds; OpenDevin is now a company; Kyutai released a real-time audio LLM that maybe doesn't work as advertised; Peter Thiel backed some AGI blockchain thing; and The New Stack published one and two writeups of AIEWF.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3 Opus, best of 4 runs. We are working on clustering and flow engineering with Haiku.

AI Model Releases and Updates

Research Papers and Techniques

Frameworks and Tools

Discussions and Perspectives


AI Reddit Recap

Across r/LocalLlama, r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, r/LLMDevs, and r/Singularity. Comment crawling now works, but there is still plenty of room to improve!

AI Models & Techniques

AI Video & Animation

AI Ethics & Societal Impact

Miscellaneous


AI Discord Recap

A summary of Summaries of Summaries

  1. Real-Time AI Models Steal the Spotlight:

    • Kyutai Labs launched Moshi, a 7B multimodal model for real-time text and audio generation with 160ms response times. Excitement centered on its open-source availability and rapid (albeit a bit robotic) interactions; it was showcased during a demo session, with plans to address minor bugs.
    • The Phi-3 Mini model received a major update akin to a 3.5 Mini, with upcoming Gemma 2 support, but users noted startup issues reflecting the integration challenges of cutting-edge AI tools.
  2. Optimizing AI Deployment and Memory Management:

    • Extensive discussions around Colab and Kaggle notebooks shared best practices for memory management, using methods like gc.collect() and torch.cuda.empty_cache(). Scaling LoRA rank to dataset size was also debated, with an emphasis on efficient resource handling.
    • Gemma 2 support enhancements for tools like Unsloth and LM Studio improve finetuning speed significantly, with Unsloth achieving 2x faster finetuning and 63% less memory usage, while LM Studio’s 0.2.27 update solved compatibility issues on Mac, Windows, and Linux.
  3. Innovations in AI Model Training and Fine-Tuning:

    • QLoRA was highlighted for its efficient finetuning of quantized LLMs, enabling finetuning of 65B parameter models on 48GB GPUs with near 16-bit precision performance using 4-bit quantization, as detailed in the QLoRA paper.
    • Members delved into optimizing CUDA operations with tools like DeepSpeed and Inductor backend for Nvidia, focusing on autotuning GEMM backends and troubleshooting torch.cuda.OutOfMemoryError, reinforcing the importance of hardware-informed optimizations.
  4. Privacy, Security, and Ethical Considerations in AI:

    • Concerns over data policy enforcement led to critical discussions of OpenAI's GPT-4 subscription pricing and sporadic model-parameter adjustments affecting user experience. Issues like dataset removal over minor policy breaches sparked debates on enforcement consistency versus user needs.
    • Discussions on anti-AI art software like Glaze and Nightshade raised ethical questions about balancing copyright protection and technological progress, highlighting community frustrations over potential circumvention of protective tools.
  5. Community Tools, Tutorials, and Collaboration:

    • Users shared various open-source tools and tutorials, such as creating custom pipelines with Transformers and Gradio apps for role-play prompts, fostering collaborative learning and practical implementation.
    • Docker image development for AI tools like AI Town saw active community participation, focusing on simplifying setup processes and ensuring compatibility with various platforms via detailed PRs and documentation submissions on GitHub.
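The memory-management tips in item 2 above can be sketched as a small notebook helper (a sketch under assumptions: `free_memory` is a hypothetical name, and the PyTorch calls are skipped gracefully when `torch` or CUDA is unavailable):

```python
import gc

def free_memory():
    """Reclaim notebook memory after deleting large objects (e.g. a model)."""
    collected = gc.collect()  # reclaim unreachable Python objects on the host
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()  # return cached CUDA blocks to the driver
    except ImportError:
        pass  # CPU-only environment: garbage collection alone suffices
    return collected

# Simulate dropping a large object, then reclaim its memory.
big = [bytearray(1_000_000) for _ in range(100)]  # ~100 MB of throwaway data
del big
print(free_memory() >= 0)  # prints True: gc.collect() returns a non-negative count
```

The usual pattern in a notebook is `del model, optimizer` followed by a call to this helper before loading the next checkpoint, so the GPU allocator actually releases the old weights.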

PART 1: High level Discord summaries

Unsloth AI (Daniel Han) Discord


OpenAI Discord


LM Studio Discord


HuggingFace Discord


Eleuther Discord


Perplexity AI Discord


CUDA MODE Discord


Stability.ai (Stable Diffusion) Discord


OpenRouter (Alex Atallah) Discord


Latent Space Discord


LAION Discord


tinygrad (George Hotz) Discord


LlamaIndex Discord


Nous Research AI Discord


Modular (Mojo 🔥) Discord


Interconnects (Nathan Lambert) Discord


LangChain AI Discord


Mozilla AI Discord


Torchtune Discord


Cohere Discord


OpenInterpreter Discord


OpenAccess AI Collective (axolotl) Discord


AI Stack Devs (Yoko Li) Discord


LLM Finetuning (Hamel + Dan) Discord


LLM Perf Enthusiasts AI Discord


Datasette - LLM (@SimonW) Discord


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Unsloth AI (Daniel Han) ▷ #general (201 messages🔥🔥):

  • SEQ_CLS support in unsloth
  • Using jellyfin as an alternative to Plex
  • Fine-tuning models on another language
  • Sharing Colab notebooks and account information
  • VRAM requirements for LORA fine-tuning

Links mentioned:


Unsloth AI (Daniel Han) ▷ #announcements (1 messages):

  • Gemma 2 release
  • Phi 3 mini update
  • Finetuning improvements
  • Increased context lengths
  • Notebooks and 4-bit models

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (3 messages):

  • mahiatlinux's comment
  • theyruinedelise's reaction
  • response from mahiatlinux

Unsloth AI (Daniel Han) ▷ #help (63 messages🔥🔥):

  • Llama.cpp quantization issues
  • Loading model speed on Colab
  • Unsloth compatibility with Gemma 2
  • Inference on CPU after fine-tuning with Unsloth
  • Training issues with Huggingfaces SFTTrainer and Unsloth

Links mentioned:


Unsloth AI (Daniel Han) ▷ #community-collaboration (241 messages🔥🔥):

  • Tutorial/Intermediate Colab/Kaggle notebook with more dataset support
  • Improvements and suggestions for the community notebook
  • Memory management and optimization techniques for notebooks
  • Text classification optimized notebook for Unsloth
  • Secret management in Docker and application deployment

Links mentioned:


Unsloth AI (Daniel Han) ▷ #research (10 messages🔥):

  • Issues with using Unsloth in a local system
  • Gemma2 update release timeline
  • Support for the latest Gemma
  • Discussion about Gemma
  • Evaluation of Java with PHI

OpenAI ▷ #ai-discussions (194 messages🔥🔥):

  • Issues with OpenAI GPT-4 subscription and performance
  • AI21's Jamba model announcement and discussion
  • User experiences with AI for coding and programming
  • Live and open-source AI models debate
  • AI for real-time conversations: Moshi demo

Links mentioned:


OpenAI ▷ #gpt-4-discussions (11 messages🔥):

  • New TTS model voices availability
  • Data policy enforcement
  • Running ChatGPT in command prompt with Google search capability
  • Subscription pricing frustrations
  • Nested GPTs functionality

OpenAI ▷ #prompt-engineering (117 messages🔥🔥):

  • Issues with GPT models not answering questions as expected
  • Difficulties with creating PDF documents using GPT APIs
  • Improving prompt engineering for better task performance
  • Challenges with AI-driven image generation using DALL-E
  • Developing an employee recognition program using AI prompts

OpenAI ▷ #api-discussions (117 messages🔥🔥):

  • GPT performance issues
  • Improving prompt structure and attention control
  • Converting documents into PDF with product tags
  • Enhancing an AI icon generator
  • Developing an employee recognition program

LM Studio ▷ #💬-general (166 messages🔥🔥):

  • Discussing technical issues and updates related to LM Studio.
  • Comparing different AI models like Gemma 2, Llama 3, and Mistral.
  • Enhancements and bugs in the Gemma 2 model including tokenizer and attention mechanisms.
  • Difficulty and recommendations for fine-tuning LLMs using LM Studio.
  • Best models and configurations for different hardware setups.

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (52 messages🔥):

  • dolphin-vision compatibility in LM Studio
  • Gemma 2 model performance and issues
  • System and hardware requirements for running large models
  • RP stress testing for AI models
  • Code generation capabilities of Gemma 2

LM Studio ▷ #announcements (1 messages):

  • LM Studio 0.2.27 Release
  • Improved Gemma 2 Support
  • Bug Fixes in lmstudio.js
  • Advanced Information on lmstudio.js

Links mentioned:


LM Studio ▷ #🧠-feedback (11 messages🔥):

  • Basic functionality issue with drive installation
  • User confusion on model insertion
  • Scaling issues on 1080p monitors
  • Unsupported architecture message in new models
  • User feedback on improving LM Studio interfaces

LM Studio ▷ #📝-prompts-discussion-chat (3 messages):

  • Prompting Llama 3 70B to remove conversational lines
  • Issue with prompt results on Llama 3 compared to Qwen2 72B
  • Creation of a prompt tool with Gradio app for role-play and character immersion

Link mentioned: System Roleplay Generator - a Hugging Face Space by xtreme86: no description found


LM Studio ▷ #🎛-hardware-discussion (3 messages):

  • Energy consumption of LMS
  • Hardware issues on Linux vs Windows
  • GPU usage comparison of LMS with other software
  • User experiences with different GPU setups
  • Potential future fixes for LMS energy consumption

LM Studio ▷ #amd-rocm-tech-preview (14 messages🔥):

  • Gemma 2 loading issue
  • ROCM GPU compatibility and performance
  • Linux ROCm extension pack testing

LM Studio ▷ #model-announcements (1 messages):

  • Gemma 2 model update on Huggingface
  • Compatibility updates for Gemma 2 models

LM Studio ▷ #🛠-dev-chat (70 messages🔥🔥):

  • gpuOffload value discussion
  • bot configuration issues with TypeScript and Discord.js

Links mentioned:


HuggingFace ▷ #announcements (1 messages):

  • New fine-tunes for Transformers models with KerasNLP
  • Experimental API for searching HF datasets by column names
  • Transformers 4.42 release with new features and models
  • Nearly 100k public models on HF Hub storing tensorboard logs
  • Local Gemma release

Links mentioned:


HuggingFace ▷ #general (236 messages🔥🔥):

  • Joining Hugging Face Discord Community
  • Adept Strategy Shift & Co-Founders Joining Amazon
  • AI Models for Text and Image Processing
  • Performance and Accuracy of Hugging Face Models
  • Suggestions for ML Certifications

Links mentioned:


HuggingFace ▷ #today-im-learning (7 messages):

  • advanced resources for CNN topics like ViT and Unets
  • request for more tutorials on torch.distributed
  • Gradio app for role-play and character immersion prompt creation
  • TIL about '|' and '&' operators for Sets and Dicts in Python
  • question about Bayes' theorem in French

Links mentioned:


HuggingFace ▷ #cool-finds (8 messages🔥):

  • attention mechanism
  • transformer architecture
  • compatibility in shells
  • sequence transduction models
  • demo video reactions

Links mentioned:


HuggingFace ▷ #i-made-this (7 messages):

  • OpenAI's CriticGPT Release
  • Stable Release of Embodied Agents Toolkit
  • Open Source OCR for Kazakh Language
  • Blog on Reinforcement Learning Specialization
  • Zero-Shot Generating Spatial Sound from Images

Links mentioned:


HuggingFace ▷ #reading-group (4 messages):

  • Highway Net vs ResNet performance
  • Gradient vanishing problem in LSTM
  • Multi-branch structure inspiration from LSTM
  • Pre-trained models and fine-tuning techniques
  • topicSummaries

Link mentioned: FLoRA: Low-Rank Core Space for N-dimension: Adapting pre-trained foundation models for various downstream tasks has been prevalent in artificial intelligence. Due to the vast number of tasks and high costs, adjusting all parameters becomes unfe...


HuggingFace ▷ #computer-vision (6 messages):

  • ADVANCED_CNN_RESOURCES
  • Neighborhood_Attention_Transformer_usage
  • Developer_Job_Openings
  • MaskFormer_training_issues
  • Lightweight_AI_for_programming

Link mentioned: Neighborhood Attention Transformer: no description found


HuggingFace ▷ #NLP (9 messages🔥):

  • Custom pipeline creation
  • Text summarization models with high max input token length
  • Performance of open-source models vs. ChatGPT
  • Challenges in downloading and using Meta LLaMA
  • Inference freezing issues with Mistral model

Link mentioned: llm-course/transformers/custom-pipeline.md at main · andysingal/llm-course: Contribute to andysingal/llm-course development by creating an account on GitHub.


HuggingFace ▷ #diffusion-discussions (1 messages):

  • Discussion on running RealVisXL V4.0 Lightning model with diffusers.
  • Comparison of quality between A1111 and diffusers.
  • Support on Boosty.
  • Recommended negative prompt and generation parameters.
  • Issues with model performance during training phase.

Link mentioned: SG161222/RealVisXL_V4.0_Lightning · Hugging Face: no description found


Eleuther ▷ #general (93 messages🔥🔥):

  • GPT-4 parameter discussion
  • Nvidia involvement in GPT-4 development and leaks
  • Mixture of Experts (MoE) models
  • InstructGPT efficiency
  • Discord server scraping and ToS violations

Link mentioned: AI News: We summarize top AI discords + AI reddits + AI X/Twitters, and send you a roundup each day! See archive for examples. "Highest-leverage 45 mins I spend everyday" - Soumith "best AI new...


Eleuther ▷ #research (76 messages🔥🔥):

  • UL2 vs traditional training objectives
  • Starcoder2 and UL2 performance
  • PrefixLM and training implications
  • Scaling laws and learning rate schedules
  • FIM and UL2 comparisons

Links mentioned:


Eleuther ▷ #scaling-laws (22 messages🔥):

  • Discrepancy in compute optimal scaling laws between Kaplan et al. and Hoffmann et al.
  • Kaplan et al.'s last layer computational cost, warmup duration, and scale-dependent optimizer tuning
  • Attention flops and the 6ND approximation in scaling law computations
  • PyTorch flop counter utility and FLOPs calculation methodologies
  • Chinchilla paper scaling law and extrapolation issues

Links mentioned:


Eleuther ▷ #interpretability-general (2 messages):

  • EAP with integrated gradients
  • Methods for discovering and applying sparse feature circuits
  • Generalization improvement using SHIFT
  • Scalable interpretability pipeline for sparse feature circuits

Link mentioned: Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models: We introduce methods for discovering and applying sparse feature circuits. These are causally implicated subnetworks of human-interpretable features for explaining language model behaviors. Circuits i...


Eleuther ▷ #lm-thunderdome (26 messages🔥):

  • PR confirmation and lm-eval reference
  • Loglikelihood_rolling functionality and usage
  • Handling document length longer than model's context in perplexity evaluations
  • Errors in model evaluation with specific configurations
  • Preprocessing functions and pipeline consistency

Links mentioned:


Perplexity AI ▷ #general (174 messages🔥🔥):

  • Discussion About Trying Gemini 1.5 Pro
  • Access Issues with GPT4o
  • New Perplexity Features on Mobile
  • Refund Process for Pro Subscription
  • Concerns About Perplexity's Live Internet Access

Link mentioned: gateway/cookbook/integrations/Phidata_with_ Perplexity.ipynb at main · Portkey-AI/gateway: A Blazing Fast AI Gateway. Route to 200+ LLMs with 1 fast & friendly API. - Portkey-AI/gateway


Perplexity AI ▷ #sharing (9 messages🔥):

  • Lean Canvas Guide
  • Starting Perplexity AI Story
  • Building a Blackbox
  • OpenSSH Query
  • Sober Living in Echo Park

Perplexity AI ▷ #pplx-api (7 messages):

  • Usage of Sonnet 3.5 with Perplexity API
  • Availability of Sonnet in Perplexity API
  • List of available models in Perplexity API
  • Search engine usage via Perplexity API
  • Issues with llama-3-sonar-large-32k-online model

Link mentioned: Supported Models: no description found


CUDA MODE ▷ #general (19 messages🔥):

  • CUDA-only hackathon at the AGI House in San Francisco
  • Meta Hacker Cup 2024 schedule
  • Discussion about the price and purchase of NVIDIA GPUs (3090, 4090)

Links mentioned:


CUDA MODE ▷ #torch (14 messages🔥):

  • Steps involved in compiling a function in Pytorch with Inductor backend for Nvidia device
  • Difference between triton IR and MLIR
  • John Carmack's positive feedback on PyTorch team and contributing to open source
  • Issue with forcing Inductor to generate Triton kernels for GEMM and Conv

Links mentioned:


CUDA MODE ▷ #cool-links (8 messages🔥):

  • High-performance matrix multiplication on CPU
  • 3D V-Cache performance on AMD Ryzen
  • Difference between 3D and non-3D Ryzen chips
  • Discussion on specialization of 3D V-Cache chips
  • Simulation benchmarks for CPUs

Links mentioned:


CUDA MODE ▷ #torchao (31 messages🔥):

  • Loading a buffer containing int4 using torchao
  • Saving a tensor into a safetensors file
  • Dequantizing tensors using torchao
  • Handling packed int4 arrays in Python
  • torchao's handling of unexpected keyword arguments

Link mentioned: Gemini-Nano/playground/converter.py at master · ethanc8/Gemini-Nano: Contribute to ethanc8/Gemini-Nano development by creating an account on GitHub.


CUDA MODE ▷ #llmdotc (84 messages🔥🔥):

  • Memory efficiency comparison with PyTorch
  • Visualization of model weights and training issues
  • New GitHub PRs and bug fixes
  • Experiments with and observations on muP
  • Schedule-free optimization discussion

Links mentioned:


Stability.ai (Stable Diffusion) ▷ #general-chat (145 messages🔥🔥):

  • Anti-AI art software debate
  • Tips for low-resolution pixel art training
  • Job postings on Discord
  • Improving prompt techniques and comparisons
  • Comparisons between SD models *MixofExperts* and segmoe

Links mentioned:


OpenRouter (Alex Atallah) ▷ #announcements (1 messages):

  • Big update to /models page
  • Changing Google Token Sizes for Gemini and PaLM models
  • Deprecation of Default Model in settings page
  • Deprecation of custom auth headers for OpenAI API keys

OpenRouter (Alex Atallah) ▷ #app-showcase (3 messages):

  • Quick and dirty wrapper shared by lastrosade
  • Feedback on the non-streamed response

OpenRouter (Alex Atallah) ▷ #general (77 messages🔥🔥):

  • 500 errors with Claude 3.5
  • Self-moderation issues with Claude
  • Different frontends for using and jailbreaking Claude
  • OpenRouter privacy settings and logging policies
  • Google models token size change announcement

Links mentioned:


Latent Space ▷ #ai-general-chat (40 messages🔥):

  • Magic.dev $500M to $1.5B valuation, 20 employees, no product, no revenue.
  • New paper on persona-driven data synthesis with 1 billion personas.
  • First real-time Audio LLM by Kyutai, 'Moshi'.
  • OpenDevin founders start All Hands AI.
  • Sentient's $85M seed round for open AGI platform.

Links mentioned:


Latent Space ▷ #llm-paper-club-west (34 messages🔥):

  • openai AV issues during AIEWF demo
  • migration to Zoom for better accessibility
  • Discord's incompatibility with Linux and proposed alternatives

Link mentioned: Join our Cloud HD Video Meeting: Zoom is the leader in modern enterprise video communications, with an easy, reliable cloud platform for video and audio conferencing, chat, and webinars across mobile, desktop, and room systems. Zoom ...


LAION ▷ #general (2 messages):

  • jaan.li introducing their work at onefact.org and usb.club
  • san.tosh inquiring about updates on open GPT-4o

LAION ▷ #research (59 messages🔥🔥):

  • Ablation tests and justification of changes for Terminator models
  • Discussions on slow-fast networks and their advantages
  • Released code for Terminator on GitHub
  • Introduction of FORA for accelerating Diffusion transformers
  • Critiques and suggestions for the HyperZ⋅Z⋅W paper

Links mentioned:


tinygrad (George Hotz) ▷ #general (26 messages🔥):

  • Image dtype special treatment
  • Runtime error in tinygrad
  • UNMUL pattern matcher issue
  • Frontend fuzzer idea
  • Loop optimization bug

tinygrad (George Hotz) ▷ #learn-tinygrad (33 messages🔥):

  • Equivalent of torch.no_grad() in tinygrad
  • `-=` operator incompatibility with gradient enabled in tinygrad
  • Handling gradient accumulation issues leading to CUDA memory errors
  • Slowdown and memory issues with TinyJit during gradient accumulation
  • Behavior of Tensor creation methods in tinygrad vs PyTorch

Link mentioned: tinygrad/tinygrad/tensor.py at master · tinygrad/tinygrad: You like pytorch? You like micrograd? You love tinygrad! ❤️ - tinygrad/tinygrad


LlamaIndex ▷ #blog (3 messages):

  • Building a RAG pipeline on Raspberry Pi
  • OpenContracts AI-powered document analytics tool
  • Webinar on RAG experimentation and evaluation with Weights & Biases

LlamaIndex ▷ #general (49 messages🔥):

  • DocumentSummaryIndex issues with Pinecone limits
  • Code snippets and potential fixes for metadata exclusion
  • Alternative vector stores to Pinecone
  • LlamaIndex's single LLM support
  • Parsing issues with PDF tables

Links mentioned:


LlamaIndex ▷ #ai-discussion (5 messages):

  • Agentic RAG with LlamaIndex, Claude-3.5 Sonnet, and MongoDB
  • Toolio for running private AI/LLM agents and tool-calling workflows on Mac

Nous Research AI ▷ #off-topic (1 messages):

  • Tortoise-TTS converted to ggml
  • Optimization for real-time inference
  • Open-source projects on GitHub
  • CUDA and CPU support for Tortoise-TTS

Link mentioned: GitHub - balisujohn/tortoise.cpp: A ggml (C++) re-implementation of tortoise-tts: A ggml (C++) re-implementation of tortoise-tts. Contribute to balisujohn/tortoise.cpp development by creating an account on GitHub.


Nous Research AI ▷ #general (42 messages🔥):

  • Private channels on decentralized training
  • Tool calling in vLLM for Hermes 2 Pro
  • Discussion on handling tool calls and text content in Hermes 2 Pro

Link mentioned: Neural Networks: Zero to Hero: no description found


Nous Research AI ▷ #ask-about-llms (3 messages):

  • Creating conversational dataset from documents
  • Instruction-generation from documents
  • Genstruct 7B model for generating instructions

Link mentioned: NousResearch/Genstruct-7B · Hugging Face: no description found


Nous Research AI ▷ #rag-dataset (5 messages):

  • Huggingface's PR on Cohere's CommandR model
  • Microsoft's GraphRAG release

Link mentioned: GitHub - microsoft/graphrag: A modular graph-based Retrieval-Augmented Generation (RAG) system: A modular graph-based Retrieval-Augmented Generation (RAG) system - microsoft/graphrag


Modular (Mojo 🔥) ▷ #general (7 messages):

  • Installation issues on Ubuntu 24.04/Python 3.12.3
  • Implementation workaround for Mojo/max on Ubuntu 24.04/Python 3.12.3
  • Mojo implicit conversion bug
  • Casting bug in Mojo

Links mentioned:


Modular (Mojo 🔥) ▷ #💬︱twitter (2 messages):


Modular (Mojo 🔥) ▷ #✍︱blog (1 messages):

  • Mojo N-Body Example Benchmark
  • Single-core Numeric Performance in Mojo
  • Symplectic Integrator in N-body.js
  • Vectorization in N-Body Example
  • Ordinary Differential Equation Solver

Link mentioned: Modular: A Brief Guide to the Mojo N-Body Example: We are building a next-generation AI developer platform for the world. Check out our latest post: A Brief Guide to the Mojo N-Body Example


Modular (Mojo 🔥) ▷ #🔥mojo (8 messages🔥):

  • Mojo List types and printing issues
  • Printing RepresentableCollectionElement types
  • Printing errors inline in Mojo
  • Impact of excessive empty lines on startup time
  • Variable startup times in Mojo programs due to loop unrolling in bench_matmul

Links mentioned:


Modular (Mojo 🔥) ▷ #nightly (1 messages):

melodyogonna: The joys of early tooling


Modular (Mojo 🔥) ▷ #mojo-marathons (31 messages🔥):

  • Parallel processing in MLM
  • Matrix Multiplication Optimization
  • Strassen Algorithm Performance
  • SPIRAL Project
  • Numerical Stability in Matrix Multiplication

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (29 messages🔥):

  • Apple getting a board observer seat at OpenAI
  • Microsoft investments in OpenAI and comparison with Apple's partnership
  • Kyutai Labs' new real-time audio LLM 'Moshi'
  • Training details and technical specifics of 'Moshi'
  • Kyutai Labs' open model releases and future plans

Links mentioned:


Interconnects (Nathan Lambert) ▷ #posts (12 messages🔥):

  • SB 1047 and First Amendment challenges
  • Protection of 3D gun designs under the First Amendment
  • Model weights and code as protected speech
  • Claude 3.5 admiration
  • Use Claude TM

Link mentioned: Deeplinks Blog: no description found


LangChain AI ▷ #general (30 messages🔥):

  • RAG strategies for answering general questions
  • Rate limiting issues with AzureAIDocumentIntelligenceLoader
  • Parsing PDFs with different tools and libraries
  • LangSmith tracing issues
  • General help and troubleshooting in LangChain

Links mentioned:


LangChain AI ▷ #langserve (2 messages):

  • uploading CSV files directly vs. providing file path
  • output not displaying in CSV playground
  • code improvement for CSV handling
  • FastAPI endpoint for file uploads
  • Chroma vectorstore usage and issues

LangChain AI ▷ #share-your-work (2 messages):

  • OpenAI CriticGPT paper discussion
  • Toolio open source project for private LLMs

Link mentioned: OpenAI releases CriticGPT to correct GPT-4's mistakes | Read the paper with me: OpenAI has unveiled CriticGPT, a new AI model based on GPT-4 designed to identify errors in code generated by ChatGPT, marking a significant step towards imp...


LangChain AI ▷ #tutorials (1 messages):

dracount: hi, is there a beginner langchain/langraph tutorial that anyone can recommend?


Mozilla AI ▷ #llamafile (25 messages🔥):

  • Hardware recommendations for running llamafile
  • VRAM and CPU usage for large language models
  • Syncthread trick for CPU inference
  • Running llama3 70B on high-end workstation
  • Issues with Rockchip RK3588 NPUs support for llamafile

Links mentioned:


Torchtune ▷ #general (22 messages🔥):

  • phi mini new weights same repo
  • torchtune evaluation using eleutherai's eval harness
  • evaluation basics and logs on wandb
  • discussion about gradients and epochs for training
  • FullModelHFCheckpointer and conversion between HF format and torchtune

Links mentioned:


Cohere ▷ #general (14 messages🔥):

  • Training LLM with Stockfish data
  • Usage of tools like Stockfish for reasoning in LLMs
  • GitHub notebook code
  • Chess strategy and LLMs
  • Cohere API tools

Link mentioned: GitHub: Let’s build from here: GitHub is where over 100 million developers shape the future of software, together. Contribute to the open source community, manage your Git repositories, review code like a pro, track bugs and fea...


Cohere ▷ #project-sharing (5 messages):

  • Creation of a Cohere Slack bot
  • Discussion on Slack's request handling
  • Lack of documentation on Slack integration
  • Offer to share script and create documentation

OpenInterpreter ▷ #general (18 messages🔥):

  • Kyutai Moshi - real-time Audio LLM
  • Various Open Interpreter compatible projects
  • Experience modding games for Open Interpreter
  • Pull request for Open Interpreter Labs
  • Mike Bird and blurryboi discussion on Kyutai Moshi

Links mentioned:


OpenInterpreter ▷ #O1 (1 messages):

johnlenflure: Isn't there a way to integrate 01 into glasses?


OpenAccess AI Collective (axolotl) ▷ #general (1 messages):

  • Weighted cross entropy in the trainer

OpenAccess AI Collective (axolotl) ▷ #general-help (6 messages):

  • Differences between LoRA and QLoRA quantization
  • Explanation of 8-bit quantization in LoRA
  • Efficiency of QLoRA in fine-tuning large models

Link mentioned: QLoRA: Efficient Finetuning of Quantized LLMs: We present QLoRA, an efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning task performance. QLo...
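The 65B-on-48GB claim from the QLoRA paper can be sanity-checked with back-of-envelope arithmetic (a sketch counting weight storage only; real usage adds LoRA adapters, activations, optimizer state, and quantization constants):

```python
PARAMS = 65e9  # 65B-parameter model

gb_fp16 = PARAMS * 2 / 1e9    # 16-bit weights: 2 bytes per parameter
gb_nf4  = PARAMS * 0.5 / 1e9  # 4-bit NormalFloat weights: 0.5 bytes per parameter

print(f"fp16 weights:  {gb_fp16:.1f} GB")  # 130.0 GB -- far beyond one 48 GB GPU
print(f"4-bit weights: {gb_nf4:.1f} GB")   # 32.5 GB -- fits a single 48 GB GPU
```

This is why the base model is frozen in 4-bit while only the small LoRA adapters train in higher precision: the dominant cost, the frozen weights, shrinks 4x.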


OpenAccess AI Collective (axolotl) ▷ #axolotl-help-bot (2 messages):

  • torch.cuda.OutOfMemoryError on Google Colab
  • Axolotl running issues
  • GPU memory allocation
  • VRAM requirements

OpenAccess AI Collective (axolotl) ▷ #axolotl-phorm-bot (5 messages):

  • Quantization and its impact on model performance
  • LoRA versus QLoRA configuration specifics
  • Memory footprint and inference speed improvements with 8-bit quantization

Link mentioned: OpenAccess-AI-Collective/axolotl | Phorm AI Code Search: Understand code, faster.


AI Stack Devs (Yoko Li) ▷ #ai-town-discuss (11 messages🔥):

  • Docker port for AI Town
  • GitHub Page for AI Town Windows Setup with WSL
  • API communication issues with Docker port of AI Town
  • Convex automatic download via Docker for AI Town
  • Testing Docker integration for AI Town

Link mentioned: GitHub - Ikkitsuna/AI-Town-Windows-Setup-WSL-method: Guide for setting up AI Town on Windows using WSL: Guide for setting up AI Town on Windows using WSL. Contribute to Ikkitsuna/AI-Town-Windows-Setup-WSL-method development by creating an account on GitHub.


LLM Finetuning (Hamel + Dan) ▷ #🟩-modal (3 messages):

  • struggling to deploy RAG app using gradio on Modal
  • post on Modal Slack for help

LLM Finetuning (Hamel + Dan) ▷ #paige_when_finetune (1 messages):

shamik_53759: Yep, it's up now. Thanks!


LLM Finetuning (Hamel + Dan) ▷ #axolotl (1 messages):

  • DeepSpeed configuration for data sharding and disabling model sharding
  • Assistance with DeepSpeed configurations
  • Confusion about DeepSpeed settings

LLM Finetuning (Hamel + Dan) ▷ #freddy-gradio (1 messages):

  • Sharing private code deployments
  • Running code on multiple platforms
  • Limitations of private deployments on Hugging Face

LLM Perf Enthusiasts AI ▷ #eval (1 messages):

  • Evaluating LLM accuracy in legal contract review
  • Screens tool achieving 97.5% accuracy
  • Methodologies for assessing LLMs in the legal domain
  • Impact of different LLMs and methods on AI accuracy in legal tasks

Link mentioned: Screens Accuracy Evaluation Report: Evaluating the accuracy of large language models (LLMs) on contract review tasks is critical to understanding reliability in the field. However, objectivity is a challenge when evaluating long form, f...


LLM Perf Enthusiasts AI ▷ #prompting (1 messages):


Datasette - LLM (@SimonW) ▷ #ai (1 messages):

derekpwillis: https://thescoop.org/archives/2024/06/22/all-foreign-gifts-around-us/index.html





{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}