Frozen AI News archive

Gemma 2: The Open Model for Everyone

**Gemma 2**, a **27B** parameter model from **Google DeepMind**, was released with innovations like 1:1 local-global attention alternation and logit soft-capping, leveraging **knowledge distillation** to train its smaller siblings on over 50× the compute-optimal token quantity. Thanks to the shared Gemini tokenizer, Gemma adapts readily to other languages, with fine-tuning success reported on over 200 Indic language variants. The **Open LLM Leaderboard** highlights **Alibaba's Qwen 72B** as the top model, with **Mistral AI's Mixtral-8x22B-Instruct** also ranking highly. **Anthropic** launched **Claude 3.5 Sonnet**, improving intelligence at mid-tier cost and speed. Research on eliminating matrix multiplication in LLMs promises significant memory savings without performance loss. *Kathleen Kenealy* and *Daniel Han* provided insights on Gemma 2's tokenizer and attention scaling, respectively.

Canonical issue URL

AI News for 6/26/2024-6/27/2024. We checked 7 subreddits, 384 Twitters and 30 Discords (416 channels, and 2698 messages) for you. Estimated reading time saved (at 200wpm): 317 minutes. You can now tag @smol_ai for AINews discussions!

Gemma 2 is out! Previewed at I/O (our report), it's out now with the 27B model they talked about, but curiously sans the 2B model. Anyway, it's good, of course, for its size - it scores lower in evals than Phi-3, but better in ratings on LMSys, just behind yi-large (which also launched at the World's Fair Hackathon on Monday):

image.png

We have some small hints as to what the drivers might be:

But of course, data is the elephant in the room, and here the story has been knowledge distillation (KD):

In particular, we focus our efforts on knowledge distillation (Hinton et al., 2015), which replaces the one-hot vector seen at each token with the distribution of potential next tokens computed from a large model.

This approach is often used to reduce the training time of smaller models by giving them richer gradients. In this work, we instead train for large quantities of tokens with distillation in order to simulate training beyond the number of available tokens. Concretely, we use a large language model as a teacher to train small models, namely 9B and 2.6B models, on a quantity of tokens that is more than 50× the compute-optimal quantity predicted by the theory (Hoffmann et al., 2022). Along with the models trained with distillation, we also release a 27B model trained from scratch for this work.
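
The objective described in the quote above can be sketched in plain NumPy: the one-hot next-token target is replaced by the teacher's full next-token distribution, and the student minimizes cross-entropy against it. This is a generic soft-target KD loss, not Gemma's actual training code, and the logits below are toy values:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def kd_loss(student_logits, teacher_logits):
    """Cross-entropy of the student against the teacher's full
    next-token distribution, replacing the one-hot target."""
    p_teacher = softmax(teacher_logits)              # soft targets from the teacher
    log_p_student = np.log(softmax(student_logits))  # student log-probabilities
    return -(p_teacher * log_p_student).sum(axis=-1).mean()

# Toy example: vocabulary of 4 tokens, batch of 2 positions.
teacher = np.array([[2.0, 0.5, 0.1, -1.0],
                    [0.0, 3.0, 0.2,  0.1]])
student = np.array([[1.5, 0.3, 0.0, -0.5],
                    [0.1, 2.5, 0.0,  0.0]])
loss = kd_loss(student, teacher)
```

The loss is minimized when the student reproduces the teacher's distribution exactly (at which point it equals the teacher's entropy), which is why the gradients are richer than a one-hot target at small token budgets.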

At her World's Fair talk on Gemma 2, Gemma researcher Kathleen Kenealy also highlighted the Gemini/Gemma tokenizer:

"While Gemma is trained on primarily English data, the Gemini models are multimodal, they're multilingual, so this means the Gemma models are super easily adaptable to different languages. One of my favorite projects, we saw it was also highlighted at I/O, was a team of researchers in India who fine-tuned Gemma to achieve state-of-the-art performance on over 200 variants of Indic languages, which had never been achieved before."

Fellow World's Fair speaker Daniel Han also called out the attention-scaling that was only discoverable in the code:

image.png
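
The detail Daniel Han flagged relates to Gemma 2's logit soft-capping: attention and final logits are squashed through a scaled tanh rather than left unbounded. A minimal NumPy sketch, assuming the cap values reported for Gemma 2 (50.0 for attention logits, 30.0 for final logits):

```python
import numpy as np

def soft_cap(logits, cap):
    """Smoothly bound logits to (-cap, cap) with tanh;
    near zero the function is approximately the identity."""
    return cap * np.tanh(logits / cap)

x = np.array([-100.0, -10.0, 0.0, 10.0, 100.0])
attn_capped = soft_cap(x, 50.0)   # attention logits, cap assumed 50.0
final_capped = soft_cap(x, 30.0)  # final logits, cap assumed 30.0
```

Unlike a hard clamp, this keeps gradients nonzero for extreme logits while preventing them from growing without bound.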


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3 Opus, best of 4 runs. We are working on clustering and flow engineering with Haiku.

AI Models and Architectures

Tools, Frameworks and Platforms

Memes and Humor


AI Reddit Recap

Across r/LocalLlama, r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, /r/LLMDevs, /r/Singularity. Comment crawling works now but has lots to improve!

AI Progress and Capabilities

Memes and Humor

Other AI and Tech News


AI Discord Recap

A summary of Summaries of Summaries

Claude 3.5 Sonnet

  1. Google's Gemma 2 Makes Waves:

  2. Meta's LLM Compiler Announcement:

    • New Models for Code Tasks: Meta introduced LLM Compiler models built on Meta Code Llama, focusing on code optimization and compiler capabilities. These models are available under a permissive license for research and commercial use.
  3. Benchmarking and Leaderboard Discussions:

    • Unexpected Rankings: The Open LLM Leaderboard saw surprisingly high rankings for lesser-known models like Yi, sparking discussions about benchmark saturation and evaluation metrics across multiple Discord communities.
  4. AI Development Frameworks and Tools:

    • LlamaIndex's Multi-Agent Framework: LlamaIndex announced llama-agents, a new framework for deploying multi-agent AI systems in production with distributed architecture and HTTP API communication.

    • Figma AI Free Trial: Figma AI is offering a free year, allowing users to explore AI-powered design tools without immediate cost.

  5. Hardware Debates for AI Development:

    • GPU Comparisons: Discussions across Discord servers compared the merits of NVIDIA A6000 GPUs with 48GB VRAM against setups using multiple RTX 3090s, considering factors like NVLink connectivity and price-performance ratios.

    • Cooling Challenges: Users in multiple communities shared experiences with cooling high-powered GPU setups, reporting thermal issues even with extensive cooling solutions.

  6. Ethical and Legal Considerations:

    • AI-Generated Content Concerns: An article about Perplexity AI citing AI-generated sources sparked discussions about information reliability and attribution across different Discord servers.

    • Data Exclusion Ethics: Multiple communities debated the ethics of excluding certain data types (e.g., child-related) from AI training to prevent misuse, balanced against the need for model diversity and capability.

Claude 3 Opus

1. Advancements in LLM Performance and Capabilities

2. Open-Source AI Frameworks and Community Efforts

3. Optimizing LLM Training and Inference

4. Multimodal AI and Generative Modeling Innovations

GPT4O (gpt-4o-2024-05-13)

  1. LLM Deployment and Training Optimization:

    • Hurdles in AI Deployment Leave Engineers Frustrated: Engineers shared challenges in deploying custom models efficiently, with discussions focused on avoiding weights errors and optimizing parameters for hardware like the RTX 4090 using tools like Koboldcpp.

    • Diving Into Flash Attention: Members requested tutorials on Flash Attention, an efficient technique for memory management in models, highlighting the need for better understanding of this optimization.

  2. Benchmarking and Performance Evaluation:

    • Yi Takes LLM Leaderboard by Storm: The Open LLM Leaderboard sparked interest as models like Yi surprisingly rose to top ranks, challenging engineers to reassess their models' performances.

    • Gemma 2's Mixed Reactions: Excitement and skepticism surrounded Gemma 2—while some praised its innovations, others were unsure if it marked a significant leap. Comparisons with existing models were fueled by benchmark analyses.

  3. Open-Source AI Frameworks and Tools:

    • LlamaIndex Introduces llama-agents: LlamaIndex announced llama-agents, a multi-agent AI framework aiming to streamline production deployments; it includes distributed architecture and HTTP API communication.

    • LangChain AI Discusses Endpoint Building: Engineers shared examples of building LangChain endpoints with documentation showing proper use of load_qa_chain() and handling high-volume requests.

  4. AI Licensing and Ethical Considerations:

    • AI Training Ethics Stir Heated Debate: Engineers in LAION deliberated over ethical training practices, debating whether to exclude child-related data to prevent misuse, while balancing the impact on model diversity and normal scene generation.

    • Skepticism Towards AI Licensing Models: Legal and practical concerns arose around the exclusive Command-R model via OpenRouter, examining potential licensing misuse and enforcing compliance.

  5. Cutting-Edge AI Models and Innovations:

    • Meta Unveils LLM Compiler Models: Meta introduced the Meta LLM Compiler focusing on code optimization, with models built on extensive token corpuses for advanced compiler tasks.

    • Innovative SPARSEK Attention Mechanism: The SPARSEK Attention mechanism promises efficient long-sequence processing with linear complexity, as detailed in a new paper, aiming to overcome typical self-attention limitations.

  6. Misc

    • Mojo Compiles and Executes Models with Ease: Community members discussed Mojo language challenges, highlighting object identity and self-referential type issues and the need for thorough GitHub documentation.

    • Storage Requirements for Large Models Revealed: Insights shared in Nous Research AI covered the hardware needed to run models like DeepSeek-Coder V2, indicating that substantial RAM and VRAM are required for efficient performance.


PART 1: High level Discord summaries

Unsloth AI (Daniel Han) Discord

Yi Tops LLM Leaderboard: New benchmarks have placed lesser-known models like Yi at surprisingly high ranks on the LLM leaderboard, intriguing the AI community.

Rollout of Gemma 2 Stirs Excitement and Skepticism: The release of Gemma 2 has sparked enthusiasm and curiosity, particularly around its similarities with Grok. Notably, a tweet dissecting Gemma 2's innovations became a focal point, despite some users questioning if the advancements mark a significant leap from previous models.

Hurdles in AI Deployment and Training: Discussions pointed to challenges and solutions in deploying custom models, with an emphasis on avoiding weights errors. AI engineers shared insights about saving and serving models using Ollama and suggested parameters adjustments for optimization on hardware like the RTX 4090, citing specific tools like Koboldcpp.

Bugs and Support Discussed Ahead of the AI World's Fair: The Unsloth AI team is gearing up for the AI World's Fair, planning to discuss open-source model issues and the new inclusion of @ollama support, as announced in this tweet.

The Heat on ChatGPT: ChatGPT became a contentious topic, with some community members calling it "literally fucking garbage" while others acknowledged its role in paving AI's path, despite ChatGPT 3.5's accuracy issues. Problems with AI hardware overheating were also humorously lamented.


HuggingFace Discord

Multimodal RAG on the Horizon: Excited chatter surrounded an in-progress multimodal RAG article, with anticipation for a groundbreaking outcome, though specifics such as models or results were not discussed.

Entity Extraction Tools Evaluated: Technical discussion identified shortcomings of BERT for NER, with members suggesting alternatives like GLiNER and NuExtract, which are touted for their flexibility in extracting non-predefined entities, pointing to community resources like ZeroGPU Spaces.

Skeptical Reception for Sohu AI Chip: The community shared cautious skepticism regarding the claimed performance of Sohu's new AI chip, with members considering experimentation on Sohu's advertised service, despite no direct experience shared.

Efficient Dynamic Diffusion Delivery: Strategies for enhancing the performance of stable diffusion models were enthusiastically exchanged, notably including "torch compile" and leveraging libraries such as Accelerate and stable-fast for improved inference times.

AI Leaderboard Reflections: The Open LLM Leaderboard blog spurred concerns about saturation in AI benchmarks, reflecting a sentiment for the community's drive for continuous improvement and new benchmarks.


OpenAI Discord

GPT Sibling Rivalry: CriticGPT emerges to squash bugs in GPT-4’s code, boasting integration into OpenAI's RLHF pipeline for enhanced AI supervision, official announcement details.

Claude vs. GPT-4o - The Context Window Showdown: Claude 3.5 Sonnet is lauded for its coding prowess and expansive context window, overshadowing GPT-4o, which some claim lacks real omnimodal capabilities and faces slow response times.

Beyond Traditional Text Chats: Innovators employ 3.5 Sonnet API and ElevenLabs API to drive real-time conversation, challenging the necessity of ChatGPT in certain contexts.

Prompt Engineering Puzzles and Pitfalls: Users exchange methods for few-shot prompting and prompt compression, with an eye on structuring prompts in YAML/XML for precision, and experimenting with "Unicode semiotics" for token-efficient prompts.

Navigating the API Labyrinth: Discussions focused on calculating prompt costs, seeking examples of knowledge bases for model training, gif creation challenges with GPT, deprecated plugin replacements, and the API's knack for struggling with certain word puzzles.


CUDA MODE Discord


Eleuther Discord


LAION Discord


Nous Research AI Discord


Stability.ai (Stable Diffusion) Discord


Modular (Mojo 🔥) Discord


Perplexity AI Discord

Perplexity API: Troubleshooting or Flustered?: Users are encountering 5xx and 401 errors when interfacing with the Perplexity AI API, prompting discussions about the need for a status page and authentication troubleshooting.

Feature Wish List for Perplexity: Enthusiasts dissect Perplexity AI's current features such as image interpretation and suggest enhancements like artifact implementation for better management of files.

Comparing AI's Elite: The community analyzed and contrasted various AI models, notably GPT-4 Turbo, GPT-4o, and Claude 3.5 Sonnet; preferences were aired but no consensus emerged.

Perplexity's Search for Relevance: Shared Perplexity AI pages indicated interest in diverse topics ranging from mental health to the latest in operating systems, such as the performance boosts in Android 14.

AI in Journalism Ethics Crosshairs: An article criticized Perplexity for increasingly citing AI-generated content, sparking conversations about the reliability and privacy of AI-generated sources.


Latent Space Discord

Grab Figma AI while It's Hot: Figma AI is currently free for one year as shared by @AustinTByrd; details can be found in the Config2024 thread.

AI Engineer World Fair Woes: Members mentioned technical difficulties during an event at the AI Engineer World Fair, ranging from audio issues to screen sharing, and strategies such as leaving the stage and rejoining were suggested to resolve problems.

LangGraph Cloud Takes Off: LangChainAI announced LangGraph Cloud, a new service offering robust infrastructure for resilient agents, yet some engineers questioned the need for specialized infrastructure for such agents.

Conference Content Watch: The AI Engineer YouTube channel is the go-to for livestreams and recaps of the AI Engineer World Fair, featuring key workshops and technical discussions for AI enthusiasts, while conference transcripts are available on the Compass transcript site.

Bee Buzzes with Wearables Update: Wearable tech discussions included innovative products like Bee.computer, which can perform tasks like recording and transcribing, and even offers an Apple Watch app, indicating the trend towards streamlined, multifunctional devices.


LM Studio Discord

LM Studio Lacks Critical Feature: LM Studio was noted to lack support for document-based training or RAG capabilities, emphasizing a common misunderstanding of the term 'train' within the community.

Code Models Gear Up: Claude 3.5 Sonnet received praise within the Poe and Anthropic frameworks for coding assistance, while there is anticipation for upcoming Gemma 2 support in LM Studio and llama.cpp.

Hardware Dependency Highlighted: Users discussed running DeepSeek-Coder V2 on high-RAM setups with good performance but noted crashes on an M2 Ultra Mac Studio due to memory constraints. Additionally, server cooling and AVX2 processor requirements for LM Studio were topics of hardware-related conversations.

Memory Bottlenecks and Fixes: Members shared their experiences with VRAM limitations when loading models in LM Studio, providing advice such as disabling GPU offload and upgrading to higher VRAM GPUs for better support.

Emerging AI Tooling and Techniques: There's buzz around Meta's new LLM Compiler models and integrating Mamba-2 into llama.cpp, showcasing advancement in AI tooling and techniques for efficiency and optimization.


LangChain AI Discord

Can't Print Streams Directly in Python: A user emphasized that you cannot print stream objects directly in Python and provided a code snippet showing the correct method: iterate over the stream and print each token's content.
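
The pattern described above can be sketched as follows. The chunk objects here are stand-ins that mimic a streamed message chunk exposing a `.content` attribute (as LangChain's message chunks do), so the snippet runs without a live model; in real code the generator would come from something like `llm.stream(...)`:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    """Stand-in for a streamed message chunk with a .content field."""
    content: str

def fake_stream():
    # Hypothetical stand-in for a model's streaming response.
    for piece in ["Hel", "lo, ", "wor", "ld!"]:
        yield Chunk(piece)

# Printing the generator object itself shows only its repr;
# iterate and print each token's content instead.
collected = []
for chunk in fake_stream():
    print(chunk.content, end="", flush=True)
    collected.append(chunk.content)
print()
```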

Correctly Using LangChain for Relevant User Queries: There were discussions on improving vector relevance in user queries with LangChain, with potential solutions including keeping previous retrieval in chat history and using query_knowledge_base("Green light printer problem") functions.

Integrating LangChain with FastAPI and Retrieval Enhancements: Community members shared documentation and examples on building LangChain endpoints using add_routes in FastAPI, and optimizing the use of load_qa_chain() for server-side document provisioning.

Cutting-Edge Features of LangChain Expression Language: Insights into LangChain Expression Language (LCEL) were provided, highlighting async support, streaming, parallel execution, and retries, pointing to the need for comprehensive documentation for a full understanding.

New Tools and Case Studies for LangChain: Notable mentions include the introduction of Merlinn, an AI bot for troubleshooting production incidents, an Airtable of ML system design case studies, and the integration of security features into LangChain with ZenGuard AI. A YouTube tutorial was also highlighted, showing the creation of a no-code Chrome extension chatbot using Visual LangChain.


LlamaIndex Discord


Interconnects (Nathan Lambert) Discord


OpenRouter (Alex Atallah) Discord


Cohere Discord

Innovative API Strategies: By using the paid Cohere API, OpenRouter can serve Command-R without breaching license agreements; the community confirmed that API access is not bound by the non-commercial restriction on the model weights.

Command-R Model Sparks Exclusivity Buzz: The Command-R model, known for its advanced prompt-following capabilities, is available exclusively through OpenRouter for 'I'm All In' subscribers, sparking discussions around model accessibility and licensing.

Licensing Pitfalls Narrowly Avoided: Debate ensued regarding potential misuse of Command-R's licensing by SpicyChat, but members concluded that payments to Cohere should rectify any licensing issues.

Technical Troubleshooting Triumph: A troubleshooting success was shared after a member resolved a Cohere API script error on Colab and PyCharm by following the official Cohere multi-step tool documentation.

Rust Library Unveiled with Rewards Program: Rig, a new Rust library aimed at building LLM-powered applications, was introduced alongside a feedback program, rewarding developers for their contributions and ideas, with a nod to compatibility with Cohere's models.


OpenInterpreter Discord


OpenAccess AI Collective (axolotl) Discord


tinygrad (George Hotz) Discord

PyTorch's Rise Captured on Film: An Official PyTorch Documentary was shared, chronicling PyTorch’s development and the engineers behind its success, providing insight for AI enthusiasts and professionals.

Generic FPGA Design for Transformers: A guild member clarified their FPGA design is not brand-specific and can readily load any Transformer model from Huggingface's library, a notable development for those evaluating hardware options for model deployment.

Iterative Improvement on Tinygrad: Work on integrating SDXL with tinygrad is progressing, with a contributor planning to streamline the features and performance before opening a pull request, a point of interest for collaborators.

Hotz Hits the Presentation Circuit: George Hotz was scheduled for an eight-minute presentation, details of which were not disclosed, possibly of interest to followers of his work or potential collaborators.

Tinygrad Call for Code Optimizers: A $500 cash incentive was announced for speeding up tinygrad's matching engine, an open invitation for developers to contribute and collaborate on improving the project's efficiency.

Deep Dive into Tinygrad's Internals: Discussions included a request for examples of porting PyTorch's MultiheadAttention to tinygrad, a strategy to estimate VRAM requirements for model training by creating a NOOP backend, and an explanation of Shapetracker’s capacity for efficient data representation with reference to tinygrad-notes. These technical exchanges are essential for those seeking to understand or contribute to tinygrad's inner workings.


LLM Finetuning (Hamel + Dan) Discord


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Unsloth AI (Daniel Han) ▷ #general (269 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #random (17 messages🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #help (83 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #community-collaboration (2 messages):

Link mentioned: README_en.md · THUDM/glm-4-9b at main: no description found


HuggingFace ▷ #general (238 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #today-im-learning (1 messages):

neuralink: nah i didnt on a break, i just didnt post


HuggingFace ▷ #cool-finds (3 messages):

Links mentioned:


HuggingFace ▷ #i-made-this (12 messages🔥):

Links mentioned:


HuggingFace ▷ #reading-group (1 messages):


HuggingFace ▷ #diffusion-discussions (2 messages):

<ul>
    <li><strong>Open Group Kickoff with Enthusiasm</strong>: A member started the discussion with an energetic “**Ghost!**”. This seemed to encourage further engagement in the channel.</li>
    <li><strong>Curiosity about Usage Impact</strong>: Another member, hayden_85058, asked, *"How do you feel about the effect of using it?"*. This indicates an interest in the practical outcomes or experiences from using a specific tool or method.</li>
</ul>

OpenAI ▷ #annnouncements (1 messages):


OpenAI ▷ #ai-discussions (97 messages🔥🔥):


OpenAI ▷ #gpt-4-discussions (25 messages🔥):


OpenAI ▷ #prompt-engineering (53 messages🔥):


OpenAI ▷ #api-discussions (53 messages🔥):


CUDA MODE ▷ #general (11 messages🔥):

Links mentioned:


CUDA MODE ▷ #triton (46 messages🔥):

Links mentioned:


CUDA MODE ▷ #cool-links (1 messages):

gau.nernst: https://github.com/efeslab/Atom


CUDA MODE ▷ #torchao (16 messages🔥):

Links mentioned:


CUDA MODE ▷ #llmdotc (131 messages🔥🔥):

Links mentioned:


CUDA MODE ▷ #sparsity (1 messages):

Link mentioned: Accelerating Neural Network Training with Semi-Structured (2:4) Sparsity: Over the past year, we’ve added support for semi-structured (2:4) sparsity into PyTorch. With just a few lines of code, we were able to show a 10% end-to-end inference speedup on segment-anything by r...


Eleuther ▷ #general (27 messages🔥):

Links mentioned:


Eleuther ▷ #research (161 messages🔥🔥):

Links mentioned:


Eleuther ▷ #scaling-laws (7 messages):


LAION ▷ #general (193 messages🔥🔥):

Links mentioned:


LAION ▷ #research (2 messages):


Nous Research AI ▷ #ctx-length-research (12 messages🔥):


Nous Research AI ▷ #off-topic (16 messages🔥):

Link mentioned: Tweet from Yaya Labs (@yaya_labs_): How about a tinder-like app for matching the ultra wealthy to their blood boys. Would you install ?


Nous Research AI ▷ #interesting-links (5 messages):

Links mentioned:


Nous Research AI ▷ #announcements (1 messages):

Link mentioned: NousResearch/Hermes-2-Pro-Llama-3-70B · Hugging Face: no description found


Nous Research AI ▷ #general (96 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (25 messages🔥):


Nous Research AI ▷ #rag-dataset (8 messages🔥):

Link mentioned: glaiveai/RAG-v1 · Datasets at Hugging Face: no description found


Nous Research AI ▷ #world-sim (11 messages🔥):

Links mentioned:


Stability.ai (Stable Diffusion) ▷ #general-chat (163 messages🔥🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #general (102 messages🔥🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #💬︱twitter (2 messages):


Modular (Mojo 🔥) ▷ #ai (1 messages):


Modular (Mojo 🔥) ▷ #🔥mojo (23 messages🔥):

Link mentioned: Mojo - First Impression [Programming Languages Episode 29]: ►Full First Look Series Playlist: https://www.youtube.com/playlist?list=PLvv0ScY6vfd-5hJ47DNAOKKLLIHjz1Tzq►Find full courses on: https://courses.mshah.io/►Jo...


Modular (Mojo 🔥) ▷ #nightly (31 messages🔥):

Links mentioned:


Perplexity AI ▷ #general (137 messages🔥🔥):

Link mentioned: Garbage In, Garbage Out: Perplexity Spreads Misinformation From Spammy AI Blog Posts: As Perplexity faces criticism for allegedly plagiarizing journalistic work and distributing it like a media company, it is increasingly citing AI-generated blogs and LinkedIn posts riddled with inaccu...


Perplexity AI ▷ #sharing (8 messages🔥):


Perplexity AI ▷ #pplx-api (5 messages):


Latent Space ▷ #ai-general-chat (63 messages🔥🔥):

<ul>
  <li><strong>Figma AI is free for a year</strong>: According to <a href="https://x.com/austintbyrd/status/1806017268796854753?s=46&t=6FDPaNxZcbSsELal6Sv7Ug">@AustinTByrd</a>, <em>"Figma AI is free for a year before they start billing everyone."</em> Follow the link for full details: <a href="https://x.com/austintbyrd/status/1806017268796854753?s=46&t=6FDPaNxZcbSsELal6Sv7Ug">Config2024 thread</a>.</li>
  
  <li><strong>Conference talks now available via livestream</strong>: Recordings of non-livestreamed tracks like the RAG talks are still being awaited. Meanwhile, members can watch select livestreams on the <a href="https://youtube.com/@aidotengineer">AI Engineer YouTube channel</a>.</li>
  
  <li><strong>Compass transcript site shared</strong>: <a href="https://aie.compasswearable.com">Compass transcript site</a> was shared for viewing conference transcripts. These resources were mentioned to be useful and solid.</li>
  
  <li><strong>LangGraph Cloud launches</strong>: <a href="https://x.com/LangChainAI/status/1806371717084025165?t=15TNW0RaIb6EoIJ">@LangChainAI</a> launched <a href="http://bit.ly/langgraph-cloud-beta-1">LangGraph Cloud</a>, offering scalable infrastructure for fault-tolerant agents and integrated tracing & monitoring. However, some members questioned the necessity for specialized infrastructure for state machines.</li>
  
  <li><strong>Lots of wearable tech emerging</strong>: Discussions included new wearables like <a href="https://bee.computer/">Bee.computer</a> and their features like recording, transcription, and task execution. The service even offers an Apple Watch app, making extra devices optional.</li>
</ul>

Links mentioned:


Latent Space ▷ #llm-paper-club-west (70 messages🔥🔥):

- **Members discuss information processing amid AGI**: One member joked *"ngmi when AGI comes if you can only process one information stream at a time"*, expressing the importance of multitasking in the future. Another member humorously referred to the ongoing discussion as *"PEAK SCHIZO"*.

- **Technical difficulties during event presentation**: Multiple members reported issues with hearing and screen sharing during a live event. One member suggested an alternative to *"leave stage and come back"*, while another proposed sharing slides directly for a smoother presentation.

- **Planning and coordination for AI Engineer World Fair**: Members discussed logistics and coordination for an event, including ensuring hosts had necessary compasses and special instructions. A YouTube link [AI Engineer](https://www.youtube.com/@aiDotEngineer) was shared highlighting talks, workshops, and events for AI Engineers.

- **Recap request for AI Engineer Conference**: There was a request for a Sunday recap or summary of the AI Engineer Conference. Responses highlighted the challenge of managing multiple conferences and events simultaneously.

- **Managing event resources and logistics**: Members coordinated the availability of resources like poster boards and having founders ready for their sessions. Special instructions were given to ensure a seamless experience for presenters and guests, with updates on team reliance on transcripts and wearable tech sightings.

Link mentioned: AI Engineer: Talks, workshops, events, and training for AI Engineers.


LM Studio ▷ #💬-general (75 messages🔥🔥):

- **LM Studio lacks document training capabilities**: Members clarified that LM Studio does not support document-based training or RAG capabilities. A member highlighted, "When the majority say 'train' they mean feeding documents to an existing model."
  
- **AnythingLLM integrates with LM Studio for document summaries**: AnythingLLM supports various document types and generates concise summaries, integrating seamlessly with LM Studio. "It's completely free and open-source with no subscription required," shared a user.

- **Claude 3.5 Sonnet praised as top code model**: Community members expressed high praise for Claude 3.5 Sonnet, available on Poe and Anthropic, calling it their "new daily driver" for coding assistance.

- **Requirements for training Llama 3**: Discussions on training Llama 3 highlighted that significant hardware investment is required, particularly for the 70B model. "You'll see the majority of them are trained on rented 8xH100 GPU clusters," explained a user.

- **Gemma 2 support in progress**: Members shared updates on upcoming support for Gemma 2 in LM Studio and llama.cpp. "I know that lmstudio devs are working on getting a release of lmstudio out ASAP with gemma 2 support," mentioned a user.

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (23 messages🔥):

Links mentioned:


LM Studio ▷ #⚙-configs-discussion (1 messages):


LM Studio ▷ #🎛-hardware-discussion (17 messages🔥):


LM Studio ▷ #🧪-beta-releases-chat (1 messages):

Link mentioned: llama : support Mamba-2 · Issue #7727 · ggerganov/llama.cpp: Mamba-2 is a new version of the Mamba architecture: Blog: https://tridao.me/blog/2024/mamba2-part1-model/ Paper: https://arxiv.org/abs/2405.21060


LM Studio ▷ #open-interpreter (1 messages):


LM Studio ▷ #🛠-dev-chat (1 messages):

- **Extracting Token Generation Data Using Python**: A member inquired on how to utilize Python to retrieve data from the local LM Studio server. They are specifically interested in the **speed of tokens** and the **time taken for the generation of the first token**.
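
One way to get both numbers is to time a streaming response: record when the first chunk arrives (time to first token) and divide the token count by total elapsed time. The sketch below uses a simulated stream so it runs standalone; against a real LM Studio server you would instead iterate over a streaming response from its OpenAI-compatible endpoint (by default served at `http://localhost:1234/v1`). The generator and delays here are illustrative, not LM Studio's API:

```python
import time

def simulated_stream(n_tokens=20, delay=0.005):
    """Hypothetical stand-in for a streamed completion; each yield is one token."""
    for i in range(n_tokens):
        time.sleep(delay)
        yield f"tok{i} "

def measure(stream):
    """Compute time-to-first-token and overall tokens/second for a token stream."""
    start = time.perf_counter()
    first_token_time = None
    n = 0
    for _ in stream:
        if first_token_time is None:
            first_token_time = time.perf_counter() - start  # TTFT
        n += 1
    total = time.perf_counter() - start
    return {"ttft_s": first_token_time,
            "tokens": n,
            "tokens_per_s": n / total}

stats = measure(simulated_stream())
```

Swapping `simulated_stream()` for an iterator over server-sent chunks keeps the measurement logic unchanged.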

LangChain AI ▷ #general (28 messages🔥):

Links mentioned:


LangChain AI ▷ #langserve (70 messages🔥🔥):

Links mentioned:


LangChain AI ▷ #share-your-work (4 messages):

Links mentioned:


LangChain AI ▷ #tutorials (1 messages):

emarco: https://www.youtube.com/watch?v=Q_yKRLACx78&t=1s


LlamaIndex ▷ #blog (2 messages):

Link mentioned: LlamaCloud Waitlist: Thanks for your interest in LlamaCloud! Sign up and tell us below which email address you used, we'll be letting people in at a measured pace.


LlamaIndex ▷ #general (56 messages🔥🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (13 messages🔥):

- **OpenAI API outsells Microsoft on Azure**: OpenAI now generates more revenue from API sales than Microsoft does from reselling it on Azure. This news was shared via a tweet by Aaron P. Holmes, highlighting a surprising turn in the market dynamics. [Source](https://x.com/aaronpholmes/status/1806312654505443347?s=46)
- **Meta releases LLM Compiler for code optimization**: Meta has introduced the **Meta Large Language Model Compiler**, geared towards compiler optimization tasks using pre-trained models. The suite focuses on LLVM-IR and assembly code, leveraging a vast corpus of 546 billion tokens. [Research Publication](https://ai.meta.com/research/publications/meta-large-language-model-compiler-foundation-models-of-compiler-optimization/)
- **Character.AI launches Character Calls**: Character.AI has launched **Character Calls**, allowing users to have voice conversations with AI characters. The feature is accessible via their app and aims to create more immersive AI experiences but received mixed reviews on its performance and fluidity. [Blog Post](https://blog.character.ai/introducing-character-calls/)

Links mentioned:


Interconnects (Nathan Lambert) ▷ #random (15 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (23 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #posts (3 messages):


OpenRouter (Alex Atallah) ▷ #announcements (1 messages):

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (39 messages🔥):

Links mentioned:


Cohere ▷ #general (23 messages🔥):

Links mentioned:


Cohere ▷ #project-sharing (2 messages):


OpenInterpreter ▷ #general (15 messages🔥):

Links mentioned:


OpenInterpreter ▷ #O1 (5 messages):


OpenAccess AI Collective (axolotl) ▷ #general (13 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general-help (1 messages):


OpenAccess AI Collective (axolotl) ▷ #community-showcase (2 messages):

Links mentioned:


tinygrad (George Hotz) ▷ #general (8 messages🔥):

Links mentioned:


tinygrad (George Hotz) ▷ #learn-tinygrad (5 messages):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #general (2 messages):


LLM Finetuning (Hamel + Dan) ▷ #🟩-modal (1 messages):

__dchx: was there an answer to your question <@610008277714993152> ?


LLM Finetuning (Hamel + Dan) ▷ #langsmith (1 messages):


LLM Finetuning (Hamel + Dan) ▷ #workshop-2 (1 messages):


LLM Finetuning (Hamel + Dan) ▷ #jeremy_python_llms (1 messages):


LLM Finetuning (Hamel + Dan) ▷ #fireworks (1 messages):


LLM Finetuning (Hamel + Dan) ▷ #predibase (1 messages):

lalithnarayan: Pinged you on DM.. please could you take a look 🙏


LLM Finetuning (Hamel + Dan) ▷ #career-questions-and-stories (3 messages):

Link mentioned: Extension | Cursor - The AI-first Code Editor: no description found


AI Stack Devs (Yoko Li) ▷ #ai-town-discuss (4 messages):


Torchtune ▷ #general (2 messages):


DiscoResearch ▷ #embedding_dev (1 messages):

le_mess: good work 🙂 Would you mind sharing the training code?







{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}