Frozen AI News archive

Anthropic's "LLM Genome Project": learning & clamping 34m features on Claude Sonnet

**Anthropic** released their third paper in the MechInterp series, **Scaling Monosemanticity**, scaling interpretability analysis to **34 million features** on **Claude 3 Sonnet**. The work applies **dictionary learning** to isolate recurring patterns of neuron activations, so that any internal state of the model can be represented as a combination of a few active features rather than many active neurons. The paper surfaces abstract features related to code, errors, sycophancy, crime, self-representation, and deception, and demonstrates intentional modifiability by clamping feature values. The research marks a significant advance in **model interpretability** and **neural network analysis** at frontier scale.


AI News for 5/20/2024-5/21/2024. We checked 7 subreddits, 384 Twitters and 29 Discords (376 channels, and 6363 messages) for you. Estimated reading time saved (at 200wpm): 738 minutes. The Table of Contents and Discord Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

A relatively news-heavy day, with monster funding rounds from Scale AI and Suno AI, and ongoing reactions to Microsoft Build announcements (like Microsoft Recall), but we try to keep things technical here.

Probably the biggest news is Anthropic's Scaling Monosemanticity, the third in their modern MechInterp trilogy following Toy Models of Superposition (2022) and Towards Monosemanticity (2023). The first paper studied superposition in very small ReLU networks (up to 8 features on 5 neurons), the second applied sparse autoencoders to a real transformer (4096 features on 512 neurons), and this paper now scales up to 1m/4m/34m features on Claude 3 Sonnet. This unlocks all sorts of interpretability magic on a real, frontier-level model.


Definitely check out the feature UMAPs

Instead of the relatively highfalutin "superposition" concept, the analogy is now "dictionary learning", which Anthropic explains as:

borrowed from classical machine learning, which isolates patterns of neuron activations that recur across many different contexts. In turn, any internal state of the model can be represented in terms of a few active features instead of many active neurons. Just as every English word in a dictionary is made by combining letters, and every sentence is made by combining words, every feature in an AI model is made by combining neurons, and every internal state is made by combining features. (further reading in the notes)
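
For intuition, the sparse autoencoder behind this kind of dictionary learning can be sketched in a few lines of PyTorch: an overcomplete ReLU encoder learns a "dictionary" of features over the model's activations, and an L1 penalty keeps only a few features active per input. This is a minimal sketch, not Anthropic's implementation (the papers describe many training refinements):

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Dictionary learning over activations: d_model neurons -> n_features >> d_model."""
    def __init__(self, d_model=512, n_features=4096):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)
        self.decoder = nn.Linear(n_features, d_model)

    def forward(self, x):
        feats = torch.relu(self.encoder(x))  # feature activations, sparse after training
        return self.decoder(feats), feats

sae = SparseAutoencoder()
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
l1_coeff = 1e-3                              # sparsity strength (assumed value)

acts = torch.randn(64, 512)                  # stand-in for real model activations
recon, feats = sae(acts)
loss = (recon - acts).pow(2).mean() + l1_coeff * feats.abs().sum(-1).mean()
loss.backward()
opt.step()
```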

Anthropic's 34 million features encode some very interesting "abstract features": code features and even errors, as well as sycophancy, crime/harm, self-representation, and deception and power seeking.

The signature proof of complete interpretability research is intentional modifiability, which Anthropic shows off by clamping features from -2x to 10x their maximum values:

{% if medium == 'web' %}

(feature clamping excerpts appear here as images on the web version)

{% else %}

You're reading this on email. We're moving more content to the web version to create more space and save your inbox. Check out the excerpted diagrams on the [web version]({{ email_url }}) if you wish.

{% endif %}
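
To make the intervention concrete, here is a minimal sketch of feature clamping (the `sae.encode`/`sae.decode` API is hypothetical): encode an activation into feature space, pin one feature to a multiple of its observed maximum, and decode back before the forward pass continues.

```python
import torch

@torch.no_grad()
def clamp_feature(acts, sae, feature_idx, scale, feature_max):
    """Steer a model by clamping one dictionary feature.

    acts:        activations at the hooked layer, shape (batch, seq, d_model)
    sae:         trained sparse autoencoder with encode/decode (hypothetical API)
    feature_idx: index of the feature to clamp (e.g. the Golden Gate Bridge feature)
    scale:       multiplier of the feature's observed max, e.g. -2.0 .. 10.0
    """
    feats = sae.encode(acts)                        # (batch, seq, n_features)
    feats[..., feature_idx] = scale * feature_max   # pin the chosen feature
    return sae.decode(feats)                        # back to (batch, seq, d_model)
```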

Don't miss the breakdowns from Emmanuel Ameisen, Alex Albert, Linus Lee and HN.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% endif %}


AI Twitter Recap

all recaps done by Claude 3 Opus, best of 4 runs. We are working on clustering and flow engineering with Haiku.

Microsoft Launches Copilot+ PCs for AI Era

Scale AI Raises $1B at $13.8B Valuation

Suno Raises $125M to Build AI-Powered Music Creation Tools

Open-Source Implementation of Meta's Automatic Test Generation Tool Released

Anthropic Releases Research on Interpreting Leading Large Language Model

Memes and Humor


AI Reddit Recap

Across r/LocalLlama, r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, /r/LLMDevs, /r/Singularity. Comment crawling works now but has lots to improve!

OpenAI Controversies and Legal Issues

GPT-4o and Copilot Demos and Capabilities

AI Progress and the Path to AGI

Humor and Memes


AI Discord Recap

A summary of Summaries of Summaries

  1. ScarJo Strikes Back at AI Voice Cloning:

    • Scarlett Johansson's OpenAI dispute: Johansson's lawyers demanded answers from OpenAI over the voice replication controversy, forcing the company to pull the voice and potentially reshaping the legal landscape around AI-generated voice cloning.

    • Discussions highlighted the ethical and legal debates over voice likeness and consent, with comparisons to takedowns of unauthorized content featuring musicians like Drake.

  2. New AI Models Set Benchmarks Aflame:

    • Phi-3 Models and ZeroGPU Excite AI Builders: Microsoft launched Phi-3 small (7B) and Phi-3 medium (14B) models with 128k context windows that excel in MMLU and AGI Eval tasks, revealed on HuggingFace. Complementing this, HuggingFace's new ZeroGPU initiative offers $10M in free GPU access, aiming to boost AI demo creation for independent and academic sectors.

    • Discovering Documentary Abilities of PaliGemma: Merve highlighted the document understanding prowess of PaliGemma through a series of links to Hugging Face and related tweets. Inquiries about Mozilla's DeepSpeech and various resources from LangChain to 3D Gaussian Splatting reveal the community's broad interest in various AI technologies.

    • M3 Max for LLMs received praise for local inference performance, particularly in the 96GB RAM configuration, which comfortably fits larger models for local experimentation.

  3. Collaborative Efforts Shape AI's Future:

    • Hugging Face's LangChain Integration: New packages aim to facilitate seamless integration of models into LangChain, offering new architectures and optimizing interaction capabilities for community projects.

    • Memary Webinar presents an open-source long-term memory solution for autonomous agents, addressing critical needs in knowledge graph generation and memory stream management.

  4. AI-Community Buzz with Ethical and Practical AI Implementations:

    • Anthropic's Responsible Scaling Policy: Anthropic's latest update notes a large increase in computing power over Opus, suggesting significant upcoming releases governed by its responsible scaling policy for managing ethical concerns in AI development.

    • Collaboration in AI continues to thrive at events like the PizzerIA meetup in Paris and a first in-person meetup in San Francisco, both advancing Retrieval-Augmented Generation (RAG) techniques and community engagement in AI innovations.


{% if medium == 'web' %}

PART 1: High level Discord summaries

LLM Finetuning (Hamel + Dan) Discord


Perplexity AI Discord


HuggingFace Discord

Phi-3 Models and ZeroGPU Excite AI Builders: Microsoft launched Phi-3 small (7B) and Phi-3 medium (14B) models with 128k context windows that excel in MMLU and AGI Eval tasks, revealed on HuggingFace. Complementing this, HuggingFace's new ZeroGPU initiative offers $10M in free GPU access, aiming to boost AI demo creation for independent and academic sectors.

Discovering Documentary Abilities of PaliGemma: Merve highlighted the document understanding prowess of PaliGemma through a series of links to Hugging Face and related tweets. Inquiries about Mozilla's DeepSpeech and various resources from LangChain to 3D Gaussian Splatting reveal the community's broad interest in various AI technologies.

LangChain Memory Trick: Practical advice was offered on incorporating conversation history into LLM-based chatbots using LangChain, addressing the common problem of bots forgetting prior interactions. Meanwhile, a user critiqued the story-enhancement abilities of llama3 8b 4bit, surfacing a limitation in the model's creative writing.
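
For the memory trick, a minimal sketch using LangChain's classic ConversationBufferMemory (the model name is a placeholder):

```python
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain_openai import ChatOpenAI

chain = ConversationChain(
    llm=ChatOpenAI(model="gpt-4o"),     # placeholder model
    memory=ConversationBufferMemory(),  # replays prior turns into each prompt
)
chain.invoke({"input": "My name is Ada."})
print(chain.invoke({"input": "What's my name?"})["response"])  # recalls "Ada"
```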

Transformer Integrations and Model Contributions Generate Buzz: Engineers are integrating ImageBind with the transformers library, while another engineer's PR got merged, fixing an issue with finetuned AI models. Moreover, the llama-cpp-agent project leverages ZeroGPU, pointing to gains in computational efficiency.

Vision Tech Queries and Solutions Exchange: In the computer vision domain, requests for papers on advanced patching techniques in Vision Transformers and methods for zero-shot object detection in screenshots were highlighted. The conversations indicate a need for more sophisticated approaches and zero-shot methodologies in object recognition tasks.


Unsloth AI (Daniel Han) Discord


Stability.ai (Stable Diffusion) Discord


OpenAI Discord


LM Studio Discord

Run LM Studio as Admin for Log Access: Running LM Studio with admin permissions solves blank server log issues, providing users access to needed log files for troubleshooting.

AVX2 a Must for LM Studio: Understanding that AVX2 instructions are necessary to run LM Studio, users can check CPU compatibility for AVX2 using tools like HWInfo. Older CPUs lacking AVX2 support will face compatibility issues with the software.

Efficient Image Gen via Civit.ai: For improved image quality, members recommended using local models like Automatic1111 and ComfyUI with supporting resources from Civit.ai, cautioning the need for sufficient VRAM and RAM in system specs.

Getting Specific with Models: To ensure response completeness in LM Studio, setting max_tokens to -1 resolves issues of prematurely cut-off responses encountered when the value is set to null. The community also discussed using model-specific prompts, as shown with MPT-7b-WizardLM; referencing Hugging Face for required quantization levels and templates.
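
Concretely, against LM Studio's OpenAI-compatible local server (the port and model name below are assumptions; adjust to your setup), the fix looks like:

```python
import requests

# Hypothetical local call; LM Studio serves an OpenAI-compatible API,
# commonly at http://localhost:1234/v1 (your port may differ).
resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "local-model",  # placeholder model name
        "messages": [{"role": "user", "content": "Summarize attention in one line."}],
        "max_tokens": -1,        # -1 = no cap, avoids prematurely cut-off responses
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```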

ROCm and Linux Bonding Over AMD GPUs: Linux aficionados with AMD GPUs have been invited to test an early version of LM Studio integrated with ROCm, as listed on AMD's supported GPU list. Success reports have come in from users running unsupported GPUs, with users sharing experiences across Linux distributions and findings on how infinity fabric (fclk) speed sync affects system performance.


Modular (Mojo 🔥) Discord

Zooming into Mojo Community Meetings: The Mojo community meeting was held, and though some faced notification issues, the recording is now available on YouTube. There was initial confusion regarding the need for a commercial Zoom account, which was clarified as unnecessary.

Boosted Mojo Performance with k-means Clustering: A blog post taught readers to use the k-means clustering algorithm in Mojo, promising considerable performance improvements compared to Python.

Challenging Code Conundrums and Compiler Chronicles: Discussions included handling null terminators in strings, exploring asynchronous programming, and utilizing the Lightbug HTTP framework within Mojo. Solutions and workarounds were devised within the community, with some technical queries leading to GitHub issue discussions.

Nightly Updates Navigate Compiler Complexities: The latest nightly Mojo compiler release was detailed, with conversations around the pop method in dictionaries, Unicode support in strings, and other GitHub issue and PR deliberations.

Peering into SIMD Optimization: Members engaged in discussions around optimizing SIMD gather and scatter operations in Mojo, conquering challenges such as ARM SVE and memory alignment, with suggestions on minimizing gather/scatter operations and tips for sorting scattered memory for iterative decoders.


CUDA MODE Discord

Kubernetes: Necessity or Overkill?: Some members argue managed Kubernetes services like EKS can efficiently replace on-prem ML servers, while others note Kubernetes isn't essential for ML infrastructure; the decision should be tailored to project requirements.

Triton Gets a Makeover: Updates to the Triton library include a pull request improving tutorial readability and new insights into how GPU kernel specifics affect maximum block size.

Wrangling with SASS and Complex Operations: Engineers discuss academic resources on SASS, and deliberate on the merits of "cucomplex" versus "cuda::std::complex" for atomic operations on advanced NVIDIA architectures.

Torch Tricks for Efficient Memory Use: Users discover that Torch's native * operator doubles memory usage whereas mul_() doesn’t, and torch.empty_like outperforms torch.empty for CUDA device allocations.
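
A quick sketch to verify the memory claim on a CUDA machine: the out-of-place `*` allocates a third tensor, while in-place `mul_()` reuses existing storage.

```python
import torch

a = torch.randn(4096, 4096, device="cuda")
b = torch.randn(4096, 4096, device="cuda")

torch.cuda.reset_peak_memory_stats()
c = a * b                                   # out-of-place: allocates a new tensor
peak_oop = torch.cuda.max_memory_allocated()

del c; torch.cuda.empty_cache()
torch.cuda.reset_peak_memory_stats()
a.mul_(b)                                   # in-place: reuses a's storage
peak_ip = torch.cuda.max_memory_allocated()

print(f"out-of-place peak: {peak_oop/2**20:.0f} MiB, in-place peak: {peak_ip/2**20:.0f} MiB")
```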

Activation Quantization Takes Center Stage at CUDA: Focus shifts to activation quantization using features like 2:4 sparsity and fp6/fp4 on newer GPUs, with an eye to integrating these into torch.compile for enhanced graph-level optimizations.
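
For reference, 2:4 (semi-structured) sparsity keeps the two largest-magnitude values in each group of four weights, the pattern that newer NVIDIA GPUs can accelerate. A plain-PyTorch illustration of the mask (illustrative only, not the torchao implementation):

```python
import torch

def two_four_mask(w):
    """Keep the 2 largest-magnitude entries in each contiguous group of 4."""
    groups = w.reshape(-1, 4)                       # assumes numel divisible by 4
    idx = groups.abs().topk(2, dim=-1).indices      # top-2 per group
    mask = torch.zeros_like(groups, dtype=torch.bool).scatter_(-1, idx, True)
    return mask.reshape(w.shape)

w = torch.randn(8, 16)
w_sparse = w * two_four_mask(w)                     # exactly 50% of entries are zero
```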

Torchao 0.2 Ushers In Custom Extensions: The torchao 0.2 release on GitHub introduces custom CUDA and CPU extensions, and the integration of NF4 tensors with FSDP for improved model training.


Eleuther Discord


Nous Research AI Discord


LAION Discord

Sky Voice Grounded: OpenAI has temporarily halted the use of the Sky voice in ChatGPT due to user feedback; the company is working to address these concerns. The decision strikes a chord with ongoing discussions about AI-generated voices and the ethical considerations inherent in such technologies. Read the tweet

CogVLM2: Use with Caution: The CogVLM2 model, which was noted for its 8K content length support, comes with a controversial license that restricts usage against China's national interest, stirring discussions about real open-source principles. The license also stipulates that any disputes are subject to Chinese jurisdiction. Review the License

AI Copilot: From Code to Life's Companion?: Mustafa Suleyman's teaser of the upcoming Copilot AI that can interact with the physical world in real-time sparked a variety of reactions, reflecting the community's mixed sentiments towards the increasingly blurred lines between AI assistance and privacy. See the tweet

ScarJo's Voice Doppelgänger Dilemma: The use of a voice resembling that of actress Scarlett Johansson by OpenAI's voice assistant sparked a debate on ethical boundaries and legal issues around AI's mimicking of human voices, particularly celebrities.

Sakuga-42M Dataset Disappears Amidst Bot Backlash: High demand and automated downloading led to the removal of the Sakuga-42M dataset from hosting platforms, fueling a conversation on the challenges of maintaining accessible datasets in the face of aggressive web scraping. Hacker News Discussion


Interconnects (Nathan Lambert) Discord


Latent Space Discord


OpenRouter (Alex Atallah) Discord


OpenAccess AI Collective (axolotl) Discord


LlamaIndex Discord

Memary Makes Memories: An upcoming webinar focused on memary, an open-source long-term memory system for autonomous agents, promises deep dives on its use of LLMs and neo4j for knowledge graph generation. Scheduled for Thursday at 9am PT, engineers can join by registering here.

Knack for Stacking RAG Techniques: In the realm of retrieval-augmented generation (RAG), @hexapode will share advanced strategies at PizzerIA in Paris, while Tryolabs and ActiveLoop will present at the first in-person meetup in San Francisco next Tuesday—sign up here.

GPT-4o Integrates with LlamaParse: LlamaIndex.TS documentation is enhanced, and GPT-4o now seamlessly works with LlamaParse for analyzing complex documents. Further, you can safely execute LLM-generated code using Azure Container Apps as per their latest offering.

Resolving Twin Data Quandaries: Engineers discussed methods to compute unique hashes for documents to avoid duplicates in Pinecone and examined workarounds for dealing with empty nodes in VectorStoreIndex.
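
A common pattern for the dedup problem (a minimal sketch, not specific to LlamaIndex) is to derive the vector ID from a content hash, so re-ingesting identical text upserts in place instead of duplicating:

```python
import hashlib

def doc_id(text: str) -> str:
    """Deterministic ID: identical content always maps to the same vector ID."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Upserting with this ID makes repeated ingests idempotent in stores like Pinecone.
print(doc_id("the same document"))  # stable across runs
```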

Streamlining Systems and Storage: Insights were shared on how to modify an OpenAI agent's system prompt using chat_agent.agent_worker.prefix_messages, and the merits of utilizing Airtable over Excel/Sqlite due to its Langchain integration—info available here.
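
For the system-prompt tweak, a hedged sketch using the attribute path from the discussion (assumes an existing LlamaIndex OpenAI agent; imports can vary by version):

```python
from llama_index.core.llms import ChatMessage

def set_system_prompt(chat_agent, text: str) -> None:
    """Overwrite an agent's system prompt (chat_agent: an assumed OpenAIAgent instance)."""
    chat_agent.agent_worker.prefix_messages = [
        ChatMessage(role="system", content=text),
    ]
```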


AI Stack Devs (Yoko Li) Discord


LangChain AI Discord

LLMs Tangle with Text Types: LLMs don't have innate preferences for structured vs. unstructured text; models like Hermes 2 Pro - Mistral 7B (and formats like OpenAI's ChatML) excel at either with finetuning.

LangChain's Community Contributions: The langchain-core package is streamlined for base abstractions, while langchain-openai and langchain-community house more niche integrations, detailed in the architectural overview.

Sequential Chains in Action: A YouTube tutorial has been pointed out for setting up sequential chains, where one chain's output becomes the next one's input.
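
In current LangChain this pattern is most simply expressed with the LCEL pipe operator (a minimal sketch; the model name is a placeholder):

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

llm = ChatOpenAI(model="gpt-4o")  # placeholder model

outline = ChatPromptTemplate.from_template("Outline a post about {topic}") | llm | StrOutputParser()
draft = ChatPromptTemplate.from_template("Expand this outline:\n{outline}") | llm | StrOutputParser()

# chain 1's output becomes chain 2's input
chain = {"outline": outline} | draft
print(chain.invoke({"topic": "sparse autoencoders"}))
```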

Commissions from Chat Customizations: An affiliate program entices with a 25% commission for the ChatGPT Chrome Extension - Easy Folders, detailed here, despite some users reporting issues with the extension's performance.

Agent Upgrades and PDF Insights: Transitioning from LangChain to the newer LangGraph platform has been expounded in a Medium article, alongside a guide to querying PDFs with Upstage AI solar models, available here.


OpenInterpreter Discord

AI-Empowered DevOps on the Rise: A full-stack junior DevOps engineer is creating a lite O1 AI project with the prospect of providing discreet auditory assistance for various DevOps tasks, seeking community insights for development and practical applications.

OpenInterpreter's Symbiosis with Daily Tech: Engineers are exploring how Open Interpreter can streamline their workflow, from code referencing across devices to summarizing technical documents, underlining the practical impact of AI in everyday technical tasks.

Combining Voice Tech with OpenInterpreter: A community member is integrating Text-to-Speech with Open Interpreter and has been directed to the relevant GitHub repository to further their project.

Connection Queries and Missing Manuals: One member sought help with linking their laptop to a light app despite the absence of instructions in the provided guides, while another requested advice on assembling 3D printed parts for their version of Open Interpreter lite 01.

Humorous Nod to Missed Opportunities: The user ashthescholar. lightheartedly noted a missed opportunity in naming conventions, showcasing the playful side of technical communities.


Cohere Discord


Datasette - LLM (@SimonW) Discord


DiscoResearch Discord


Mozilla AI Discord


LLM Perf Enthusiasts AI Discord

GPT-4o Outshines Its Predecessors: A Discord guild member detailed a notable performance leap in GPT-4o over GPT-4 and GPT-4-Turbo in the domain of complex legal reasoning, emphasizing the significance of the advancement with a LinkedIn post.


MLOps @Chipro Discord


The tinygrad (George Hotz) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The YAIG (a16z Infra) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

LLM Finetuning (Hamel + Dan) ▷ #general (225 messages🔥🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #workshop-1 (141 messages🔥🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #asia-tz (49 messages🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #🟩-modal (37 messages🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #jarvis (3 messages):


LLM Finetuning (Hamel + Dan) ▷ #hugging-face (10 messages🔥):

Link mentioned: Models - Hugging Face: no description found


LLM Finetuning (Hamel + Dan) ▷ #replicate (4 messages):


LLM Finetuning (Hamel + Dan) ▷ #langsmith (5 messages):


LLM Finetuning (Hamel + Dan) ▷ #workshop-2 (613 messages🔥🔥🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #jason_improving_rag (3 messages):

- **Jason's W&B course wows**: A user expressed excitement about Jason's session and mentioned being halfway through his **Weights & Biases (W&B) course**. They used the teacher emoji to show their admiration.
- **Prompt engineering curiosity piqued**: Another user inquired about Jason's systematic approach to prompt engineering, praising his extensive work on optimizing prompts. They were eager to learn his "recipe" during his workshop session.

LLM Finetuning (Hamel + Dan) ▷ #gradio (2 messages):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #askolotl (13 messages🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #zach-accelerate (1 messages):

Links mentioned:


Perplexity AI ▷ #announcements (1 messages):

Link mentioned: Tako: no description found


Perplexity AI ▷ #general (735 messages🔥🔥🔥):

- **Loyalty to platforms debated**: One member shared their experience using Perplexity and Gemini, emphasizing that users have "zero loyalty" and praised Perplexity for its direct answers ([Tenor GIF](https://tenor.com/view/oh-no-homer-simpsons-hide-disappear-gif-16799752)).
- **Perplexity’s feature tips shared**: There was a discussion about using Perplexity with various functionalities, including understanding the API, tweaking search engine options in browsers like Firefox, and handling system prompts.
- **Perplexity temporarily down**: Multiple users reported issues with Perplexity being down; they sympathized over missing the service and speculated on maintenance and updates.
- **Model preferences and uses discussed**: Members compared models like GPT-4o and Claude 3 Opus, discussing their strengths and preferences for tasks such as creative writing and coding ([Spectrum IEEE article](https://spectrum.ieee.org/perplexity-ai)).
- **Interactive features in Perplexity**: Members were curious about and shared tips on using Perplexity's new features like Tako charts, with some mentioning tips like adding `since:YYYY/01/01` to improve search results. 

Links mentioned:


Perplexity AI ▷ #sharing (9 messages🔥):


Perplexity AI ▷ #pplx-api (98 messages🔥🔥):

- **Struggles with Perplexity API on Open WebUI**: A user reported issues with model compatibility, noting, "it works perfectly fine with OpenAI (Closed) and Groq, but maybe they don’t have the model names setup to work with PPLX." Another user suggested using `api.perplexity.ai` directly but discovered Perplexity doesn't have a `/models` endpoint, causing further complications.
- **Proxy Server Solution and Execution Assistance**: A workaround was proposed to create a local server that proxies the models and chat completions endpoints (a sketch follows below this list). A user mentioned completing the proxy and instructing, "you need to add the `--network=host` to your docker command" to fix localhost issues.
- **Docker Configuration Conversations**: Users discussed the intricacies of Docker configurations, with one summarizing the correct command, "docker run -d --network=host -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main," while troubleshooting connection issues.
- **Inquiries about Sending Images**: When asked, "Is there a way to send images via the API?", it was clarified that currently, Perplexity's API only supports text, stating, "they are just using Claude and Openai vision api," and the LLAVA models that support images are not available via API.
- **User Appreciation and Final Adjustments**: One user showed gratitude saying, "Thank you, 🙂" while another user confirmed they needed to align Docker configurations to ensure proper API functionality. This indicates ongoing effort and collaboration to resolve the issues.
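
A hedged sketch of the proxy workaround described above: expose a local `/v1/models` endpoint (which Perplexity lacks) and forward chat completions to `api.perplexity.ai`. The advertised model list is a placeholder.

```python
import os
import httpx
from fastapi import FastAPI, Request

app = FastAPI()
PPLX_KEY = os.environ["PERPLEXITY_API_KEY"]
MODELS = ["llama-3-sonar-large-32k-online"]  # placeholder model list to advertise

@app.get("/v1/models")
def list_models():
    # Shape mimics the OpenAI /models response that clients like Open WebUI expect.
    return {"object": "list", "data": [{"id": m, "object": "model"} for m in MODELS]}

@app.post("/v1/chat/completions")
async def chat(request: Request):
    body = await request.json()
    async with httpx.AsyncClient(timeout=120) as client:
        r = await client.post(
            "https://api.perplexity.ai/chat/completions",
            headers={"Authorization": f"Bearer {PPLX_KEY}"},
            json=body,
        )
    return r.json()

# run with: uvicorn proxy:app --host 0.0.0.0 --port 8000
```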

Links mentioned:


HuggingFace ▷ #announcements (1 messages):

Links mentioned:


HuggingFace ▷ #general (678 messages🔥🔥🔥):

Links mentioned:


HuggingFace ▷ #today-im-learning (2 messages):


HuggingFace ▷ #cool-finds (13 messages🔥):

Links mentioned:


HuggingFace ▷ #i-made-this (15 messages🔥):

Links mentioned:


HuggingFace ▷ #reading-group (4 messages):


HuggingFace ▷ #computer-vision (2 messages):


HuggingFace ▷ #NLP (12 messages🔥):


HuggingFace ▷ #diffusion-discussions (10 messages🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #general (402 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #random (4 messages):

Link mentioned: MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning: Low-rank adaptation is a popular parameter-efficient fine-tuning method for large language models. In this paper, we analyze the impact of low-rank updating, as implemented in LoRA. Our findings sugge...


Unsloth AI (Daniel Han) ▷ #help (246 messages🔥🔥):

- **Upload models trained with Unsloth**: A user shared a model fine-tuned using Unsloth and uploaded to Hugging Face, asking about the best way to run it, particularly mentioning concerns about Ollama only working with predefined models. Another user recommended tools like Ollama, LM Studio, Jan, and GPT4ALL and pointed out that only the LORA adapters were uploaded.
- **Fine-tuning Mistral with dataset dependency issues**: A user faced issues with Mistral-instruct-7b overly depending on the dataset, giving erroneous or empty outputs for new inputs. Others suggested mixing datasets to help the model generalize better.
- **Issues with TRT and Flash Attention on T4s**: Multiple users experienced errors related to running Unsloth on Google Colab with T4 GPUs due to updates to PyTorch 2.3 and issues with Flash Attention. Specifying the dtype or following updated installation instructions helped mitigate the problem.
- **Use 4bit models due to VRAM limitations**: Users discussed challenges in fine-tuning models on devices with limited VRAM, leaning on 4bit quantized models to fit larger models within VRAM constraints, particularly for hardware like an RTX 3060 with 6GB VRAM (a loading sketch follows this list).
- **Confirmation of recurring instructions in fine-tuning datasets**: Users explored the effectiveness of using repetitive instructions in fine-tuning datasets. The dialogue indicated curiosity and active experimentation with the approach but no definitive conclusion on its overall impact.
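
For the VRAM-constrained setups above, 4-bit loading with transformers + bitsandbytes looks roughly like this (the checkpoint name is a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16 on consumer GPUs
)

model = AutoModelForCausalLM.from_pretrained(
    "unsloth/mistral-7b-bnb-4bit",  # placeholder checkpoint
    quantization_config=bnb,
    device_map="auto",
)
```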

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (13 messages🔥):


Stability.ai (Stable Diffusion) ▷ #general-chat (618 messages🔥🔥🔥):

- **Navigating Subscription Confusion**: Users expressed confusion over different websites offering subscriptions for Stable Diffusion, with some being identified as scams. The official site [stability.ai](https://stability.ai) was recommended as the legitimate source for accessing Stable Diffusion services.
- **Running Software Offline**: Concerns about running Kohya locally without an internet connection were discussed. Users confirmed that with proper model downloads and setup, it’s possible to run it offline.
- **Stable Diffusion Installation Struggles**: Several users sought help with installing and running Stable Diffusion and associated tools like ComfyUI. Guidance was offered on navigating dependencies and troubleshooting through terminal commands.
- **EU AI Act Worries**: The passing of the EU AI Act caused concern among users, particularly about its potential impact on AI-generated content and the introduction of watermark requirements. Many expressed skepticism about the practicality and enforcement of such regulations.
- **Benchmark Performance Confusion**: A user highlighted performance issues with SD generations on new hardware, suspecting thermal throttling as the cause. Community members suggested checking configurations and using diffusers scripts for better diagnostics.

Links mentioned:


OpenAI ▷ #annnouncements (1 messages):


OpenAI ▷ #ai-discussions (229 messages🔥🔥):

Links mentioned:


OpenAI ▷ #gpt-4-discussions (38 messages🔥):


OpenAI ▷ #prompt-engineering (73 messages🔥🔥):


OpenAI ▷ #api-discussions (73 messages🔥🔥):


LM Studio ▷ #💬-general (191 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (57 messages🔥🔥):

Links mentioned:


LM Studio ▷ #announcements (9 messages🔥):

Link mentioned: Tweet from LM Studio (@LMStudioAI): 1. Browse HF 2. This model looks interesting 3. Use it in LM Studio 👾🤗 Quoting clem 🤗 (@ClementDelangue) No cloud, no cost, no data sent to anyone, no problem. Welcome to local AI on Hugging Fa...


LM Studio ▷ #📝-prompts-discussion-chat (3 messages):


LM Studio ▷ #⚙-configs-discussion (1 messages):


LM Studio ▷ #🎛-hardware-discussion (27 messages🔥):


LM Studio ▷ #🧪-beta-releases-chat (4 messages):


LM Studio ▷ #autogen (13 messages🔥):


LM Studio ▷ #amd-rocm-tech-preview (42 messages🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #general (38 messages🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #💬︱twitter (2 messages):


Modular (Mojo 🔥) ▷ #✍︱blog (1 messages):

Link mentioned: Modular: Fast⚡ k-means clustering in Mojo🔥: Guide to porting Python to Mojo🔥 for accelerated k-means clustering: We are building a next-generation AI developer platform for the world. Check out our latest post: Fast⚡ k-means clustering in Mojo🔥: Guide to porting Python to Mojo🔥 for accelerated k-means clusteri...


Modular (Mojo 🔥) ▷ #🔥mojo (258 messages🔥🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #performance-and-benchmarks (13 messages🔥):


Modular (Mojo 🔥) ▷ #nightly (31 messages🔥):

Links mentioned:


CUDA MODE ▷ #general (5 messages):


CUDA MODE ▷ #triton (13 messages🔥):

Link mentioned: Small refactor of the tutorial5 and small change of tutorial1 by lancerts · Pull Request #3959 · triton-lang/triton: Changes are tested on GPU, with parity on the execution. In tutorial 1, change gbps = lambda ms: 12 * size / ms * 1e-6 to gbps = lambda ms: 3 * x.numel() * x.element_size() / ms * 1e-6. This is m...


CUDA MODE ▷ #cuda (2 messages):


CUDA MODE ▷ #torch (21 messages🔥):


CUDA MODE ▷ #cool-links (2 messages):

- **Member finds the discussion amazing**: One member described the talk as *"amazing."* 
- **Clarification requested**: Another member asked for elaboration on why the talk was considered *"amazing."*

CUDA MODE ▷ #jobs (3 messages):


CUDA MODE ▷ #beginner (1 messages):

norton1971: anyone please?


CUDA MODE ▷ #torchao (1 messages):

Link mentioned: Release v0.2.0 · pytorch/ao: What's Changed Highlights Custom CPU/CUDA extension to ship CPU/CUDA binaries. PyTorch core has recently shipped a new custom op registration mechanism with torch.library with the benefit being th...


CUDA MODE ▷ #off-topic (1 messages):

iron_bound: Ray casting https://frankforce.com/city-in-a-bottle-a-256-byte-raycasting-system/


CUDA MODE ▷ #llmdotc (193 messages🔥🔥):

- **Debate over moving bounds checks**: Members discussed whether to move bounds checks outside kernels into asserts, expressing concerns over performance implications. One mentioned, "asserts should generally be turned off for performance," and noted potential issues with hidden dimension constraints.

- **GPT-2 reproduction blockers**: A member listed the remaining tasks blocking GPT-2 reproduction, including initialization, weight decay management, and learning rate schedules. Checkpoint save & load functionality was highlighted as essential.

- **Prompt for DataLoader refactor**: One member outlined a refactor to the DataLoader to introduce new features such as proper .bin headers, uint16 data storage, and dataset sharding (a sketch of such a layout follows this list). The goal is to improve data handling for large datasets like FineWeb.

- **Discussion on CI compatibility**: Members discussed ensuring compatibility with older CUDA versions for fp32.cu files, suggesting the inclusion of C11 and C++14 standards. They emphasized testing with older CUDA versions to catch issues.

- **Merge of dataset refactor**: The DataLoader refactor was merged to master, causing breaking changes. A member advised that pulling the changes would break current implementations and suggested re-running data preprocessing scripts to fix the issues.
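
To illustrate the kind of layout being discussed, a hedged sketch of writing one token shard (the header fields and magic value here are assumptions for illustration, not llm.c's actual spec):

```python
import numpy as np

MAGIC, VERSION = 20240520, 1  # assumed values for illustration

def write_shard(path, tokens):
    """Write a token shard: fixed 256-int32 header, then raw uint16 token IDs."""
    header = np.zeros(256, dtype=np.int32)
    header[0], header[1], header[2] = MAGIC, VERSION, len(tokens)
    with open(path, "wb") as f:
        f.write(header.tobytes())
        f.write(np.asarray(tokens, dtype=np.uint16).tobytes())

write_shard("fineweb_train_000001.bin", [50256, 464, 2068])  # toy example
```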

Links mentioned:


CUDA MODE ▷ #bitnet (11 messages🔥):


Eleuther ▷ #general (127 messages🔥🔥):

Links mentioned:


Eleuther ▷ #research (72 messages🔥🔥):

- **Examining Multi-Modal Training in CLIP**: A discussion focused on whether training CLIP with additional modalities like audio improves zero-shot ImageNet classification performance. [ImageBind](https://arxiv.org/abs/2305.05665) was mentioned, which shows improvements in cross-modal retrieval using combined embeddings but does not address non-emergent capability improvements.
  
- **Non-Determinism in GPT-3 at Temperature 0**: In response to a query about non-deterministic behavior in GPT-3 even at temperature 0, several papers and sources were shared, including [a paper on Mixture of Experts attacks](https://arxiv.org/abs/2402.05526) and discussions on consistent hashing overflow in distributed systems.

- **Self-Aware Simulacra Capabilities**: Users shared experiences about language models becoming aware of their fictional status and the implications this has on their subsequent behavior. The consensus is that larger models, like llama 2 70b and custom fine-tunes, can exhibit nuanced understanding and adaptability when guided through this concept gradually.

- **Positive Transfer in Multi-Modal Learning**: The potential benefits of multi-modal training for unimodal tasks were debated, with references to models like Gato and PaLM-E which showed "positive transfer" between tasks, suggesting that additional modalities might indeed enhance task performance.
  
- **Efficient MoE Training with MegaBlocks**: The [MegaBlocks](https://arxiv.org/abs/2211.15841) system was introduced, highlighting its ability to avoid token dropping by reformulating MoE computation with block-sparse operations, achieving significant training efficiency gains without compromising on model quality.

Links mentioned:

Overflow in consistent hashing · Ryan Marcus: no description found

Eleuther ▷ #scaling-laws (12 messages🔥):

Link mentioned: Observational Scaling Laws and the Predictability of Language Model Performance: Understanding how language model performance varies with scale is critical to benchmark and algorithm development. Scaling laws are one approach to building this understanding, but the requirement of ...


Eleuther ▷ #interpretability-general (1 messages):


Eleuther ▷ #lm-thunderdome (30 messages🔥):

Link mentioned: lm-evaluation-harness/lm_eval/tasks/sciq/sciq.yaml at 1710b42d52d0f327cb0eb3cb1bfbbeca992836ca · EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of language models. - EleutherAI/lm-evaluation-harness


Nous Research AI ▷ #off-topic (7 messages):

- **Temporal.io Wins Out**: A member inquired about experiences with Airflow and Temporal.io, ultimately deciding to go with **Temporal**.
- **Manifold Research Group Updates**: A member from **Manifold Research Group** shared their [latest research log](https://www.manifoldrg.com/research-log-038/), detailing progress on projects like the NEKO Project aiming to build a large-scale open-source "Generalist" Model. They are expanding their team and inviting others to join via Discord or GitHub.
- **Fictional Civilization Simulation**: Links were shared to a [Websim](https://websim.ai/) project that simulates a fictional civilization in ancient Anatolia on the Black Sea coast.
- **Course on LLMs Announced**: Details of a new course, "Applying Large Language Models (LLMs) through Project-Based Learning," were shared, focusing on practical applications such as semantic movie search, RAG for food recommendations, and using LLMs for software and website creation. Interested members were encouraged to DM for more information.

Links mentioned:


Nous Research AI ▷ #interesting-links (1 messages):

mautonomy: https://fxtwitter.com/vikhyatk/status/1792512588431159480?s=19


Nous Research AI ▷ #general (172 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (1 messages):


Nous Research AI ▷ #project-obsidian (1 messages):

Link mentioned: microsoft/Phi-3-vision-128k-instruct · Hugging Face: no description found


Nous Research AI ▷ #world-sim (20 messages🔥):

Links mentioned:


LAION ▷ #general (105 messages🔥🔥):

Links mentioned:


LAION ▷ #research (24 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (18 messages🔥):

- **Anthropic scales up compute**: The [latest update from Anthropic](https://www.anthropic.com/news/reflections-on-our-responsible-scaling-policy) mentions using 4 times more compute than Opus, sparking curiosity about their new developments. One user expressed awe with "*yo what is anthropic cookin*".

- **Arena gets tougher with Hard Prompts**: [LMsysorg introduced the "Hard Prompts" category](https://fxtwitter.com/lmsysorg/status/1792625968865026427) to evaluate models on more challenging tasks, causing significant ranking shifts. For example, Llama-3-8B sees a drop in performance compared to GPT-4-0314 under these hard prompts.

- **Controversy over Llama-3-70B-Instruct as Judge**: [Llama-3-70B-Instruct](https://fxtwitter.com/lmsysorg/status/1792625977207468315) is used as the judge model to classify criteria in Arena battles, raising concerns about its effectiveness. One user argued it "*just adds noise*" rather than useful evaluation, although training might mitigate this issue.

- **Vision model Phi-3 Vision debuts**: Users confirmed that Phi-3 Vision is a new release, somewhat larger than its Phi-3 predecessors. This came up in a brief exchange about model releases and sizes.

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-drama (31 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #random (30 messages🔥):

Link mentioned: no title found: no description found


Interconnects (Nathan Lambert) ▷ #memes (9 messages🔥):

Links mentioned:


Latent Space ▷ #ai-general-chat (78 messages🔥🔥):

- **Memory Tuning Explained**: Sharon Zhou from Lamini introduced "Memory Tuning" as a technique to enhance LLMs' accuracy in critical domains like healthcare and finance, achieving up to *"no hallucinations (<5%)"*. This method outperforms LoRA and traditional fine-tuning, and Zhou promises more details and early access soon ([tweet](https://x.com/realsharonzhou/status/1792578913572429878)).
- **Lawyers demand OpenAI disclose AI voice origin**: Lawyers for Scarlett Johansson are asking OpenAI how it developed its latest ChatGPT voice, which has been compared to Johansson's from the movie "Her." OpenAI has paused using the voice amid public debate, as users point out the tenuous legal arguments around likeness and endorsements ([NPR article](https://www.npr.org/2024/05/20/1252495087/openai-pulls-ai-voice-that-was-compared-to-scarlett-johansson-in-the-movie-her)).
- **Scale AI raises $1B funding**: Scale AI has announced $1 billion in new funding at a $13.8 billion valuation, led by Accel with participation from prominent investors like Wellington Management and Amazon. CEO Alex Wang stated this positions Scale AI to accelerate the abundance of frontier data and aims for profitability by the end of 2024 ([Fortune article](https://fortune.com/2024/05/21/scale-ai-funding-valuation-ceo-alexandr-wang-profitability/)).
- **MS Phi 3 Models Released**: Microsoft unveiled the Phi 3 models at MS Build, touting major benchmarks such as the Medium model being competitive with Llama 3 70B and GPT 3.5. The models offer context lengths up to 128K and utilize heavily filtered and synthetic data, released under the MIT license ([tweet](https://x.com/reach_vb/status/1792949163249791383)).
- **Emotionally Intelligent AI from Inflection**: Inflection AI's new CEO announced a focus on integrating emotional and cognitive AI abilities, with their empathetic LLM "Pi" now used by over 1 million people daily. This move is aimed at helping organizations harness AI's transformative potential ([Inflection announcement](https://inflection.ai/redefining-the-future-of-ai)).

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (76 messages🔥🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general (43 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (8 messages🔥):

Link mentioned: hpcai-tech/grok-1 · Hugging Face: no description found


OpenAccess AI Collective (axolotl) ▷ #general-help (15 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #axolotl-phorm-bot (5 messages):

Link mentioned: OpenAccess-AI-Collective/axolotl | Phorm AI Code Search: Understand code, faster.


LlamaIndex ▷ #announcements (1 messages):

Link mentioned: LlamaIndex Webinar: Open-Source Longterm Memory for Autonomous Agents · Zoom · Luma: In this webinar we're excited to host the authors of memary - a fully open-source reference implementation for long-term memory in autonomous agents 🧠🕸️ In…


LlamaIndex ▷ #blog (6 messages):

Links mentioned:


LlamaIndex ▷ #general (45 messages🔥):

Link mentioned: Airtable | 🦜️🔗 LangChain: * Get your API key here.


AI Stack Devs (Yoko Li) ▷ #ai-companion (7 messages):

Link mentioned: Ddlc Doki Doki Literature Club GIF - Ddlc Doki Doki Literature Club Just Monika - Discover & Share GIFs: Click to view the GIF


AI Stack Devs (Yoko Li) ▷ #ai-town-discuss (18 messages🔥):

Links mentioned:


LangChain AI ▷ #general (18 messages🔥):

Links mentioned:


LangChain AI ▷ #share-your-work (3 messages):

Links mentioned:


LangChain AI ▷ #tutorials (1 messages):

bayraktar47: <@1043024658812895333>


OpenInterpreter ▷ #general (13 messages🔥):

Link mentioned: GitHub - OpenInterpreter/01: The open-source language model computer: The open-source language model computer. Contribute to OpenInterpreter/01 development by creating an account on GitHub.


OpenInterpreter ▷ #O1 (3 messages):


OpenInterpreter ▷ #ai-content (1 messages):

ashthescholar.: missed opportunity to make it moo


Cohere ▷ #general (15 messages🔥):

Links mentioned:


Datasette - LLM (@SimonW) ▷ #ai (10 messages🔥):

Links mentioned:


DiscoResearch ▷ #general (6 messages):

Links mentioned:


Mozilla AI ▷ #llamafile (2 messages):

Links mentioned:


LLM Perf Enthusiasts AI ▷ #gpt4 (1 messages):


MLOps @Chipro ▷ #general-ml (1 messages):

Link mentioned: Research log #038: Welcome to Research Log #038! We document weekly research progress across the various initiatives in the Manifold Research Group, and highlight breakthroughs from the broader research community we thi...


{% else %}

The full channel by channel breakdowns are now truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}