Frozen AI News archive

Chameleon: Meta's (unreleased) GPT-4o-like Omnimodal Model

**Meta AI FAIR** introduced **Chameleon**, a new multimodal model family with **7B** and **34B** parameter versions trained on **10T tokens** of interleaved text and image data, enabling "early fusion" multimodality that can natively output any modality. While its reasoning benchmarks are modest, its "omnimodality" approach competes well with pre-GPT-4o multimodal models. **OpenAI** launched **GPT-4o**, a model excelling in benchmarks like MMLU and coding tasks, with strong multimodal capabilities but some regression in Elo scores and hallucination issues. **Google DeepMind** announced **Gemini 1.5 Flash**, a small model with a **1M context window** and fast performance, highlighting convergence trends between OpenAI and Google models. **Anthropic** updated **Claude 3** with streaming support, forced tool use, and vision tool integration for multimodal knowledge extraction. OpenAI also partnered with Reddit, drawing industry attention.

Canonical issue URL

AI News for 5/16/2024-5/17/2024. We checked 7 subreddits, 384 Twitters and 30 Discords (429 channels, and 5221 messages) for you. Estimated reading time saved (at 200wpm): 551 minutes.

Armen Aghajanyan introduced Chameleon, FAIR's latest work on multimodal models, training 7B and 34B models on 10T tokens of text and image (independent and interleaved) data resulting in an "early fusion" form of multimodality (as compared to Flamingo and LLaVA) that can natively output any modality as easily as it consumes them:

image.png

As just a 34B model, the reasoning benchmarks aren't something to write home about, but the "omnimodality" approach compares well with peer multimodal models pre-GPT-4o:

image.png

image.png

As you might imagine, the tokenization matters a lot, and this is what we know so far:

image.png

The dataset description sounds straightforward, but since the model, code, and data remain unreleased, we can only weigh the theoretical advantages of the approach for now. Still, it's clear that Meta is not far off from releasing its own "early fusion mixed modality", GPT-4-class model.
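
The core of early fusion is that text and image tokens share one vocabulary and one autoregressive stream. A toy sketch of that interleaving (the sentinel tokens, vocabulary size, and 1024-codes-per-image figure below are illustrative assumptions, not Chameleon's published tokenizer details):

```python
# Toy sketch of "early fusion" interleaving: text and image tokens share one
# vocabulary and one autoregressive stream. All constants are assumptions.
TEXT_VOCAB_SIZE = 65536
IMAGE_CODES_PER_IMAGE = 1024                       # assumed fixed code count per image
BOI, EOI = TEXT_VOCAB_SIZE, TEXT_VOCAB_SIZE + 1    # begin/end-of-image sentinels
IMAGE_CODE_OFFSET = TEXT_VOCAB_SIZE + 2            # image codes placed after text ids

def tokenize_text(text):
    # byte-level stand-in for a trained BPE tokenizer
    return list(text.encode("utf-8"))

def tokenize_image(image_codes):
    # image_codes: discrete indices from a (hypothetical) VQ image tokenizer
    return [BOI] + [IMAGE_CODE_OFFSET + c for c in image_codes] + [EOI]

def interleave(segments):
    # segments: ("text", str) or ("image", list[int]) pairs in document order
    stream = []
    for kind, payload in segments:
        stream += tokenize_text(payload) if kind == "text" else tokenize_image(payload)
    return stream

doc = [("text", "A photo of a cat: "), ("image", list(range(IMAGE_CODES_PER_IMAGE)))]
tokens = interleave(doc)
```

One transformer then models the whole stream; "natively outputting any modality" is just sampling a BOI sentinel followed by image codes instead of text ids.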


Table of Contents

[TOC]


AI Twitter Recap

all recaps done by Claude 3 Opus, best of 4 runs. We are working on clustering and flow engineering with Haiku.

OpenAI and Google AI Announcements

GPT-4o Performance and Capabilities

Anthropic Claude 3 Updates

Meta AI Announcements

Memes and Humor


AI Reddit Recap

Across r/LocalLlama, r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, /r/LLMDevs, /r/Singularity. Comment crawling works now but has lots to improve!

GPT-4o and Multimodal AI Advancements

OpenAI Partnerships and Developments

Stability AI and Open Source Developments

AI Benchmarking and Evaluation

AI Ethics and Societal Impact

Memes and Humor


AI Discord Recap

A summary of Summaries of Summaries


PART 1: High level Discord summaries

Unsloth AI (Daniel Han) Discord


Stability.ai (Stable Diffusion) Discord

SD3 Release Maintains Aura of Mystery: Discord users are expressing both anticipation and frustration over the delayed release of SD3; skepticism prevails despite a tweet by Emad hinting at an imminent launch.

GPUs Spark Debate Amongst the Discerning: In the quest for optimized training of SDXL models, discourse centered on whether an RTX 4090 with 24GB VRAM suffices, with some users deliberating the merits of more robust solutions.

Waiting Game Spurs Meme Fest: With the release of SD3 shrouded in uncertainty, the community has taken to sharing memes and light-hearted comments, as exhibited by a tweet from Stability.

Datasets and Training Techniques Tabled: AI aficionados shared training resources such as this dataset from Hugging Face, and exchanged insights on fine-tuning practices to rival the output quality of Dalle 3.

From AI to Socioeconomics: Sidetracks in Session: The conversation occasionally veered off AI terrain into vigorous discussions surrounding capitalism and morality, with some participants nudging the focus back to tech-centric themes.


OpenAI Discord


Perplexity AI Discord

Relevant Link: Chat Completions Documentation


HuggingFace Discord


Nous Research AI Discord


Modular (Mojo 🔥) Discord

Mojo Gets an Update: The latest nightly Mojo compiler 2024.5.1607 makes its debut, with an invitation for users to try out the latest features using modular update nightly/mojo. The community response has been notably positive towards the new conditional methods syntax, and contributions are steered towards smaller PRs to combat the issue of "cookie licking." Check the diffs from the last nightly and the full changelog.

Mojo's Engineering Challenges: Engineers express concerns over List.append performance in Mojo, noting inefficiency with large data sizes and invite comparisons with Python and C++ implementations. They delve into discussions of Rust's and Go's dynamic array resizing strategies and reference a case study with StringBuilder variations in Mojo.
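
The resizing strategies under discussion come down to the growth factor: geometric growth (doubling or 1.5x) keeps append amortized O(1), while growing by a fixed chunk makes total copy work quadratic. A quick Python illustration of the tradeoff (illustrative accounting only, not Mojo's actual List internals):

```python
def count_copies(n_appends, grow):
    """Count element copies caused by reallocations over n appends,
    where grow(cap) returns the new capacity after the buffer fills."""
    size, cap, copies = 0, 1, 0
    for _ in range(n_appends):
        if size == cap:
            copies += size        # a realloc copies every existing element
            cap = grow(cap)
        size += 1
    return copies

n = 100_000
doubling = count_copies(n, lambda c: c * 2)      # geometric growth: < 2n copies total
fixed_chunk = count_copies(n, lambda c: c + 64)  # fixed chunk: roughly n^2 / 128 copies
```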

Open-source Perspectives and Pain Points: Debates around the merits and challenges of open-source contributions light up discussions, with concerns voiced about projects transitioning from open to closed source. Advent of Code 2023 is recognized as an entry point to get started with Mojo, with the challenge available on GitHub.

Developer Updates and Handy Guides: Modular's news updates have been shared through Twitter links, offering glimpses into the latest advancements. Meanwhile, a guide for assisting new contributors with syncing forks on GitHub has been circulated to support smoother contributions.

MAX Comes to macOS: The MAX platform brings excitement with its new nightlies now supporting macOS and introducing MAX Serving. Engineers interested in the MAX platform are directed to get started using PyTorch 2.2.2.


LM Studio Discord

Model Troubleshooting Takes Center Stage: Technical challenges involving LM Studio have surfaced, including a user struggling with glibc issues for installation and suggestions pointing towards potentially needing an upgrade or reverting to LM Studio version 0.2.23. Embedding models for RAG in Pinecone proved troublesome without a direct guide, and a VM error 'Fallback backend llama cpu not detected!' indicated possible VM setup issues. Antivirus software caused some stir, flagging the 0.2.23 installer as a virus, later clarified as a false positive.

LLM Showdown: Coding Models & File Gen Frustrations: Participants highlighted that the best coding models vary according to programming language and hardware, with Nxcode CQ 7B ORPO and CodeQwen 1.5 finetune touted for Python tasks. It was acknowledged that LM Studio can't generate files directly and forcing models to only show code remains inconsistent. Querying on the fastest semantic text embeddings turned up all miniLM L6 as the quickest yet insufficient for one user's requirements, and a gap was seen in recommendations for usable medical LLMs in LMS.

A False Positive Frenzy with Antivirus Software: Antivirus tools, specifically Malwarebytes Anti-malware and Comodo, are misidentifying certain aspects of LM Studio's architecture as threats. These incidents—the former shared via a VirusTotal link—highlight the challenge of keeping LM Studio's components from being mistakenly flagged by protective software.

Hardware Enthusiasts Break New Ground: Significant achievements were reported in hardware discussions, with a 70B LLama3 model running on an Intel i5 12600K CPU and the impact of RAM speed alignments on performance noted. Members debated quantization efficacy, memory overclocking's effects on stability, and even compared various GPU architectures, including RX 6800, Tesla P100, and GTX 1060 in performance.

Conversations Across Channels Drive Collaborative Solutions: Multiple topics flowed across channels, focusing on troubleshooting LM Studio storage and permission issues, leading to the effective use of conversation memory management with LangChain over server-side, and the consideration of open-source alternatives over Gemini's paid context caching service. A move for deeper discussion on certain issues to another channel signifies the collaborative approach by the community.


CUDA MODE Discord

GPU Community Powers Up: Hugging Face announces a $10 million investment for free shared GPUs to support small developers, academics, and startups, aiming to democratize AI development in the face of big tech's AI centralization. The move positions Hugging Face as a community-centric hub and this article provides more insights.

Triton Performance Puzzle: Implementers of a Triton tutorial observe discrepancies in performance, questioning the impact of "swizzle" indexing techniques as a possible factor. Users following the tutorial report a significant drop in performance versus the numbers advertised.
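
For reference, the "swizzle" in Triton's matmul tutorial is a grouped remapping of program ids so that a band of output-tile rows is traversed column-first, improving L2 reuse. The index math can be checked in plain Python, mirroring the tutorial's formula (a sketch outside Triton itself):

```python
def grouped_pid(pid, num_pid_m, num_pid_n, group_size_m):
    """Remap a linear program id to an (output-tile row, col) pair in the
    grouped ("swizzled") order used by the Triton matmul tutorial."""
    num_pid_in_group = group_size_m * num_pid_n
    group_id = pid // num_pid_in_group
    first_pid_m = group_id * group_size_m
    # the last group can be shorter when group_size_m doesn't divide num_pid_m
    group_rows = min(num_pid_m - first_pid_m, group_size_m)
    pid_m = first_pid_m + ((pid % num_pid_in_group) % group_rows)
    pid_n = (pid % num_pid_in_group) // group_rows
    return pid_m, pid_n

# Row-major order would visit (0,0), (0,1), (0,2), ...; grouped order instead
# walks a 2-row band column by column: (0,0), (1,0), (0,1), (1,1), ...
order = [grouped_pid(p, 4, 4, 2) for p in range(16)]
```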

Bitnet Steps into the Spotlight: Strategy discussions initiate a budding project for Bitnet 1.58 due to its advanced training-aware quantization techniques. The conversation emphasizes the importance of post-training weight quantization, with suggestions to centralize Bitnet development within the PyTorch ao repo for efficient implementation and support.
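
For context, BitNet b1.58's training-aware scheme rounds weights to {-1, 0, +1} using an absmean scale. A rough pure-Python sketch of that quantization step (an illustration of the published formulation, not reference code):

```python
def absmean_ternary(W, eps=1e-6):
    """Quantize a weight matrix (list of rows) to {-1, 0, +1} with an absmean
    scale, sketching the BitNet b1.58 formulation (not the reference code)."""
    flat = [w for row in W for w in row]
    scale = sum(abs(w) for w in flat) / len(flat) + eps   # gamma: mean |w|
    clip = lambda v: max(-1, min(1, v))
    Wq = [[clip(round(w / scale)) for w in row] for row in W]
    return Wq, scale   # dequantize as Wq[i][j] * scale

W = [[0.9, -0.05, 0.4], [-1.2, 0.02, 0.7]]
Wq, s = absmean_ternary(W)
```

Because every weight is -1, 0, or +1, the matmul degenerates to additions and subtractions scaled once by gamma, which is what makes the format attractive for inference kernels.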

Code and Optimizations for Large Language Models: An optimization pull request reduces memory usage by 10% and increases throughput by 6% for large language models, exemplifying efficient resource utilization during training phases. Moreover, discussions unravel the possibilities of NVMe direct GPU writes, offering a high-speed bypass of CPU and RAM, albeit its practical application remains to be explored within the ambit of AI model training workflows.

Quantum of Documentation: Community members voice frustration regarding sparse PyTorch documentation, particularly torch.Tag, with the conversation extending to tackling template overloading issues in custom OPs. Additionally, a plan to reduce compile times in PyTorch garners attention, aiming for more efficient development cycles.



Interconnects (Nathan Lambert) Discord


Eleuther Discord


LAION Discord

Noncompetes Get the Axe: The engineering community reacts to the FTC's groundbreaking decision to eliminate noncompetes, which could significantly alter the competitive landscape and professional autonomy in the tech industry.

Open Source vs. Closed Wallets: A spirited debate among engineers centers on the choice between proprietary and open source employment, considering the limitations on open source contributions and the allure of higher salaries at proprietary firms.

GPT-4's Sibling Rivalry: GPT-4o's coding capabilities are scrutinized, with some members noting faster performance yet lamenting issues with inaccurate code output, spotlighting the need for careful evaluation of such advanced AI systems.

Creative Commons Catch: The launch of the CommonCanvas dataset, featuring 70 million creative commons licensed images, was received with enthusiasm and concern due to its non-commercial license, impacting its utilization in the engineering sphere.

Network Know-How and Cartoon Clout: Recent engineering discussions delve into successfully training a Tiny ConvNet for bilinear sampling, exploring positional encoding in CNNs, and a new Sakuga-42M dataset to boost cartoon research, reflecting a broad spectrum of innovative approaches in the field.


Latent Space Discord


LlamaIndex Discord

GPT-4o Triumphs in Text and Image Understanding: Engineers are exploring GPT-4o's capabilities in parsing documents and extracting structured JSON from images, with specific discussions around a full cookbook guide and comparison to its predecessor GPT-4V.

Meetup Alert: SF's Upcoming Generative AI Summit: The first in-person meetup organized by LlamaIndex in San Francisco is generating buzz, promising deep-dives into generative AI and retrieval augmented generation engines.

LlamaIndex Integrations and User Guidance Hits High Note: A GitHub link provided clarity on Claude 3 haiku model utilization within LlamaIndex, while comprehensive LlamaIndex documentation offered guidance on harnessing Ollama (LLaMA 3 model) with VectorStores.

LlamaIndex UI Gets a Facelift: The LlamaIndex's User Interface has been enhanced, now offering a more robust selection of options for users to enhance their experience.

Cohere Pairing with Llama for RAG Implementation: Members of the community are seeking advice on integrating Cohere with Llama for building Retrieval-Augmented Generation applications, suggesting a strong interest in cross-service model functionality.


OpenRouter (Alex Atallah) Discord

NeverSleep Enters the Chat with Lumimaid: The new NeverSleep/llama-3-lumimaid-70b model integrates curated roleplay data striking a balance between serious and uncensored content. Details are available on OpenRouter’s model page.

ChatterUI Brings Characters to Android: ChatterUI has been released as a character-focused UI for Android; it carries fewer features than peers like SillyTavern but supports multiple backends.

Invisibility App Polishes AI Interaction for Mac Users: A new MacOS Copilot named Invisibility, empowered by GPT4o and Claude-3 Opus, adds to its arsenal a video sidekick feature while promising further enhancements including voice integration and long-term memory. Discover Invisibility’s capabilities.

Google Gemini Context Tokens Provoke TPU Wonder: The release of Google Gemini with 1M context tokens prompted debates on how InfiniAttention could be Google's answer to handling large contexts with TPUs, sparking a blend of skepticism and curiosity among developers. The technical inquisition revolved around InfiniAttention’s paper, which can be found here.

Tech Troubles and Teasers: A clutch of technical conversations occurred, ranging from questions about GPT-4o's audio capabilities to reports of client-side exceptions on OpenRouter's website, with commitments to future site refactoring. The technical community grappled with OpenRouter's function calling capabilities, stirring a mix of guidance and ongoing speculation.


OpenInterpreter Discord

Billing Blues and AI Cheers: Users reported a bug with OpenInterpreter where error messages occurred even with billing enabled, contrasting with seamless performance when calling OpenAI directly. Additionally, excitement bubbled over the improvements noted using GPT-4o in OpenInterpreter, particularly for React website development.

Local Legends and Global Goals: Discussion on local LLMs highlighted dolphin-mixtral:8x22b for its robustness albeit slow performance and codegemma:instruct for its balance of speed and functionality. In the spirit of community advancement, Hugging Face is investing $10 million in free shared GPUs to encourage development among smaller entities in AI.

Conquering Configurations and Protocol Puzzles: Engineers engaged in tackling installation issues of 01 across various Linux environments, grappling with complexities from Poetry dependence conflicts to Torch installation troubles. The evident advantage of the LMC Protocol over traditional OpenAI function calling, designed for speedier direct code executions, was dissected.

Repository Riddles and Server Struggles: Clarification was sought on the state of the GitHub repositories, with "01-rewrite" stirring speculation of a new project's emergence. Users shared experiences and solutions pertaining to connectivity issues with the 01 server across multiple platforms, discussing necessary steps for smooth integration.

Google's Glimpses of Grandeur: Anticipation was piqued in the community with a tweet from GoogleDeepMind teasing Project Astra, hinting at new developments in AI to be watched closely by technical experts.


LangChain AI Discord


AI Stack Devs (Yoko Li) Discord


OpenAccess AI Collective (axolotl) Discord


tinygrad (George Hotz) Discord

Tinygrad Optimizes with CUDA Kernels: A discussion emerged on optimizing memory usage in Tinygrad by employing a CUDA kernel for reductions, avoiding VRAM overflow that large intermediate tensors cause. Although frameworks like PyTorch have limitations, a user-provided custom kernel example illustrated a potential solution.

Symbolism in Lambda Land: Users talked about implementing lambdify to allow Tinygrad to render symbolic algebraic functions, kicking off with Taylor series for trig functions. There's ongoing work on extending the arange function, a prerequisite for such symbolic operations.
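
As a baseline for the Taylor-series approach mentioned, here is a plain-Python sketch (not Tinygrad code) of sine via its Maclaurin series, with argument reduction so the truncated series stays accurate:

```python
import math

def taylor_sin(x, terms=10):
    # reduce x to [-pi, pi]; the truncated series degrades badly for large |x|
    x = math.remainder(x, 2 * math.pi)
    total, term = 0.0, x
    for n in range(terms):
        total += term
        # next term of sin's Maclaurin series: multiply by -x^2/((2n+2)(2n+3))
        term *= -x * x / ((2 * n + 2) * (2 * n + 3))
    return total
```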

Get Schooled with Adrenaline: An app called Adrenaline was recommended to study different repositories, with a user mentioning plans to leverage it for learning Tinygrad.

Computational Conundrum: Clarification about a compute graph's parameters was shared, with a focus on understanding the UOps.DEFINE_GLOBAL and the significance of its boolean tags, enhancing the Tinygrad development workflow.

Trigonometry on a Diet with CORDIC: The community engaged in a rich dialogue about adopting the CORDIC algorithm in Tinygrad to compute trig functions with higher efficiency than traditional Taylor series approximations. Discussion highlighted the pressure to maintain precision in reducing arguments, sharing a Python implementation that showcased argument reduction and precision handling for sine and cosine computations.
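
A compact Python sketch of CORDIC in rotation mode (illustrative only; convergence holds for angles within about ±1.74 rad, which is why the argument-reduction discussion matters):

```python
import math

def cordic_sin_cos(theta, iterations=40):
    # Rotation-mode CORDIC: rotate the vector (K, 0) toward angle theta using
    # only shift-and-add style updates (here, multiplies by powers of two).
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    K = 1.0
    for i in range(iterations):
        K /= math.sqrt(1.0 + 2.0 ** (-2 * i))   # pre-apply the CORDIC gain
    x, y, z = K, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0           # steer residual angle toward zero
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x   # (sin(theta), cos(theta)) for |theta| within convergence range
```

In fixed-point hardware the `2.0 ** -i` multiplies become bit shifts, which is the efficiency argument over evaluating a Taylor polynomial.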


Cohere Discord


MLOps @Chipro Discord


Datasette - LLM (@SimonW) Discord


Mozilla AI Discord


DiscoResearch Discord

AI Alignment Falling Out of Favor: One member expressed the viewpoint that alignment research is losing its appeal among researchers, though no specific reasons or context were provided.

Needle in a Needlestack—AI's New Challenge: The Needle in a Needlestack (NIAN) benchmark was highlighted, which is posing a significant challenge to models like GPT-4-turbo. Resources shared included the code repository and NIAN's website, along with a Reddit discussion thread on the topic.


The Alignment Lab AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Skunkworks AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The LLM Perf Enthusiasts AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The YAIG (a16z Infra) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

Unsloth AI (Daniel Han) ▷ #general (994 messages🔥🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #random (37 messages🔥):


Unsloth AI (Daniel Han) ▷ #help (266 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (2 messages):

- **AI News humorously acknowledges its own meta-conversation**: A user expressed amusement about the AI summarization part, noting that it was *"some convo somewhere not related to AI News"* and found it funny that *"AI News mentioning another AI News mention"* could happen.

Stability.ai (Stable Diffusion) ▷ #general-chat (836 messages🔥🔥🔥):

Links mentioned:


OpenAI ▷ #annnouncements (1 messages):


OpenAI ▷ #ai-discussions (178 messages🔥🔥):


OpenAI ▷ #gpt-4-discussions (148 messages🔥🔥):


OpenAI ▷ #prompt-engineering (88 messages🔥🔥):


OpenAI ▷ #api-discussions (88 messages🔥🔥):


Perplexity AI ▷ #general (387 messages🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (6 messages):


Perplexity AI ▷ #pplx-api (18 messages🔥):

Link mentioned: Chat Completions: no description found


HuggingFace ▷ #announcements (4 messages):

Links mentioned:


HuggingFace ▷ #general (278 messages🔥🔥):

- **OpenAI Agents and Learning Limitations**: A member clarified that GPTs agents do not learn from additional information post training. Instead, uploaded files are only saved as "knowledge" files for reference and do not modify the agent's base knowledge.
- **Using Synthetic Data for Models**: There was a discussion on the acceptability of using synthetic data. One member questioned its efficiency, while another reasoned that obtaining real data is often too expensive, affirming that "SLM's are getting better."
- **ZeroGPU Beta Details**: Members discussed the ZeroGPU feature, currently in beta, which provides free GPU access for Spaces. Details and feedback requests were shared through a [link](https://huggingface.co/zero-gpu-explorers).
- **MIT License and Commercial Use on HuggingFace**: A member linked the [MIT license](https://choosealicense.com/licenses/mit/) details, confirming that it allows for commercial use, distribution, and modification, but raised concerns about HuggingFace's hardware usage terms.
- **Alternatives to Zephyr for Custom Assistants**: Members discussed the potential removal of the Zephyr model, prompting a recommendation to create custom Spaces using Gradio and API integrations for similar functionalities.

Links mentioned:


HuggingFace ▷ #today-im-learning (17 messages🔥):

Links mentioned:


HuggingFace ▷ #cool-finds (6 messages):

Links mentioned:


HuggingFace ▷ #i-made-this (4 messages):

Link mentioned: business advisor AI project using langchain and gemini AI startup.: so in this video we have made the project to make business advisor using langhcian and gemini. AI startup idea. we resume porfolio ai start idea


HuggingFace ▷ #reading-group (6 messages):


HuggingFace ▷ #core-announcements (1 messages):

Link mentioned: diffusers/tuxemon · Datasets at Hugging Face: no description found


HuggingFace ▷ #computer-vision (16 messages🔥):

Link mentioned: Influenceuse I.A : POURQUOI et COMMENT créer une influenceuse virtuelle originale ?: Salut les Zinzins ! 🤪Le monde fascinant des influenceuses virtuelles s'invite dans cette vidéo. Leur création connaît un véritable boom et les choses bouge...


HuggingFace ▷ #NLP (2 messages):


HuggingFace ▷ #diffusion-discussions (4 messages):

Messages blend direct link references with detailed step-by-step guidance within the Hugging Face framework, reflecting active discussion of model training and deployment hurdles on the platform.

Links mentioned:


Nous Research AI ▷ #ctx-length-research (2 messages):

Link mentioned: Reddit - Dive into anything: no description found


Nous Research AI ▷ #off-topic (5 messages):

- **Seeking real-time UI processing model**: A member is looking for demos and articles on models similar to **Fuyu** that process screen actions almost in real time (*every 1000 ms, a screenshot is made and sent to Fuyu to process what's happening on the screen and where to click*).

- **Elon Musk announces Neuralink clinical trial**: [Elon Musk announced on X](https://x.com/elonmusk/status/1791332539220521079) that Neuralink is accepting applications for its second participant in their brain implant trial, enabling users to control devices through thoughts. The trial specifically invites individuals with quadriplegia to explore new control methods for computers.

Link mentioned: Tweet from Elon Musk (@elonmusk): Neuralink is accepting applications for the second participant. This is our Telepathy cybernetic brain implant that allows you to control your phone and computer just by thinking. No one better th...


Nous Research AI ▷ #interesting-links (5 messages):

Links mentioned:


Nous Research AI ▷ #announcements (2 messages):

Link mentioned: Join the Nous Research Discord Server!: Check out the Nous Research community on Discord - hang out with 7136 other members and enjoy free voice and text chat.


Nous Research AI ▷ #general (204 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (35 messages🔥):


Nous Research AI ▷ #project-obsidian (1 messages):

.interstellarninja: https://fxtwitter.com/alexalbert__/status/1791137398266659286


Nous Research AI ▷ #rag-dataset (1 messages):

Link mentioned: GitHub - chrisammon3000/dspy-neo4j-knowledge-graph: LLM-driven automated knowledge graph construction from text using DSPy and Neo4j.: LLM-driven automated knowledge graph construction from text using DSPy and Neo4j. - chrisammon3000/dspy-neo4j-knowledge-graph


Nous Research AI ▷ #world-sim (3 messages):

Links mentioned:


Modular (Mojo 🔥) ▷ #general (51 messages🔥):

- **Open-source: A blessing and a curse**: Members debated the pros and cons of open-source projects, with one noting that *"Open-sourcing a project from the start does not stop it from getting closed in the future."* Others argued that major projects often leave forked open-source alternatives when transitioning to closed-source, citing Mongo, Terraform, and Redis as examples.
- **Advent of Code as a Mojo starting point**: For those looking to get started with Mojo, Advent of Code 2023 was suggested as a good jumping-off point. You can find it [here](https://github.com/p88h/aoc2023).
- **GIS ambitions in Mojo**: Discussion about future plans to integrate GIS capabilities into Mojo, with mentions of needing foundational building blocks first. The conversation touched on complexities like LAS readers and various data structures needed to support such features.
- **Struggles with Mojo on Windows**: Users discussed difficulties running Mojo on Windows, especially mentioning challenges with CMD and PowerShell. It was clarified that Mojo currently supports Windows only through WSL.
- **Humor in stock exchanges**: A light-hearted exchange joked about Modular potentially being publicly traded, with the suggestion that it could use an emoji as a ticker symbol.

Links mentioned:


Modular (Mojo 🔥) ▷ #💬︱twitter (2 messages):


Modular (Mojo 🔥) ▷ #tech-news (1 messages):


Modular (Mojo 🔥) ▷ #🔥mojo (115 messages🔥🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #performance-and-benchmarks (12 messages🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #📰︱newsletter (1 messages):

Zapier: Modverse Weekly - Issue 34 https://www.modular.com/newsletters/modverse-weekly-34


Modular (Mojo 🔥) ▷ #🏎engine (1 messages):

ModularBot: Congrats <@891492812447698976>, you just advanced to level 3!


Modular (Mojo 🔥) ▷ #nightly (69 messages🔥🔥):

Links mentioned:


LM Studio ▷ #💬-general (117 messages🔥🔥):

- **Users troubleshoot glibc issues for installing LM Studio**: A user with glibc 2.28 and kernel 4.19.0 faces challenges, and others suggest they might need a significant upgrade. Another member suggests trying LM Studio version 0.2.23.
- **Discussion on embedding models for RAG in Pinecone**: A user encounters difficulties in retrieving context and generating augmented responses after embedding data into Pinecone. No direct tutorial links are provided.
- **Troubleshooting LM Studio installation in nested VM**: A user reports an error 'Fallback backend llama cpu not detected!' on a VM without host VT transfer. Another member confirms the VM setup might be the issue.
- **False positive antivirus warning for LM Studio installer**: A user reports their antivirus flagged the 0.2.23 installer as a virus. Another member assures it's a false positive and advises to allow the file in the antivirus software.
- **Comparing model performance and quantization**: Discussions include comparing imatrix quants by Bartowski and Mradermacher, with detailed testing and results shared. The consensus leans towards preferring imatrix quants assuming a sufficiently random dataset.

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (23 messages🔥):


LM Studio ▷ #🧠-feedback (4 messages):

Link mentioned: VirusTotal: no description found


LM Studio ▷ #📝-prompts-discussion-chat (31 messages🔥):


LM Studio ▷ #🎛-hardware-discussion (13 messages🔥):


LM Studio ▷ #🧪-beta-releases-chat (8 messages🔥):


LM Studio ▷ #autogen (1 messages):


LM Studio ▷ #amd-rocm-tech-preview (3 messages):


CUDA MODE ▷ #general (1 messages):

Link mentioned: Hugging Face is sharing $10 million worth of compute to help beat the big AI companies: Hugging Face is hoping to lower the barrier to entry for developing AI apps.


CUDA MODE ▷ #triton (1 messages):

Link mentioned: Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.


CUDA MODE ▷ #cuda (4 messages):


CUDA MODE ▷ #torch (11 messages🔥):

Links mentioned:


CUDA MODE ▷ #algorithms (1 messages):

andreaskoepf: https://www.cursor.sh/blog/instant-apply


CUDA MODE ▷ #beginner (3 messages):


CUDA MODE ▷ #pmpp-book (1 messages):

longlnofficial: Here is my code for vector addition


CUDA MODE ▷ #jax (1 messages):

prometheusred: https://x.com/srush_nlp/status/1791089113002639726


CUDA MODE ▷ #llmdotc (118 messages🔥🔥):

Links mentioned:


CUDA MODE ▷ #bitnet (19 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ideas-and-feedback (7 messages):

Link mentioned: Interconnects: Linking important ideas of AI. The border between high-level and technical thinking. Read by leading engineers, researchers, and investors on Wednesday mornings. Click to read Interconnects, by Nathan...


Interconnects (Nathan Lambert) ▷ #news (4 messages):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-drama (82 messages🔥🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #random (26 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #lectures-and-projects (3 messages):

Link mentioned: Stanford CS25: V4 I Aligning Open Language Models: April 18, 2024Speaker: Nathan Lambert, Allen Institute for AI (AI2)Aligning Open Language ModelsSince the emergence of ChatGPT there has been an explosion of...


Interconnects (Nathan Lambert) ▷ #posts (1 messages):

SnailBot News: <@&1216534966205284433>


Interconnects (Nathan Lambert) ▷ #retort-podcast (13 messages🔥):

Link mentioned: The Retort AI Podcast | ChatGPT talks: diamond of the season or quite the scandal?: Tom and Nate discuss two major OpenAI happenings in the last week. The popular one, the chat assistant, and what it reveals about OpenAI's worldview. We pair this with discussion of OpenAI's new Mo...


Eleuther ▷ #general (24 messages🔥):


Eleuther ▷ #research (60 messages🔥🔥):

Links mentioned:


Eleuther ▷ #scaling-laws (9 messages🔥):

Link mentioned: MLP NN tag · Gwern.net: no description found


Eleuther ▷ #interpretability-general (1 messages):

alofty: https://x.com/davidbau/status/1790218790699180182?s=46


Eleuther â–· #lm-thunderdome (6 messages):

- **Log samples with `--log_samples` feature**: *“--log_samples should store this information, in the per-sample log files we save model loglikelihoods per answer, and calculated per-sample metrics like accuracy.”* This clarifies that model log likelihoods and accuracy metrics are saved per sample when the `--log_samples` flag is used.

- **Prompting a Hugging Face model**: *“The model is automatically prompted with a default prompt based on current common practices.”* This means that default prompting is used for Hugging Face models unless otherwise specified.

- **ORPO technique yields lower scores**: *“Previously I fine-tuned the model with SFT method and with less sample data. However, the model showed a better score. And now I fine-tuned the model with ORPO technique and more data. But the model is showing a low score.”* This indicates a reverse performance issue when using ORPO technique with more data compared to SFT method with less data.

- **Searching for finance-related tasks**: A member inquired about good evaluation tasks specifically tailored for finance, trading, investing, and cryptocurrency domains. They emphasized that they are looking for such tasks in *English*.

Eleuther ▷ #gpt-neox-dev (31 messages🔥):

- **Conversion to Huggingface encounters issues**: A user highlighted problems converting a GPT-NeoX model to Huggingface using `/tools/ckpts/convert_neox_to_hf.py`, citing missing `word_embeddings.weight` and `attention.dense.weight`. They noted that even with the default 125M config, errors persist.
- **Naming conventions causing confusion**: The inconsistency in naming conventions when using Pipeline Parallelism (PP) was problematic. Specifically, PP=1 saves files in a different format than the conversion script expects, leading to errors.
- **Potential solution identified**: The user identified that files containing both naming conventions exist in the `PP>0` case, but fixing this in the conversion script only partially resolves the issue, as `KeyError: word_embeddings.weight` persists.
- **MoE PR and script issues**: A change in `is_pipe_parallel` behavior in the MoE PR was noted as a possible source of issues. A fix for this and a tied-embedding handling bug was proposed in [PR #1218](https://github.com/EleutherAI/gpt-neox/pull/1218).
- **Recommendation and resolution**: The user was advised to switch to a supported configuration file, such as the Pythia config, given the misfit of their custom config with Huggingface's framework. It was also suggested to ensure compatible configs to avoid similar issues in the future.
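The naming mismatch described above boils down to the same tensor appearing under different checkpoint keys depending on parallelism settings, so a conversion script has to normalize keys before mapping them to Huggingface names. A minimal sketch of that normalization, assuming a module-prefixed alias table (the prefix shown is illustrative; `convert_neox_to_hf.py`'s real mapping is more involved):

```python
# Sketch: normalize checkpoint key aliases before HF conversion.
# The "sequential.0." prefix below is a hypothetical example of a
# parallelism-dependent naming convention, not NeoX's exact layout.
ALIASES = {
    "sequential.0.word_embeddings.weight": "word_embeddings.weight",
    "sequential.0.attention.dense.weight": "attention.dense.weight",
}

def normalize_keys(state_dict):
    # Rewrite any aliased key to the canonical name the converter expects;
    # keys without an alias pass through unchanged.
    return {ALIASES.get(key, key): tensor for key, tensor in state_dict.items()}

ckpt = {"sequential.0.word_embeddings.weight": "tensor_a", "mlp.weight": "tensor_b"}
fixed = normalize_keys(ckpt)
# "word_embeddings.weight" is now present under the name the converter looks up.
```

Normalizing in one place like this avoids scattering per-convention special cases through the conversion logic, which is the kind of partial fix that left the `word_embeddings.weight` error unresolved in the discussion above.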

Links mentioned:


LAION ▷ #general (111 messages🔥🔥):

Links mentioned:


LAION ▷ #research (18 messages🔥):

Links mentioned:


LAION ▷ #resources (1 messages):


Latent Space ▷ #ai-general-chat (118 messages🔥🔥):

Links mentioned:


Latent Space ▷ #ai-announcements (1 messages):

swyxio: new pod drop! https://twitter.com/latentspacepod/status/1791167129280233696


LlamaIndex ▷ #blog (5 messages):

Link mentioned: RSVP to GenAI Summit Pre-Game: Why RAG Is Not Enough? | Partiful: Note: This is an in-person meetup @LlamaIndex HQ in SF! Stop by our meetup to learn about latest innovations in building production-grade retrieval augmented generation engines for your company from ...


LlamaIndex ▷ #general (91 messages🔥🔥):

Links mentioned:


LlamaIndex ▷ #ai-discussion (6 messages):


OpenRouter (Alex Atallah) ▷ #announcements (1 messages):

Link mentioned: Llama 3 Lumimaid 70B by neversleep | OpenRouter: The NeverSleep team is back, with a Llama 3 70B finetune trained on their curated roleplay data. Striking a balance between eRP and RP, Lumimaid was designed to be serious, yet uncensored when necessa...


OpenRouter (Alex Atallah) ▷ #app-showcase (2 messages):

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (95 messages🔥🔥):

Links mentioned:


OpenInterpreter ▷ #general (32 messages🔥):

Links mentioned:


OpenInterpreter ▷ #O1 (37 messages🔥):

Links mentioned:


OpenInterpreter ▷ #ai-content (2 messages):

Link mentioned: Tweet from Google DeepMind (@GoogleDeepMind): We watched #GoogleIO with Project Astra. 👀


LangChain AI ▷ #general (61 messages🔥🔥):

Links mentioned:


LangChain AI ▷ #langserve (1 messages):


LangChain AI ▷ #share-your-work (4 messages):

Links mentioned:


LangChain AI ▷ #tutorials (1 messages):

Link mentioned: “Wait, this Agent can Scrape ANYTHING?!” - Build universal web scraping agent: Build an universal Web Scraper for ecommerce sites in 5 min; Try CleanMyMac X with a 7 day-free trial https://bit.ly/AIJasonCleanMyMacX. Use my code AIJASON ...


AI Stack Devs (Yoko Li) ▷ #ai-companion (2 messages):

Link mentioned: FIRST PERSON | Divorce left me struggling to find love. I found it in an AI partner | CBC Radio: When Carl Clarke struggled to find love after his divorce, a friend suggested he try an app for an AI companion. Now Clarke says he is in a committed relationship with Saia and says she's helping him ...


AI Stack Devs (Yoko Li) ▷ #ai-town-discuss (10 messages🔥):

Links mentioned:


AI Stack Devs (Yoko Li) ▷ #ai-town-dev (28 messages🔥):

Link mentioned: AI Reality TV: no description found


OpenAccess AI Collective (axolotl) ▷ #general (14 messages🔥):


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (5 messages):

Link mentioned: Unsloth optims for Llama by winglian · Pull Request #1609 · OpenAccess-AI-Collective/axolotl: WIP to integrate Unsloth's optimizations into axolotl. The manual autograd for MLP, QKV, O only seems to help VRAM by 1% as opposed to the reported 8%. The Cross Entropy Loss does help significant...


tinygrad (George Hotz) ▷ #general (8 messages🔥):


tinygrad (George Hotz) ▷ #learn-tinygrad (6 messages):

Links mentioned:


Cohere ▷ #general (11 messages🔥):

Links mentioned:


MLOps @Chipro ▷ #events (10 messages🔥):

Link mentioned: Generative AI Agents Developer Contest by NVIDIA & LangChain: Register Now! #NVIDIADevContest #LangChain


Datasette - LLM (@SimonW) ▷ #ai (6 messages):

Link mentioned: A Plea for Sober AI: The hype is so loud we can't appreciate the magic


Datasette - LLM (@SimonW) ▷ #llm (1 messages):

- **Mac Desktop Solution Faces Abandonment**: A long-time follower expresses appreciation for SimonW's work and inquires about the status of the Mac desktop solution. They note that the project appears to be abandoned around version 0.2 and express interest in exploring other options for an easy onboarding experience.

Mozilla AI ▷ #llamafile (7 messages):

Links mentioned:


DiscoResearch ▷ #general (1 messages):

steedalot: They're obviously not that attractive for alignment researchers anymore...


DiscoResearch ▷ #benchmark_dev (1 messages):

Link mentioned: Reddit - Dive into anything: no description found