Frozen AI News archive

Skyfall

Between 5/17 and 5/20/2024, key AI updates include **Google DeepMind's Gemini 1.5 Pro and Flash models**: Pro features a sparse multimodal MoE architecture with up to **10M context**, while Flash is a dense Transformer decoder that is **3x faster and 10x cheaper**. **Yi AI released Yi-1.5 models** with extended context windows of **32K and 16K tokens**. Other notable releases include **Kosmos 2.5 (Microsoft), PaliGemma (Google), Falcon 2, DeepSeek v2 lite, and the HunyuanDiT diffusion model**. Research highlights feature an **Observational Scaling Laws paper** predicting model performance across families, a **Layer-Condensed KV Cache** technique boosting inference throughput by **up to 26×**, and the **SUPRA method** converting LLMs into RNNs for reduced compute costs. Hugging Face expanded local AI capabilities, enabling on-device AI without cloud dependency. LangChain updated its v0.2 release with improved documentation. The community also welcomed a new LLM Finetuning Discord by Hamel Husain and Dan Becker for Maven course users. Per CEO Clement Delangue, *"Hugging Face is profitable, or close to profitable,"* enabling $10 million in free shared GPUs for developers.
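The Layer-Condensed KV Cache result is easier to appreciate with a back-of-the-envelope cache-size calculation. A minimal sketch, using illustrative (hypothetical) 7B-class model dimensions rather than figures from the paper:

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Rough size of a transformer KV cache: 2 tensors (K and V) per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Illustrative config: 32 layers, 32 KV heads, head_dim 128, fp16, 8K context.
full = kv_cache_bytes(32, 32, 128, 8192)      # ~4 GiB
# Keeping KVs for only a couple of layers (the intuition behind layer
# condensation) shrinks the cache roughly in proportion to the layer count.
condensed = kv_cache_bytes(2, 32, 128, 8192)  # ~0.25 GiB
```

Since decoding throughput is often memory-bandwidth bound, cutting the cache this way is where the headline speedups come from.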

Canonical issue URL

AI News for 5/17/2024-5/20/2024. We checked 7 subreddits, 384 Twitters and 29 Discords (366 channels, and 9564 messages) for you. Estimated reading time saved (at 200wpm): 1116 minutes.

While it was a relatively lively weekend, most of the debate was nontechnical in nature, and no single announcement was an obvious candidate for the top-story slot.

So have a list of minor notes in its place:

But who are we kidding, you probably want to read Scarlett's Apple Notes takedown of OpenAI (:



Table of Contents

[TOC]


AI Twitter Recap

all recaps done by Claude 3 Opus, best of 4 runs. We are working on clustering and flow engineering with Haiku.

AI Model Releases and Updates

Research Papers and Techniques

Frameworks, Tools and Platforms

Discussions and Perspectives

Memes and Humor


AI Reddit Recap

Across r/LocalLlama, r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, r/LLMDevs, r/Singularity. Comment crawling works now but has lots to improve!

AI Advancements and Capabilities

AI Safety and Alignment

AI Impact on Jobs and Economy

AI Models and Frameworks

AI Ethics and Societal Impact


AI Discord Recap

A summary of Summaries of Summaries

  1. LLM Fine-Tuning Advancements and Challenges:

    • Unsloth AI enables effective fine-tuning of models like Llama-3-70B Instruct using optimized techniques, but legal concerns around voice and IP likeness were discussed, e.g. Scarlett Johansson's legal dispute with OpenAI.
    • The LLM Fine-Tuning course sparked debates on quality, with some finding the initial content basic while others appreciated the hands-on approach to training, evaluation, and prompt engineering.
    • Discussions on LoRA fine-tuning highlighted optimal configurations, dropout, weight decay, and learning rates to prevent overfitting, especially on GPUs like the 3090, as shared in this tweet.
  2. Multimodal and Generative AI Innovations:

    • Hugging Face pledged $10 million in free GPUs to support small developers, academics, and startups in creating new AI technologies.
    • The Chameleon model from Meta showcased state-of-the-art performance in understanding and generating images and text simultaneously, surpassing larger models like Llama-2.
    • GPT-4o integration with LlamaParse enabled multimodal capabilities, while concerns were raised about spam-polluted Chinese tokens in its tokenizer.
    • Innovative projects like 4Wall AI and AI Reality TV explored AI-driven entertainment platforms with user-generated content and social simulations.
  3. Open-Source Datasets and Model Development:

    • Frustrations mounted over the restrictive non-commercial license of the CommonCanvas dataset, which limits modifications and derivatives.
    • Efforts focused on creating high-quality open-source datasets, such as avoiding hallucinated captions that can degrade vision-language models (VLMs) and text-to-image (T2I) models.
    • The Sakuga-42M dataset introduced the first large-scale cartoon animation dataset, filling a gap in cartoon-specific training data.
    • Concerns were raised over the CogVLM2 license restricting use against China's interests and mandating Chinese jurisdiction for disputes.
  4. AI Safety, Ethics, and Talent Acquisition:

    • Key researchers like Jan Leike resigned as head of alignment at OpenAI, citing disagreements over the company's priorities, sparking discussions on OpenAI's controversial employment practices.
    • OpenAI paused the use of the Sky voice in ChatGPT following concerns about its resemblance to Scarlett Johansson's voice.
    • Neural Magic sought CUDA/Triton engineers to contribute to open-source efforts, focusing on activation quantization, sparsity, and optimizing kernels for MoE and sampling.
    • Discussions on the need for better AI safety benchmarks, with suggestions for "a modern LAMBADA for up to 2M" to evaluate models processing overlapping chunks independently (source).
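The LoRA fine-tuning discussion above (item 1) boils down to one update rule: freeze the pretrained weight W and train a low-rank correction scaled by alpha/r. A minimal numpy sketch with hypothetical dimensions, not tied to any specific library's API:

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, r, alpha = 64, 64, 8, 16            # hypothetical dims; r << d keeps it cheap
W = rng.standard_normal((d, k))           # frozen pretrained weight
A = rng.standard_normal((r, k)) * 0.01    # trainable low-rank factor
B = np.zeros((d, r))                      # B starts at zero, so W' == W initially

def lora_forward(x):
    # W' = W + (alpha / r) * B @ A, applied without ever materializing W'
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.standard_normal((4, k))
# Before any training steps, the adapter is a no-op:
assert np.allclose(lora_forward(x), x @ W.T)
```

The hyperparameters debated in the thread (dropout, weight decay, learning rate) then apply only to A and B, whose r*(d+k) parameters are a small fraction of the full d*k matrix, which is why this fits on a single 3090.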

PART 1: High level Discord summaries

Unsloth AI (Daniel Han) Discord


HuggingFace Discord

New Dataset Invites AI Experiments: A Tuxemon dataset has been presented as an alternative to Pokemon datasets, offering cc-by-sa-3.0 licensed images for greater experimentation freedom. It provides images with two caption types for diverse descriptions in experiments.

Progress in Generative AI Learning Resources: Community suggestions included "Attention is All You Need" and the HuggingFace learning portal for those seeking knowledge on Generative AI and LLMs. Discussion of papers such as GROVE and the Conan benchmark for narrative understanding indicates an active interest in advancing collective understanding.

AI Influencers Crafted by Vision and AI: A tutorial video was highlighted, showing how to craft virtual AI influencers using computer vision and AI, reflecting a keen interest in the intersection of technology and social media phenomena.

Tokenizer Set to Reduce Llama Model Size: A newly developed tokenizer, Tokun, promises to shrink Llama models 10-fold while enhancing performance. This novel approach is revealed on GitHub and discussed on Twitter.

Clarifying LLMs Configuration for Task-Specific Queries: AI engineers focused on configuring Large Language Models for HTML generation and maintaining conversation history in chatbots. The community suggested manual intervention, like appending previous messages to the new prompt, to address these nuanced challenges.
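The "append previous messages to the new prompt" suggestion can be sketched in a few lines. This is a hypothetical plain-text format for illustration; chat APIs typically take role/content message lists instead:

```python
def build_prompt(history, user_msg, max_turns=5):
    """Append recent turns to the new prompt so a stateless LLM keeps context.

    history: list of (role, text) tuples; only the last max_turns are kept
    to stay inside the model's context window.
    """
    recent = history[-max_turns:]
    lines = [f"{role}: {text}" for role, text in recent]
    lines.append(f"user: {user_msg}")
    lines.append("assistant:")
    return "\n".join(lines)

history = [("user", "Generate an HTML table."), ("assistant", "<table>...</table>")]
prompt = build_prompt(history, "Now add a header row.")
```

Each new exchange gets appended to `history`, so the model sees the running conversation even though every API call is independent.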


Perplexity AI Discord

Frustration with Perplexity's GPT-4o Performance: Engineers noted that GPT-4o's tendency to repeat responses and ignore prompt changes is a step back in conversational AI, with one comparing it unfavorably to previous LLMs and expressing disappointment in its interaction abilities.

Calling All Script Kiddies for Better Model Switching: Users are actively sharing and utilizing custom scripts to enable dynamic model switching on Perplexity, notably with tools like Violentmonkey, which acts as a patch for these service limitations.

API Quirks and Quotas: Confusion exists around Perplexity's API rate limits—differentiating between request and token limits—and its implications for engineers' workflows. Meanwhile, discussions surfaced about API performance testing with a preference for the Omni model and clarifications sought for the threads feature to support conversational contexts.
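The request-vs-token distinction matters because a client can exhaust either budget independently. A minimal client-side sketch of tracking both in a sliding window; the limits here are hypothetical, not Perplexity's actual quotas:

```python
import time
from collections import deque

class DualRateLimiter:
    """Track both a requests-per-window and a tokens-per-window budget."""

    def __init__(self, max_requests=20, max_tokens=16000, window=60.0):
        self.max_requests, self.max_tokens, self.window = max_requests, max_tokens, window
        self.events = deque()  # (timestamp, tokens) per accepted request

    def _prune(self, now):
        # Drop events that have aged out of the sliding window.
        while self.events and now - self.events[0][0] > self.window:
            self.events.popleft()

    def allow(self, tokens, now=None):
        now = time.monotonic() if now is None else now
        self._prune(now)
        used = sum(t for _, t in self.events)
        if len(self.events) >= self.max_requests or used + tokens > self.max_tokens:
            return False  # whichever budget runs out first blocks the call
        self.events.append((now, tokens))
        return True
```

A small request can still be rejected on the request cap, and a single large request on the token cap, which is exactly the confusion the channel was untangling.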

A Quest for Upgraded API Access: Users continue to press for improved API access, expressing a need for higher rate limits and faster support responses, indicative of growing demands on machine learning infrastructure.

Engineers Explore AI Beyond Chat: Links shared amongst users indicate interests widening to Stability AI's potential, mental boosts from physical exercise, exoplanetary details with WASP-193b, and generating engaging content for children through AI-assisted Dungeons & Dragons scenario crafting.


OpenAI Discord

Voices Silenced: OpenAI has paused the use of the Sky voice in ChatGPT, with a statement and explanation provided to address user concerns.

Language Models Break Free: Engineers report success running LangChain without the OpenAI API, describing integrations with local tools such as Ollama.

GPT-4o Access Rolls Out But With Frictions: Differences between GPT-4 and GPT-4o are evident, with the latter showing limitations in token context windows and caps on usage affecting practical applications. Enhanced, multimodal capabilities of GPT-4o have been recognized, and pricing alongside a file upload FAQ were shared to provide additional usage clarity.

Prompt Crafting Challenges and Innovations: In the engineering quarters, there's a mix of challenges in prompt refining for self-awareness and technical integration, yet innovative prompt strategies are being shared to elevate creative and structured generation. JSON mode is suggested as a viable tool for improving command precision; OpenAI's documentation stands as a go-to reference.

API Pains and Gains: Inconsistencies with chat.completion.create are reported among API users, with incomplete response issues and a demonstrated preference for JSON mode to control format and content. Despite hiccups, there’s a vivid discussion on orchestrating creativity, with someone proposing "Orchestrating Innovation on the Fringes of Chaos" as an explorative approach.
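The JSON-mode suggestion maps to a single request parameter in OpenAI's chat completions API: `response_format={"type": "json_object"}`, with the caveat from the docs that the word "JSON" must appear in the messages. A sketch that builds the request kwargs without making a live call (the sample reply below is illustrative, not real API output):

```python
import json

def json_mode_request(user_prompt, model="gpt-4o"):
    """Build kwargs for client.chat.completions.create(...) with JSON mode on."""
    return {
        "model": model,
        "response_format": {"type": "json_object"},
        "messages": [
            {"role": "system", "content": "Reply only with a JSON object."},
            {"role": "user", "content": user_prompt},
        ],
    }

req = json_mode_request('Return {"colors": [...]} listing two colors as JSON.')

# JSON mode guarantees syntactically valid output, so downstream code can
# parse it directly instead of regex-scraping a prose reply:
sample_reply = '{"colors": ["red", "blue"]}'  # illustrative stand-in
parsed = json.loads(sample_reply)
```

This is what makes it attractive for the truncation complaints above: even a cut-off run fails loudly at `json.loads` rather than silently producing half an answer.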


LM Studio Discord


Stability.ai (Stable Diffusion) Discord

Crafting the Perfect Prompt for LoRAs: Engineers have shared a prompt structure to leverage multiple LoRAs in Stable Diffusion, but observed diminishing returns or issues with more than three layers, implying potential optimization avenues.
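The shared structure uses the AUTOMATIC1111-style `<lora:name:weight>` tags appended to the prompt. A tiny sketch (names and weights hypothetical) that also caps the count, echoing the observation that stacking more than three LoRAs degrades output:

```python
def compose_prompt(base, loras, max_loras=3):
    """Append <lora:name:weight> tags (A1111 webui syntax) to a prompt."""
    tags = [f"<lora:{name}:{weight:g}>" for name, weight in loras[:max_loras]]
    return " ".join([base] + tags)

prompt = compose_prompt(
    "a watercolor fox, detailed",
    [("watercolor_style", 0.8), ("fox_concept", 0.6),
     ("detail_boost", 0.4), ("extra", 0.5)],  # fourth LoRA is dropped
)
```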

First-Time Jitters with Stable Diffusion: A 'NoneType' object attribute error is causing a hiccup for a new Stable Diffusion user on the initial run, sparking a call for troubleshooting expertise without a clear resolution.

SD3's Arrival Sparks Anticipation and Doubt: There's a split in sentiment regarding the release of SD3, with a mixture of skepticism and optimism backed by Emad Mostaque's tweet, indicating that work is under way.

Topaz Tussle: The effectiveness of Topaz as a video upscaling solution prompted debate. Engineers acknowledged its strength but contrasted with the appeal of ComfyUI, highlighting considerations like cost and functionality.

Handling the Heft of SDXL: A user underlined the importance of sufficient VRAM when wrangling with SDXL models' demands for higher resolutions, and it was clarified that SDXL and SD1.5 require distinct ControlNet models.


Modular (Mojo 🔥) Discord

Mojo on Windows Still a WIP: Despite active interest, Mojo doesn't natively support Windows and currently requires WSL; users have faced issues with CMD and PowerShell, but Windows support is on the horizon.

Bend vs. Mojo: A Performance Perspective: Discussions highlighted Chris Lattner's insights on Bend's performance, noting that while it’s behind CPython on a single core, Mojo is designed for high-performance scenarios. The communities around both languages are anticipating enhanced features and upcoming community meetings.

Llama's Pythonic Cousin: The community noted an implementation of Llama3 from scratch, available on GitHub, described as building "one matrix multiplication at a time", a fascinating foray into the nitty-gritty of language internals.

Diving Deep into Mojo's Internals: Various discussions included insights into making nightly the default branch to avoid DCO failures, potential list capacity optimization in Mojo, SIMD optimization debates, a suggestion for a new list method similar to Rust’s Vec::shrink_to_fit(), and tackling alias issues that lead to segfaults. Key points brought up included community contributions for list initializations which could lead to performance improvement, and patches affecting performance positively.

Inside the Mind of an Engineer: Technical resolution of PR DCO check failures was discussed with procedural insights provided; flaky tests provoked discussions about fixes and CI pain points; and segfaults in custom array types prompted peer debugging sessions. The community showed appreciation for sharing intricate details that help unravel optimization mysteries.


LLM Finetuning (Hamel + Dan) Discord

Eager learners and burgeoning experts alike remain invested in the transformational tide of fine-tuning, extraction, applications, and other facets of LLMs, suggesting a period filled with intellectual synergy and the relentless pursuit of practical AI engineering prowess.


Nous Research AI Discord


CUDA MODE Discord

Hugging Face Pumps $10M into the AI Community: Hugging Face commits $10 million to provide free shared GPU resources for startups and academics as part of its effort to democratize AI development. CEO Clement Delangue announced this following a substantial funding round, as covered by The Verge.

A New Programming Player, Bend: A new high-level programming language called Bend enters the scene, sparking questions about its edge over existing GPU languages like Triton and Mojo. Despite Mojo's limitations on GPUs and Triton's machine learning focus, Bend's benefits are enunciated on GitHub.

Optimizing Machine Learning Inference: Experts exchange advice on building efficient inference servers, recommending resources like Nvidia Triton and TorchServe for model serving. Contributions highlighted included applying optimizations when using torch.compile() for static shapes and referencing code improvements on GitHub for better group normalization support in NHWC format, detailed in this pull request.

CUDA Complexities - Addition and Memory: Engaging debates unraveled around atomic operations for cuda::complex and the threshold limitations for 128-bit atomicCAS. The community shared code workarounds and accepted methodologies for complex number handling and discussed potential memory overheads during in-place multiplication in Torch.

Scaling and Optimizing the CUDA Challenge: The community dissected issues with gradient clipping, the potential in memory optimization templating, and ZeRO-2 implementation. They shared multiple GitHub discussions and pull requests (#427, #429, #435), indicating a dedicated focus on performance and fine-tuning CUDA applications.
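Of the topics above, gradient clipping is the most self-contained: scale every gradient tensor so their joint L2 norm stays under a cap. A minimal numpy sketch of the global-norm scheme (the same idea as `torch.nn.utils.clip_grad_norm_`, not the llm.c CUDA implementation itself):

```python
import numpy as np

def clip_grad_norm(grads, max_norm, eps=1e-6):
    """Global-norm gradient clipping across a list of gradient tensors."""
    total = np.sqrt(sum(float(np.sum(g * g)) for g in grads))
    scale = min(1.0, max_norm / (total + eps))  # no-op when already under the cap
    return [g * scale for g in grads], total

grads = [np.array([3.0, 0.0]), np.array([0.0, 4.0])]  # joint L2 norm = 5
clipped, norm = clip_grad_norm(grads, max_norm=1.0)
```

Clipping on the *global* norm (rather than per-tensor) preserves the direction of the overall update, which is why it is the standard choice for LLM training.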

Tackling ParPaRaw Parser Performance: Inquiries arose regarding benchmarks of libcudf against CPU parallel operations, hinting at the community's enthusiasm for efficient parsing and making note of performance gains in GPUs over CPUs. Attention was given to the merger of Dask-cuDF into cuDF and the subsequent archiving of the former, as seen on GitHub.

Zoom into GPU Query Engines: An upcoming talk promises insights into building a GPU-native query engine from a cuDF veteran at Voltron, illuminating strategies from kernel design to production deployments. Details for tuning in are available through this Zoom meeting.

CUDA Architect Dives into GPU Essentials: A link was shared to a YouTube talk by CUDA Architect Stephen Jones, offering clarity on GPU programming and efficient memory use strategies essential for modern AI engineering tasks. Dive into the GPU workings through the link here.

Seeking Talent for CUDA/Triton Innovations at Neural Magic: Neural Magic is on the lookout for enthusiastic engineers to work on CUDA/Triton projects with a spotlight on activation quantization. They're especially interested in capitalizing on next-gen GPU features such as 2:4 sparsity and further refining kernels in MoE and sampling.

Unpacking PyTorch & CUDA Interactions: A detailed brainstorm ensued over efficient data type packing/unpacking in PyTorch with CUDA, with a spotlight on uint2, uint4, and uint8 types. Project management and collaborative programming featured heavily in the discussion, with a nod to GitHub Premier #135 for custom CUDA extension management.

Barrier Synchronization Simplified: A community member helps others grasp the concept of barrier synchronization by comparing it to ensuring all students are back on the bus post a museum visit, a relatable analogy that underpins synchronized processes in GPU operations.
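The bus analogy translates directly to code: no thread proceeds past the barrier until all of them arrive. A minimal sketch using Python's `threading.Barrier` (the same semantics as `__syncthreads()` within a CUDA block, shown here on CPU threads for runnability):

```python
import threading

N = 4
barrier = threading.Barrier(N)  # the "everyone back on the bus" checkpoint
log = []
lock = threading.Lock()

def worker(i):
    with lock:
        log.append(("visit", i))   # phase 1: each student wanders the museum
    barrier.wait()                 # the bus waits until all N have arrived
    with lock:
        log.append(("depart", i))  # phase 2: everyone departs together

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The barrier guarantees every "visit" is logged before any "depart".
assert all(kind == "visit" for kind, _ in log[:N])
```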

Democratizing Bitnet Protocols: There's a joint effort to host bitnet group meetings and review important tech documentation, with quantization discussions focused on transforming uint4 to uint8 types. Shared resources are guiding these meetings, as mentioned in the collaboration drive.


Eleuther Discord


Interconnects (Nathan Lambert) Discord

Model Melee with Meta, DeepMind, and Anthropic: Meta's Chameleon model boasts 34B parameters and outperforms Flamingo and IDEFICS with superior human evaluations. DeepMind's Flash-8B offers multimodal capabilities and efficiency, while their Gemini 1.5 models excel in benchmarks. Meanwhile, Anthropic scales up with four times the compute of their last model, and LMsys's "Hard Prompts" category brings new challenges for AI evaluations.

AI-Safety Team Breakup Causes Stir: OpenAI's superalignment team, including Ilya Sutskever and Jan Leike, has disbanded amidst disagreements and criticisms of OpenAI's policies. The dismissal and departure agreements at OpenAI drew particular ire due to controversial lifetime nondisparagement clauses.

Podcast Ponderings and Gaming Glory: The Retort AI podcast analyzed OpenAI's moves, sparked debates over vocab-size scaling laws, and referenced hysteresis in control theory with a hint of humor. Call of Duty gaming roots and ambitions for academic content creation on YouTube were shared with nostalgia.

Caution with ORPO: Skepticism rose about the ORPO method's scalability and effectiveness, with community members sharing test results suggesting a potential for over-regularization. Concerns about the method were amplified by its addition to the Hugging Face library.

Challenging Chinatalk and Learning from Llama3: A thumbs-up for the Chinatalk episode, the value of llama3-from-scratch as a learning resource, and a clever Notion blog explaining Latent Consistency Models provided informative suggestions for self-development. However, a warning about the legal risks of the Books4 dataset spiced up the dialogue.


Latent Space Discord


LlamaIndex Discord


LAION Discord


AI Stack Devs (Yoko Li) Discord

4Wall Beta Unveiled: 4Wall, an AI-driven entertainment platform, has entered beta, offering seamless AI Town integration and user-generated content tools for creating maps and games. They're also working on 3D AI characters, as showcased in their announcement.

Game Jam Champions: The Rosebud / #WeekOfAI Education Game Jam has announced winners, including "Pathfinder: Terra’s Fate" and "Ferment!", highlighting AI's potential in educational gaming. The games are accessible here, and more details can be found in the announcement tweet.

AI Town's Windows Milestone: AI Town now runs natively on Windows, as celebrated in a Tweet, and sparked discussions on innovative implementations, including conversation-dump methods using tools like GitHub - Townplayer. Additionally, users are exploring creative scenarios in AI Town using in-depth world context integration.

Launch of AI Reality TV: The launch of an interactive AI Reality TV platform has caught the community's attention, inviting users to simulate social interactions with AI characters, as echoed in this announcement.

Troubleshooting & Technical Tips Abound: AI engineers exchanged solutions to AI Town setup issues, with advice on resolving agent communication problems and extracting data from SQLite databases. Recommendations included checking the memory system documentation and adjusting settings within AI Town.


OpenRouter (Alex Atallah) Discord


OpenAccess AI Collective (axolotl) Discord


LangChain AI Discord

Memory Matters for Model Magic: Re-ranking with cross-encoders behind a proxy is discussed, with a focus on OpenAI GPTs and Gemini models. There's an interest in short-term memory solutions, like a buffer for chatbots to maintain context in conversations.

LangChain Gets a Nudge: Queries about guiding model responses in LangChain led to sharing a PromptTemplate solution, with a reference to a GitHub issue on the topic. Meanwhile, LangChain for Swift developers is available with resources for working on iOS and macOS platforms, as seen in a GitHub repository for LangChain Swift.

SQL Holds the Key: The application of LangChain with SQL data opens the door to summarizing concepts across datasets. The conversation veers toward ways to integrate SQL databases as a memory solution, with a guide found in LangChain's documentation.

Langmem’s Long-term Memory Mastery: Langmem's context management capabilities are commended. A YouTube demonstration shows how Langmem effectively switches contexts and maintains long-term memory during conversations, highlighting its utility for complex dialogue tasks (Langmem demonstration).

Fishy Links Flood the Feed: Multiple channels report a spread of questionable $50 Steam gift links (suspicious link), warning members to proceed with caution and suggesting the link is likely deceptive.

Rubik's Cube of AI: Rubik's AI promises enhanced research assistance, offering two months of free access to premium features with the promo code RUBIX.

Playing with RAG-Fusion: There’s a tutorial on RAG-Fusion, highlighting its use in AI chatbots for document handling and emphasizing its multi-query capabilities over RAG's single-query limitation. The tutorial offers engineers insights into using LangChain and GPT-4o, available at LangChain + RAG Fusion + GPT-4o Project.
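The multi-query advantage the tutorial highlights typically comes from merging the per-query result lists with reciprocal rank fusion (RRF). A self-contained sketch of that merge step, with hypothetical doc IDs standing in for retriever output:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked doc-id lists (one per generated sub-query) into one.

    Each doc scores sum(1 / (k + rank)) over the lists it appears in;
    k=60 is the constant from the original RRF paper.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Three hypothetical sub-query result lists over the same corpus:
fused = reciprocal_rank_fusion([
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a"],
    ["doc_b", "doc_d"],
])
```

A document that ranks decently across several rephrasings (doc_b here) beats one that tops a single list, which is what lets RAG-Fusion recover documents a single-query RAG pass would miss.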


Cohere Discord


OpenInterpreter Discord

Hugging Face GPU Bonanza: Hugging Face is donating $10 million in free shared GPU resources to small developers, academics, and startups, leveraging their financial standing and recent investments as outlined in a The Verge article.

OpenInterpreter Tackles Pi 5 and DevOps: OpenInterpreter has been successfully deployed on a Pi 5 using Ubuntu, and a collaboration involving project integration was discussed including potential support with Azure credits. Additionally, a junior full-stack DevOps engineer is seeking community aid to develop a "lite 01" AI assistant module.

Technical Tips and Tricks Abound: Solutions for environment setup issues with OpenInterpreter on different platforms have been shared, with particular discussion focused on WSL, virtual environments, and IDE usage. Further assistance was provided via a GitHub repository for Flutter integration and requests for development help on a device dubbed O1 Lite.

Voice AI's Robo Twang: Community discussions critique voice AI for its lack of naturalness compared to GPT-4's textual capabilities, while an idea for voice assistants' ability to interrupt was highlighted in a YouTube video.

Event and Community Engagement: Notices went out inviting the community to the first Accessibility Round Table and a live stream focused on local development, fostering engagement and knowledge-sharing in live settings.


Mozilla AI Discord


MLOps @Chipro Discord

Fine-Tuning Frenzy Fires Up: Engineers are expressing mixed feelings regarding the LLM Fine-Tuning course, with some finding value in its hands-on approach to LLM training, evaluation, and prompt engineering, while others remain skeptical, citing concerns over the quality amidst promotional tactics.

Mixture of Mastery and Mystery in Course Content: Course participants noted variable experiences, with a few describing the introductory material as basic but dependent on the individual's background; this illustrates the challenge of calibrating content difficulty for diverse expertise levels.

Predictions Wrapped in Intervals: The MAPIE documentation surfaced as a key resource for those looking to implement prediction intervals, and insights were offered on conformal predictions with a nod to Nixtla, suitable for time-series data.
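The recipe MAPIE packages is simple enough to sketch directly: calibrate on held-out residuals, then wrap every test prediction in a symmetric band. A minimal split-conformal sketch in numpy (synthetic data; not MAPIE's actual API):

```python
import numpy as np

def split_conformal_interval(cal_y, cal_pred, test_pred, alpha=0.1):
    """Split conformal prediction intervals with ~(1 - alpha) coverage.

    cal_y / cal_pred: true values and model predictions on a held-out
    calibration set; test_pred: predictions to wrap in intervals.
    """
    residuals = np.abs(cal_y - cal_pred)
    n = len(residuals)
    # Finite-sample-corrected quantile level, as in the conformal literature.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(residuals, level)
    return test_pred - q, test_pred + q

rng = np.random.default_rng(0)
cal_pred = rng.normal(size=500)
cal_y = cal_pred + rng.normal(scale=0.5, size=500)  # synthetic noisy targets
lo, hi = split_conformal_interval(cal_y, cal_pred, test_pred=np.array([0.0, 2.0]))
```

The appeal is that the coverage guarantee is distribution-free, needing only exchangeability, which is also why the time-series case (Nixtla's territory) requires extra care.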

Embeddings Evolve from Inpainting: Comparable to masked language modeling, deriving image embeddings through inpainting techniques was a topic of interest, highlighting a method that estimates unseen image aspects from visible data.

Multi-lingual Entities Enter Evaluation Phase: Strategies for comparing entities across languages, like "University of California" and "Universidad de California," were discussed, possibly incorporating contrastive learning and language-specific prefixes, with an arXiv paper mentioned for further reading.


tinygrad (George Hotz) Discord


Datasette - LLM (@SimonW) Discord


LLM Perf Enthusiasts AI Discord

Legal Eagles Eye GPT-4o: AI Engineers have noted that GPT-4o demonstrates notable advances in complex legal reasoning compared to its predecessors like GPT-4 and GPT-4-Turbo. The improvements and methodologies were shared in a LinkedIn article by Evan Harris.


YAIG (a16z Infra) Discord


The DiscoResearch Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

Unsloth AI (Daniel Han) ▷ #general (718 messages🔥🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #random (55 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #help (454 messages🔥🔥🔥):

- **Error with torch.float16 for llama3**: Users tried to train llama3 with **torch.float16** but encountered errors suggesting to use bfloat16 instead. They sought solutions but found none that worked.
- **Databricks issues with torch and CUDA**: **Torch** caused errors when running on A100 80GB in **Databricks**. Users discussed potential fixes like **setting the torch parameter to False** or updating software versions, but faced challenges.
- **Uploading and using GGUF models**: **Users faced challenges uploading and running models on Hugging Face without config files**. Solutions involved pulling config files from pretrained models or ensuring the correct format and updates.
- **Eager anticipation for multi-GPU support**: **Community members expressed eagerness for multi-GPU support** from Unsloth, which is in development but not yet available.
- **Troubleshooting environment setup**: Participants had **difficulty setting up environments with both WSL and native Windows** for Unsloth, specifically with installing dependencies like **Triton**.
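The float16-vs-bfloat16 error in the first bullet comes down to exponent range, which a quick numpy demo makes concrete (numpy has no native bfloat16; frameworks like PyTorch expose `torch.bfloat16`):

```python
import numpy as np

# float16 has a 5-bit exponent: the largest finite value is 65504, so
# activations or gradients in the tens of thousands overflow to inf,
# which is why training llama3 in torch.float16 errors out.
assert float(np.finfo(np.float16).max) == 65504.0

with np.errstate(over="ignore"):
    overflowed = np.float16(70000.0)
assert np.isinf(overflowed)

# bfloat16 keeps float32's 8-bit exponent (range up to ~3.4e38) at the cost
# of mantissa precision, which is why the error suggests switching to it.
```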

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (22 messages🔥):


HuggingFace ▷ #general (853 messages🔥🔥🔥):

- **Issue with GPTs Agents on MPS Devices**: A member noted that **GPTs agents** can only load bfloat16 models with MPS devices, as bitsandbytes isn't supported on M1 chips. They expressed frustration with MPS being fast but "running in the wrong direction".
- **Member seeks MLflow deployment help**: Someone asked for assistance in deploying custom models via **MLflow**, specifically for a fine-tuned cross encoder model. They did not receive a direct response from other members.
- **Interest in HuggingChat's limitations**: A user inquired why **HuggingChat** doesn't support files and images. No comprehensive answer was provided.
- **Clarifying technical script adjustments**: Multiple users engaged in debugging and modifying a script for sending requests to a vllm endpoint using **aiohttp** and **asyncio**. Key changes and adaptations were discussed, particularly for integrating with OpenAI's API.
- **Concerns about service and model preferences**: An extensive discussion ensued regarding the benefits and downsides of Hugging Face's **Pro accounts**, spaces creation, and the limitations versus preferences for running models like **Llama**. One member expressed dissatisfaction with needing workarounds for explicit content and limitations on tokens in HuggingChat. Another user sought advice on deployment vs. local computation for InstructBLIP.

Links mentioned:


HuggingFace ▷ #today-im-learning (11 messages🔥):

Links mentioned:


HuggingFace ▷ #cool-finds (18 messages🔥):

Links mentioned:


HuggingFace ▷ #i-made-this (20 messages🔥):

Links mentioned:


HuggingFace ▷ #reading-group (109 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #core-announcements (1 messages):

Link mentioned: diffusers/tuxemon · Datasets at Hugging Face: no description found


HuggingFace ▷ #computer-vision (25 messages🔥):

Links mentioned:


HuggingFace ▷ #NLP (13 messages🔥):


HuggingFace ▷ #diffusion-discussions (22 messages🔥):

Links mentioned:


Perplexity AI ▷ #general (939 messages🔥🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (12 messages🔥):


Perplexity AI ▷ #pplx-api (19 messages🔥):

Link mentioned: Chat Completions: no description found


OpenAI ▷ #annnouncements (1 messages):


OpenAI ▷ #ai-discussions (347 messages🔥🔥):

Pricing and file upload FAQ links were shared for additional details.


OpenAI ▷ #gpt-4-discussions (167 messages🔥🔥):


OpenAI ▷ #prompt-engineering (178 messages🔥🔥):


OpenAI ▷ #api-discussions (178 messages🔥🔥):

<ul>
  <li><strong>ChatGPT struggles to refine prompts effectively:</strong> Users shared frustrations with <strong>4o's</strong> inability to follow up on prompt corrections or effectively revise rough drafts. One member noted, "it re-writes its original response instead of telling me how to fix my prompt."</li>
  
  <li><strong>Frustrations with incomplete responses:</strong> Users like cicada.exe report experiencing incomplete responses from <code>chat.completion.create</code> despite not exceeding token limits. The issue persists with outputs being abruptly cut off.</li>
  
  <li><strong>Implementing JSON mode:</strong> Ashthescholar advises razeblox to use <a href="https://platform.openai.com/docs/guides/text-generation/json-mode">JSON mode</a> in the API to address response issues, especially regarding format and content control.</li>
  
  <li><strong>GPT-4 outperforms 4o at refining creative drafts:</strong> Users shared that while 4o excels when given free rein on creative tasks, it struggles with revising existing drafts. "4o seems pretty good when given a blank check for creative writing, but if presented with a rough draft to improve, it most often in my experience just regurgitated the rough draft rather than change it," noted keller._.</li>
  
  <li><strong>Innovative approach to creative synthesis:</strong> Stunspot shares a prompt, "Orchestrating Innovation on the Fringes of Chaos," that emphasizes exploring ideas through network dynamics, fractal exploration, adaptive innovation, and resilience to foster breakthroughs.</li>
</ul>

LM Studio ▷ #💬-general (537 messages🔥🔥🔥):

<ul>
    <li><strong>No local conversation storage for context search</strong>: A member asked about the ability to store conversations locally for context searching, to which another member clarified this is not currently possible in LM Studio. They suggested copying and pasting texts but noted that "You can't upload and chat with docs."</li>
    <li><strong>Handling "Unsupported Architecture" Error</strong>: Various members discussed issues with loading GPT-Sw3 in LM Studio due to "Unsupported Architecture." The consensus was that only GGUF files are supported, and users recommended downloading within the app with 'compatibility guess' enabled.</li>
    <li><strong>Running LM Studio on Limited VRAM Systems</strong>: Users inquired about running LLM models on systems with limited VRAM like 6-8GB. Members suggested using smaller models and quantized versions like Q5_K_M for better performance.</li>
    <li><strong>Offline Usage Issues</strong>: A user reported problems with LM Studio not functioning offline. After community suggestions, it was clarified that loading models and then disabling the network should work, but further detailed bug reports were recommended.</li>
    <li><strong>General Troubleshooting and Setup Questions</strong>: Users frequently asked about issues like setting up servers, model compatibility, and performance on lower-spec systems. Many were directed to create detailed posts in a specific channel (<a href="https://discord.com/channels/1111440136287297637">#1139405564586229810</a>) for further assistance.</li>
</ul>
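The limited-VRAM advice above comes down to simple arithmetic: quantized model weights need roughly (parameters × bits-per-weight ÷ 8) bytes, plus headroom for the KV cache and runtime buffers. A minimal back-of-envelope sketch — the bits-per-weight values are rough assumptions for llama.cpp-style quant formats, not official figures:

```python
# Rough VRAM estimate for quantized GGUF models.
# Bits-per-weight values below are approximations, not official llama.cpp numbers.
BITS_PER_WEIGHT = {
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
    "F16": 16.0,
}

def est_vram_gb(params_billion: float, quant: str, overhead_gb: float = 1.5) -> float:
    """Model weights plus a flat allowance for KV cache and runtime buffers."""
    weight_gb = params_billion * 1e9 * BITS_PER_WEIGHT[quant] / 8 / 1e9
    return round(weight_gb + overhead_gb, 1)

# A 7B model at Q5_K_M lands around 6.5 GB: tight on a 6 GB card, workable on 8 GB.
print(est_vram_gb(7, "Q5_K_M"))  # → 6.5
```

This is why members steer 6–8 GB users toward smaller models or mid-range quants like Q5_K_M: the same 7B model at F16 would need roughly 15 GB.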

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (82 messages🔥🔥):

Links mentioned:


LM Studio ▷ #announcements (7 messages):

Link mentioned: Tweet from LM Studio (@LMStudioAI): 1. Browse HF 2. This model looks interesting 3. Use it in LM Studio 👾🤗 Quoting clem 🤗 (@ClementDelangue) No cloud, no cost, no data sent to anyone, no problem. Welcome to local AI on Hugging Fa...


LM Studio ▷ #🧠-feedback (10 messages🔥):


LM Studio ▷ #📝-prompts-discussion-chat (32 messages🔥):


LM Studio ▷ #🎛-hardware-discussion (93 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🧪-beta-releases-chat (1 messages):


LM Studio ▷ #autogen (12 messages🔥):


LM Studio ▷ #amd-rocm-tech-preview (21 messages🔥):

Links mentioned:


Stability.ai (Stable Diffusion) ▷ #general-chat (664 messages🔥🔥🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #general (74 messages🔥🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #💬︱twitter (1 messages):

ModularBot: From Modular: https://twitter.com/Modular/status/1791535613411570039


Modular (Mojo 🔥) ▷ #ai (1 messages):

Link mentioned: GitHub - naklecha/llama3-from-scratch: llama3 implementation one matrix multiplication at a time


Modular (Mojo 🔥) ▷ #tech-news (4 messages):



Modular (Mojo 🔥) ▷ #🔥mojo (397 messages🔥🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #performance-and-benchmarks (31 messages🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #📰︱newsletter (1 messages):

Zapier: Modverse Weekly - Issue 34 https://www.modular.com/newsletters/modverse-weekly-34


Modular (Mojo 🔥) ▷ #🏎engine (1 messages):

ModularBot: Congrats <@891492812447698976>, you just advanced to level 3!


Modular (Mojo 🔥) ▷ #nightly (114 messages🔥🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #general (242 messages🔥🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #workshop-1 (168 messages🔥🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #asia-tz (47 messages🔥):

Link mentioned: shisa-ai/shisa-v1-llama3-70b · Hugging Face: no description found


LLM Finetuning (Hamel + Dan) ▷ #🟩-modal (54 messages🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #learning-resources (12 messages🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #jarvis (40 messages🔥):


LLM Finetuning (Hamel + Dan) ▷ #hugging-face (18 messages🔥):


LLM Finetuning (Hamel + Dan) ▷ #replicate (8 messages🔥):


LLM Finetuning (Hamel + Dan) ▷ #langsmith (16 messages🔥):


Nous Research AI ▷ #ctx-length-research (1 messages):


Nous Research AI ▷ #off-topic (13 messages🔥):

Link mentioned: Tweet from Sam Altman (Parody) (@SamAltsMan): Well, what a shock. Jan and Ilya left OpenAI because they think I'm not prioritizing safety enough. How original. Now I have to write some long, bs post about how much I care. But honestly, who n...


Nous Research AI ▷ #interesting-links (4 messages):

Links mentioned:


Nous Research AI ▷ #general (315 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (40 messages🔥):


Nous Research AI ▷ #rag-dataset (5 messages):

Link mentioned: How Far Are We From AGI: The evolution of artificial intelligence (AI) has profoundly impacted human society, driving significant advancements in multiple sectors. Yet, the escalating demands on AI have highlighted the limita...


Nous Research AI ▷ #world-sim (88 messages🔥🔥):

Links mentioned:


CUDA MODE ▷ #general (38 messages🔥):

Links mentioned:


CUDA MODE ▷ #triton (20 messages🔥):

Links mentioned:


CUDA MODE ▷ #cuda (7 messages):


CUDA MODE ▷ #torch (20 messages🔥):

Link mentioned: Add NHWC support for group normalization by ZelboK · Pull Request #126635 · pytorch/pytorch: Fixes #111824 Currently it is the case that if the user specifies their group normalization to be of NHWC format, pytorch will default to NCHW tensors and convert. This conversion is not immediate...


CUDA MODE ▷ #announcements (1 messages):

Link mentioned: Join our Cloud HD Video Meeting: Zoom is the leader in modern enterprise video communications, with an easy, reliable cloud platform for video and audio conferencing, chat, and webinars across mobile, desktop, and room systems. Zoom ...


CUDA MODE ▷ #cool-links (2 messages):

Links mentioned:


CUDA MODE ▷ #jobs (5 messages):


CUDA MODE ▷ #beginner (14 messages🔥):


CUDA MODE ▷ #off-topic (1 messages):

iron_bound: Polish code breaker's https://www.flyingpenguin.com/?p=56989


CUDA MODE ▷ #llmdotc (180 messages🔥🔥):

Links mentioned:


CUDA MODE ▷ #lecture-qa (33 messages🔥):

Links mentioned:


CUDA MODE ▷ #youtube-watch-party (2 messages):

Link mentioned: Join our Cloud HD Video Meeting (Zoom)


CUDA MODE ▷ #bitnet (31 messages🔥):

Links mentioned:


Eleuther ▷ #general (219 messages🔥🔥):

Links mentioned:


Eleuther ▷ #research (93 messages🔥🔥):

Links mentioned:


Eleuther ▷ #scaling-laws (14 messages🔥):

Link mentioned: Observational Scaling Laws and the Predictability of Language Model Performance: Understanding how language model performance varies with scale is critical to benchmark and algorithm development. Scaling laws are one approach to building this understanding, but the requirement of ...


Eleuther ▷ #lm-thunderdome (13 messages🔥):

Link mentioned: TIGER-Lab/MMLU-Pro · Datasets at Hugging Face: no description found


Eleuther ▷ #gpt-neox-dev (1 messages):


Interconnects (Nathan Lambert) ▷ #ideas-and-feedback (2 messages):


Interconnects (Nathan Lambert) ▷ #news (29 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-drama (145 messages🔥🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #random (24 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (41 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #rlhf (6 messages):

Link mentioned: Tweet from Kawin Ethayarajh (@ethayarajh): @maximelabonne @winniethexu aligned zephyr-sft-beta on ultrafeedback and it looks like kto/dpo are a bit better? note that zephyr-sft-beta was sft'ed on ultrachat (not ultrafeedback) so all the ...


Interconnects (Nathan Lambert) ▷ #reads (2 messages):

Link mentioned: The Scam in the Arena: Chamath Palihapitiya took retail investors for a ride, got away with it, and just can't let himself take the win.


Interconnects (Nathan Lambert) ▷ #posts (1 messages):

SnailBot News: <@&1216534966205284433>


Interconnects (Nathan Lambert) ▷ #retort-podcast (21 messages🔥):

Links mentioned:


Latent Space ▷ #ai-general-chat (126 messages🔥🔥):

Links mentioned:


Latent Space ▷ #ai-in-action-club (127 messages🔥🔥):

Links mentioned:


LlamaIndex ▷ #announcements (1 messages):

Link mentioned: LlamaIndex Webinar: Open-Source Longterm Memory for Autonomous Agents · Zoom · Luma: In this webinar we're excited to host the authors of memary - a fully open-source reference implementation for long-term memory in autonomous agents 🧠🕸️ In…


LlamaIndex ▷ #blog (10 messages🔥):

- **QA struggles with large tables**: Even the latest LLMs still hallucinate over complex tables like the Caltrain schedule due to poor parsing. More details can be found [here](https://t.co/Scvp7LH2pL).
- **Boost vector search speed by 32x**: Using binary (1-bit) vectors instead of full 32-bit floats, [JinaAI_](https://t.co/NnHhGudMa8) shared methods that offer significant performance gains at only a ~4% accuracy cost. This optimization matters for production-scale retrieval.
- **Building agentic multi-document RAG**: Plaban Nayak's article explains constructing a multi-document agent using LlamaIndex and Mistral. Each document is modeled as a set of tools for comprehensive summarization, available [here](https://t.co/FksUI3mm5l) and [here](https://t.co/MbDtlrxk5B).
- **Fully local text-to-SQL setup**: Diptiman Raichaudhuri offers a tutorial on setting up a local text-to-SQL system for querying structured databases without external dependencies. This guide is accessible [here](https://t.co/u3LG9NKE0X).
- **San Francisco meetup announcement**: LlamaIndex will host an in-person meetup at their HQ with talks from prominent partners including Tryolabs and Activeloop. The meetup will cover advanced RAG engine techniques; RSVP and more details can be found [here](https://t.co/o0BWxeq3TJ).
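The 32x figure above follows from binary quantization: keeping only the sign bit of each float32 dimension cuts storage 32-fold and lets you rank candidates by Hamming distance. A minimal pure-Python sketch of the idea (an illustration of the general technique, not JinaAI's implementation):

```python
def binarize(vec):
    """Pack the sign bit of each float dimension into one int (1 bit per dim)."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits

def hamming(a: int, b: int) -> int:
    """Count differing bits -- the distance metric for binary vectors."""
    return bin(a ^ b).count("1")  # portable; int.bit_count() needs Python 3.10+

# Tiny demo: the query's sign pattern matches doc 0 exactly.
docs = [binarize(v) for v in ([0.9, -0.2, 0.4, -0.7], [-0.5, 0.1, -0.3, 0.8])]
query = binarize([0.8, -0.1, 0.5, -0.6])
best = min(range(len(docs)), key=lambda i: hamming(query, docs[i]))
print(best)  # → 0
```

The small accuracy cost comes from discarding magnitudes; in practice the binary pass is used as a fast first-stage filter, with full-precision rescoring on the short list.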

Link mentioned: RSVP to GenAI Summit Pre-Game: Why RAG Is Not Enough? | Partiful: Note: This is an in-person meetup @LlamaIndex HQ in SF! Stop by our meetup to learn about latest innovations in building production-grade retrieval augmented generation engines for your company from ...


LlamaIndex ▷ #general (139 messages🔥🔥):

Links mentioned:


LlamaIndex ▷ #ai-discussion (4 messages):


LAION ▷ #general (134 messages🔥🔥):

Links mentioned:


LAION ▷ #research (13 messages🔥):

Links mentioned:


LAION ▷ #resources (1 messages):


AI Stack Devs (Yoko Li) ▷ #app-showcase (2 messages):

Links mentioned:


AI Stack Devs (Yoko Li) ▷ #ai-companion (1 messages):

.ghost001: They gonna feel dumb when the more advanced versions come out


AI Stack Devs (Yoko Li) ▷ #events (1 messages):

Links mentioned:


AI Stack Devs (Yoko Li) ▷ #ai-town-discuss (38 messages🔥):

Links mentioned:


AI Stack Devs (Yoko Li) ▷ #ai-town-dev (94 messages🔥🔥):

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (112 messages🔥🔥):

Reach the full conversation here: OpenRouter Discord.

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general (58 messages🔥🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (13 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general-help (3 messages):


OpenAccess AI Collective (axolotl) ▷ #axolotl-phorm-bot (37 messages🔥):

Links mentioned:


LangChain AI ▷ #general (69 messages🔥🔥):

Links mentioned:


LangChain AI ▷ #langserve (2 messages):


LangChain AI ▷ #langchain-templates (2 messages):


LangChain AI ▷ #share-your-work (7 messages):

Links mentioned:


LangChain AI ▷ #tutorials (3 messages):

Link mentioned: LangChain + RAG Fusion + GPT-4o Python Project: Easy AI/Chat for your Docs: #automation #rag #llm #ai #programming #gpt4o #langchain in this Video, I have a super quick tutorial for you showing how to create an AI for your PDF with L...


Cohere ▷ #general (76 messages🔥🔥):

Links mentioned:


Cohere ▷ #project-sharing (1 messages):

Links mentioned:


OpenInterpreter ▷ #general (41 messages🔥):

Links mentioned:


OpenInterpreter ▷ #O1 (15 messages🔥):

Link mentioned: GitHub - Tonylib/o1_for_flutter: no description found


OpenInterpreter ▷ #ai-content (6 messages):

Links mentioned:


Mozilla AI ▷ #llamafile (26 messages🔥):

Links mentioned:


MLOps @Chipro ▷ #events (9 messages🔥):

Link mentioned: Mastering LLMs: End-to-End Fine-Tuning and Deployment by Dan Becker and Hamel Husain on Maven: All-time best selling course on Maven! Train, validate and deploy your first fine-tuned LLM


MLOps @Chipro ▷ #general-ml (7 messages):

Link mentioned: MAPIE - Model Agnostic Prediction Interval Estimator — MAPIE 0.8.3 documentation: no description found


tinygrad (George Hotz) ▷ #general (7 messages):


tinygrad (George Hotz) ▷ #learn-tinygrad (4 messages):


Datasette - LLM (@SimonW) ▷ #ai (6 messages):


LLM Perf Enthusiasts AI ▷ #gpt4 (1 messages):


YAIG (a16z Infra) ▷ #ai-ml (1 messages):