Frozen AI News archive

Google I/O in 60 seconds

**Google** announced updates to the **Gemini model family**, including **Gemini 1.5 Pro** with **2 million token support**, and the new **Gemini Flash** model optimized for speed with **1 million token capacity**. The Gemini suite now includes **Ultra**, **Pro**, **Flash**, and **Nano** models, with **Gemini Nano** integrated into **Chrome 126**. Additional Gemini features include **Gemini Gems** (custom GPTs), **Gemini Live** for voice conversations, and **Project Astra**, a live video understanding assistant. The **Gemma model family** was updated with **Gemma 2** at **27B parameters**, offering near-**llama-3-70b** performance at half the size, plus **PaliGemma**, a vision-language open model inspired by **PaLI-3**. Other launches include **DeepMind's Veo**, **Imagen 3** for photorealistic image generation, and a **Music AI Sandbox** collaboration with YouTube. **SynthID watermarking** now extends to text, images, audio, and video. The **Trillium TPUv6** codename was revealed. Google also integrated AI across its product suite including Workspace, Email, Docs, Sheets, Photos, Search, and Lens. *"The world awaits Apple's answer."*

Canonical issue URL

AI News for 5/13/2024-5/14/2024. We checked 7 subreddits, 384 Twitters and 30 Discords (426 channels, and 8590 messages) for you. Estimated reading time saved (at 200wpm): 782 minutes.

Google I/O is still ongoing, and it is a good deal harder to cover than OpenAI's half-hour event yesterday because of the sheer scope of products, and we haven't yet come across a single webpage that summarizes everything (apart from @Google and @OfficialLoganK accounts).

Here is a subjectively sorted list:

The Gemini Model Family

The Gemma Model Family

Other Launches

And AI deployments across Google's product suite - Workspace, Email, Docs, Sheets, Photos, Search Overviews, Search with Multi-step reasoning, Android Circle to Search, Lens.

Overall a very competently executed I/O, easy to summarize without losing too much detail. The world awaits Apple's answer.


Table of Contents

[TOC]


AI Twitter Recap

all recaps done by Claude 3 Opus, best of 4 runs. We are working on clustering and flow engineering with Haiku.

GPT-4o Release by OpenAI

Technical Analysis and Implications

Community Reactions and Memes


AI Reddit Recap

Across r/LocalLlama, r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, r/LLMDevs, r/Singularity. Comment crawling works now but has lots to improve!

GPT-4o Capabilities and Features

GPT-4o Availability and Pricing

Reactions and Comparisons

Open Source and Competitors

Memes and Humor


AI Discord Recap

A summary of Summaries of Summaries

Claude 3 Sonnet

Here are the top 3-4 major themes from the content, with important key terms, facts, URLs, and examples bolded:

  1. New AI Model Releases and Comparisons:

    • OpenAI's GPT-4o is a new flagship multimodal model that can process audio, vision, and text in real-time. It boasts faster response times, lower costs, and improved reasoning capabilities compared to GPT-4. Example showcasing GPT-4o's interactive abilities.
    • The Falcon 2 11B model outperforms Meta's Llama 3 8B and rivals Google's Gemma 7B, offering multilingual and vision-to-language capabilities.
    • Claude 3 Opus is still preferred by some users for complex reasoning tasks over GPT-4o, despite concerns over its cost and usage restrictions.
  2. AI Model Optimization and Efficiency Efforts:

    • Implementing ZeRO-1 in llm.c increased GPU batch size and training throughput by ~54%, enabling larger model variations.
    • The ThunderKittens library promises faster inference and potential training speed improvements for LLMs through optimized CUDA tile primitives.
    • Discussions focused on reducing AI's compute usage, with links shared to projects like Based and FlashAttention-2.
  3. Multimodal AI Applications and Frameworks:

  4. Open-Source AI Model Development and Deployment:

Claude 3 Opus

GPT4T (gpt-4-turbo-2024-04-09)

Major Themes:

  1. Advancement of AI Models: Various channels buzz with discussions about the latest AI models, like GPT-4o, Falcon 2, and LLaMA models. These models boast enhanced capabilities like multimodal functionalities and real-time processing, with integration into platforms such as Perplexity AI and OpenRouter.

  2. Community Engagement and Collaborations: There's an increasing interest in sharing projects, seeking collaborations, and participating in discussions around coding practices, optimizations, and the integration of new technologies within community platforms such as Stability.ai, Modular, and LAION, demonstrating a thriving ecosystem focused on collective growth and learning.

  3. Customization and Personalization Questions: Users show a keen interest in customizing AI models and systems to fit specific needs, ranging from setting up private instances of AI tools to merging different model capabilities, reflecting an ongoing trend of personalizing AI use to meet individual or organizational requirements.

  4. Technical Challenges and Debugging: A common thread across several discords centers around troubleshooting and problem-solving specific to AI models and computing environments. This includes discussions on optimizing model inference, handling specific library issues, and improving integration with various coding environments.

  5. Educational Content and Resource Sharing: Several channels are dedicated to educational content ranging from detailed explanations of machine learning concepts to sharing tutorials and resources to help members learn about and implement AI technologies. This not only helps in skill development but also fosters a culture of knowledge sharing within the community.

GPT4O (gpt-4o-2024-05-13)

  1. Model Launches and Innovations:

    • GPT-4o: Many discords are abuzz with OpenAI's launch of GPT-4o, a multimodal model capable of handling text, audio, and vision inputs. This model promises significant advancements in speed, context windows (up to 128K tokens), and overall capabilities. OpenAI's GPT-4o is praised for real-time multimodal capabilities but also criticized for some quirks and high usage costs (GPT-4o Info).
    • Falcon 2: Highlighted as a competitive model against Meta's Llama 3 8B and Google's Gemma 7B. It is praised for being open-source, multilingual, and multimodal. Falcon 2 Announcement.
    • Claude 3 Opus: Its strength lies in handling long-form reasoning tasks and text summarization despite facing cost and policy concerns. Claude 3 Opus.
  2. Performance and Technical Discussions:

    • GPU Utilization: Many discussions revolve around optimizing GPU usage for different models such as Stable Diffusion, YOLOv1, and implementation techniques in Flash Attention 2. This includes guide sharing and configuration tips like the effectiveness of ThunderKittens in speeding up inference and training (GitHub - ThunderKittens).
    • API and Performance Enhancements: Conversations on API performance specifically focus on optimizing response times and handling larger context windows. For instance, GPT-4o API is noted for faster speed and better performance at reduced costs.
  3. Community Tools and Support:

    • Projects and Tools Sharing: From job search assistants using Retrieval-Augmented Generation to detailing steps for setting up AI tools like OpenRouter with community-developed utilities. There is significant sharing of personal projects and collaborative efforts (Job Search Assistant Guide, OpenRouter Model Watcher).
    • Help and Collaboration: A recurring theme is troubleshooting and providing support for issues encountered during AI development, such as CUDA errors, model fine-tuning, and dependency management.
  4. Ethics and Policy:

    • Content Moderation and Policies: Ethical concerns around the usage and policies governing AI tools, specifically Claude 3 Opus and GPT-4o moderation filters (Anthropic Policy Link).
    • Open-Source vs Proprietary Models: Discussions often compare open-source advantages like Falcon 2 against proprietary models' constraints, impacting their accessibility and modifications.

PART 1: High level Discord summaries

OpenAI Discord

GPT-4o Makes Its Grand Entrance: OpenAI launched a new model, GPT-4o, with free access for certain features and additional benefits for Plus users, including faster response times and more extensive features. GPT-4o distinguishes itself by processing audio, vision, and text in real-time, indicating a significant step forward in multimodal applications with text and image inputs already available and voice and video to be rolled out soon. Read more about GPT-4o.

Claude Claims Complex Task Crown: Within the community, Claude Opus is considered superior for complex, long-form reasoning compared to GPT-4o, particularly when processing extensive original content. Expectations are high for future enhancements that include broader context windows and advanced voice capabilities from both Google and OpenAI.

Custom GPTs Await Memory Upgrade: The awaited cross-session context memory for custom GPTs remains in development, with an assurance that once released, memory will be configurable by the creators per GPT. Enhanced speeds and consistent API performance mark the current state of GPT-4o, though Plus users benefit from higher message limits, and everyone eagerly awaits the promised integration within custom GPT models.

Prompt Engineering Exposes Model Quirks: Users faced challenges when directing GPT-4o towards creative and spatially aware tasks, noting difficulties in iterative image generation and specific content moderation issues with Gemini 1.5's safety filters. Even as GPT-4o accelerates response times, it occasionally stumbles in comprehension and execution, indicating room for iterative refinement based on user feedback.

Monitored ChatGPT Clone Sought: A member inquired about creating a ChatGPT-like application that allows organizational monitoring of messages using the GPT-3.5 model. This reflects a growing need for customizable and controllable AI tools within formal ecosystems.


Perplexity AI Discord

GPT-4's Token Tussle: There's debate around GPT-4's token capacity, with clarification that GPT-4's larger context window applies to specific models like GPT-4o which has a 128K token context window. Some users are diving into the capabilities of GPT-4o, noting its velocity and performance excellence, and sharing video examples of its real-time reasoning.

Policy Shift Sparks Chatter: Anthropic's revised terms of service for Opus, going live on June 6th, have members in a stir due to limitations like the ban on creating LGBTQ content. Details of the policy can be found in the shared Anthropic policy link.

Claude Maintains Its Ground: Despite the buzz around GPT-4o, Claude 3 Opus is still the go-to for text summarization and human-like responses for some users, despite concerns over cost and use restrictions.

Perplexity's New Power Player: Users are testing GPT-4o's integration into Perplexity's tools, highlighting its high-speed, in-depth responses. The Pro version allows for 600 queries a day, echoing its API availability.

API Config Conundrums: Discussions surfaced around Perplexity's API settings, with a user inquiring about timeout issues for lengthy inputs using llama models. One member indicated that the chat model of llama-3-sonar-large-32k-chat is fine-tuned for dialogue contexts, yet no consensus on the optimal timeout settings was reported.


Unsloth AI (Daniel Han) Discord

LLaMA Instruction Tuning Advice: For finetuning on small datasets, start with the instruction model of Llama-3 before considering the base model if performance is suboptimal, as per users’ discussions. They recommend iteration to find the best fit for your scenario.

ThunderKittens Exceeds Flash Attention 2: ThunderKittens overtakes Flash Attention 2 in speed, per mentions in the community, promising faster inference and potential advancements in training speeds. The code is available on GitHub.

Synthetic Dataset Construction for Typst: To effectively fine-tune models on "Typst," engineers propose synthesizing 50,000 examples. The daring task of generating substantial synthetic datasets has been flagged as a fundamental step for progress.

Multimodal Model Expansion on Unsloth AI: Upcoming support for multimodal models has been anticipated in Unsloth AI, including multi-GPU support expected next week, setting a pace for new robust AI capabilities.

A Million Cheers for Unsloth AI: The AI community celebrates Unsloth AI surpassing one million model downloads on Hugging Face, signaling a milestone recognized by users and reflecting the community’s active engagement and support.


Latent Space Discord


Nous Research AI Discord


Stability.ai (Stable Diffusion) Discord


LM Studio Discord


HuggingFace Discord

YOCO Cuts Down on GPU Needs: The YOCO paper introduces a new decoder-decoder architecture that cuts GPU memory usage while speeding up the prefill stage, maintaining global attention capabilities.

When NLP and AI Storytelling Collide: Researchers are pulling from the Awesome-Story-Generation GitHub repository to contribute to comprehensive studies on AI story generation, such as the GROVE framework, aimed at increasing story complexity.

Stable Diffusion Ventures into DIY Territory: A Fast.ai course spans over 30 hours, teaching Stable Diffusion from scratch, partnering with industry insiders from Stability.ai and Hugging Face, discussed alongside queries about sadtalker installation and practical uses for transformer agents.

OCR Quality Frontier: A collection of OCR-quality classifiers showcases the feasibility of distinguishing between clean and noisy documents using compact models.

Stable Diffusion and YOLO: A HuggingFace guide on Stable Diffusion using Diffusers is available, and conversations revolve around YOLOv1 implementations using ResNet18, balancing data quality and quantity issues to improve model performance.

Mixed Sentiments on the Cutting Edge: GPT-4o's announcement led to diverse reactions within the community, raising concerns about distinguishing AI from humans, while members reported mixed success with custom tokenizer creation and NLP strategies focused on example-rich prompts.


OpenRouter (Alex Atallah) Discord

New Multimodal Models Storm OpenRouter: OpenRouter has expanded its lineup with the launch of GPT-4o, noted for supporting text and image inputs, and LLaVA v1.6 34B. Additionally, the roster now includes DeepSeek-v2 Chat, DeepSeek Coder, Llama Guard 2 8B, Llama 3 70B Base, Llama 3 8B Base, with GPT-4o's latest iteration dating May 13, 2024.

Blazing through Beta: An advanced research assistant and search engine is being beta-tested, offering premium access with leading models like Claude 3 Opus and Mistral Large, and the platform shared a promo code RUBIX for trials.

GPT-4o Enthusiasm and Scrutiny: A vivacious discussion about GPT-4o's API pricing ($5 input / $15 output per 1M tokens) sparked excitement, whereas speculation about its multimodal capabilities has piqued curiosity, with commentators noting the lack of native image handling via OpenAI's API.
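At the quoted $5 (input) / $15 (output) per 1M tokens, per-request cost is easy to estimate. A minimal sketch (the helper name and example token counts are ours, not from the discussion):

```python
def gpt4o_cost_usd(prompt_tokens: int, completion_tokens: int,
                   input_price: float = 5.0, output_price: float = 15.0) -> float:
    """Estimate request cost in USD from per-million-token prices."""
    return (prompt_tokens * input_price + completion_tokens * output_price) / 1_000_000

# e.g. a 10K-token prompt with a 2K-token reply:
print(round(gpt4o_cost_usd(10_000, 2_000), 2))  # 0.08
```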

Community Weighs in on OpenRouter Hiccups: Technical difficulties with OpenRouter were voiced by users, identifying issues such as empty responses and errors from models like MythoMax and DeepSeek. Alex Atallah clarified that most models on OpenRouter are FP16, with some quantized exceptions.

Engineering Connection over Community Tools: A community-developed tool to sort through OpenRouter models has been positively received, with suggestions to integrate additional metrics like ELO scores and model add-dates being discussed. Links to related resources such as OpenRouter API Watcher were provided.


Interconnects (Nathan Lambert) Discord

GPT-4o Leads the Frontier: OpenAI's GPT-4o sets a new benchmark in AI capabilities, especially in reasoning and coding, dominating LMSys arena and featuring a doubled token capacity thanks to a tokenizer update. Its multi-modal prowess was also showcased including potential singing abilities, stirring both interest and debate around AI evolution and its competitive landscape.

REINFORCE Under PPO's Umbrella: The AI community discusses a new PR from Hugging Face that positions REINFORCE as a subset of PPO, detailed in a related paper, showing active contributions in the realm of reinforcement learning.

AI's Silver Screen Reflects Real Concerns: Dialogues within the community resonate with the movie "Her", highlighting how AI interaction can be perceived as either trivial or profound. These discussions tie in with sentiments regarding AI leadership and the humanization of technology.

Long-Term AI Governance Emerging: Forward-looking conversations hint at process reward models (PRMs) playing a key role in guiding long-term AI tasks, inspired by a talk by John Schulman.

Evaluating AI Evaluation: A detailed blog post stirred thoughts about the accessibility and future of large language model (LLM) evaluations, discussing tools ranging from MMLU benchmarks to A/B testing and its implications for academia and developers.


Eleuther Discord

MLP Might Take the Crown: There's a buzz about MLP-based models possibly overtaking Transformers in vision tasks, with a new hybrid approach presenting fierce competition. A specific study highlights the efficiency and scalability of MLPs, despite some doubts regarding their sophistication.

Getting the Initialization Right: Debate emerged on the criticality of initialization schemes in neural networks, especially for MLPs, with suggestions that innovation in initialization could unlock vast improvements. A notion was floated about creating initializations via Turing machines, exploring the frontier of synthetic weight generation as seen on Gwern's website.

Mimetic Initialization as a Game-Changer: A paper promoting mimetic initialization surfaced, advocating for this method as a boost for Transformers working with small datasets, resulting in greater accuracy and reduced training times, detailed in MLR proceedings.

Scalability Quest Continues: In-depth discussions tackled whether MLPs can surpass Transformers in terms of Model FLOPs Utilization on various hardware, hinting that even small MFU improvements could resonate across large scales.

Contemplating NeurIPS Contributions: A call was made for potential last-minute NeurIPS submissions, with one member citing interest in topics akin to the Othello paper. Another discussion queried the consequences of model compression on specialized features and their relation to training data diversity.


Modular (Mojo 🔥) Discord

New Sheriff in Town: Mojo Compiler Development Heats Up: Engineering discussions revealed keen interest in contributing to the Mojo compiler, though it's not yet open source. The compiler debate also confirmed that it's written in C++, with aspirations to rebuild MLIR in Mojo sparking curiosity among contributors.

MLIR Makes Friends with Mojo: Integration features between Mojo and MLIR were dissected, highlighting how Mojo's compatibility with MLIR could lead to a self-hosting compiler in the future. Contributions to the Mojo Standard Library are now encouraged, with a how-to video from Modular engineer Joe Loser illuminating the process.

Cutting-Edge Calendars: Upcoming Mojo Community Meeting details were announced for May 20, with the aim to keep developers, contributors, and users engaged with Mojo's trajectory. A helpful meeting document and options to add events via a community meeting calendar were shared to coordinate.

Nighttime is the Right Time for Code: Nightly releases of Mojo are now more frequent, a welcome and aggressive update schedule that aims to make true nightly nightlies a reality. However, a segfault issue in nested arrays remains unresolved, and there is talk of adjusting release frequency to avoid confusion over compiler versions among users.

Coding Conundrums and Compiler Conversations: Within the dusty digital hallways, developers tackled topics from how to restrict parameters to float types in Mojo (advised to use dtype.is_floating_point()) to Python's mutable default parameters, and the use of FFI to call C/C++ libraries from Mojo. Further details were shared through a GitHub link on the subject of FFI in Mojo.
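The Python mutable-default-parameter pitfall raised in that discussion is easy to demonstrate. A minimal sketch (function names are ours):

```python
def append_bad(item, bucket=[]):     # the default list is created ONCE, at def time
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):  # idiomatic fix: sentinel, fresh list per call
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

print(append_bad(1))   # [1]
print(append_bad(2))   # [1, 2]  <- state from the first call leaks into the second
print(append_good(2))  # [2]
```

The same default object is shared across every call that omits the argument, which is why the `None` sentinel is the standard fix.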


CUDA MODE Discord

ZeRO-1 Upscaling Amps Up Training Throughput: Implementing ZeRO-1 optimization increased per GPU batch size from 4 to 10 and improved training throughput by about 54%. Details about the merge and its effect can be reviewed on the PR page.
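The memory headroom behind that batch-size jump comes from sharding optimizer state across data-parallel ranks instead of replicating it on every GPU. A toy sketch of the ZeRO-1 bookkeeping (the model size, world size, and ~8 bytes/param for Adam's fp32 moments are illustrative assumptions, not figures from the PR):

```python
def shard_optimizer_state(num_params: int, world_size: int, bytes_per_param: int = 8):
    """ZeRO-1 idea: split optimizer state (e.g. Adam's two fp32 moments,
    ~8 bytes/param) evenly across ranks instead of replicating it per GPU."""
    replicated = num_params * bytes_per_param                 # per-GPU cost without ZeRO-1
    sharded = -(-num_params // world_size) * bytes_per_param  # per-GPU cost with ZeRO-1 (ceil div)
    return replicated, sharded

# hypothetical 124M-parameter model on 8 GPUs:
full, shard = shard_optimizer_state(124_000_000, 8)
print(full // 2**20, "MiB ->", shard // 2**20, "MiB per GPU")
```

The freed memory can then be spent on larger per-GPU batches, which is where the throughput gain comes from.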

ThunderKittens Sparks Curiosity: Discussion included interest in HazyResearch/ThunderKittens, a CUDA tile primitives library, for its intriguing potential to optimize LLMs, drawing comparisons with Cutlass and Triton tools.

Triton Gains Through FP Enhancements: Updates to Triton included performance improvements with FP16 and FP8, as shown in benchmark data: "Triton [FP16]" achieved 252.747280 for N_CTX of 1024 and "Triton [FP8]" reached 506.930317 for N_CTX of 16384.

CUDA Streamlines, but Questions Remain: On integrating custom CUDA kernels in PyTorch, resources were shared, including a YouTube lecture addressing the basics, while issues like clangd parsing .cu files and function overhead in cuSPARSE were flagged.

Finessing CUDA CI Pipelines: The need for GPU testing in continuous integration was debated, with GitHub's new GPU runner support in CI hailed as a sought-after update for robust pipeline construction.


LlamaIndex Discord


LAION Discord


OpenInterpreter Discord

GPT-4o Outpaces Its Predecessors: Enthusiasts within the community have noted that GPT-4o is not only faster, delivering 100 tokens/sec, but also more cost-efficient than previous iterations. There's particular interest in its integration with Open Interpreter, citing smooth functionality with the command interpreter --model openai/gpt-4o.

Llama Left in the Dust: After experiencing GPT-4o's performance, one member shared their dissatisfaction with Llama 3 70b, alongside concerns over the high costs associated with OpenAI, which tallied up to $20 in just one day.

Apple's Reticence Might Fuel Open-Source AI: Speculation abounds on whether Apple will integrate AI into MacOS, with some members doubtful and preferring open-source AI solutions, implying a potential uptick in Linux utilization among the community.

Awaiting O1's Next Flight: Anticipation is high for the upcoming TestFlight release of an unnamed project, with members sharing their advice and clarifications on setting up test environments and compiling projects in Xcode.

The March Toward AGI: A spirited discussion relating to the progress toward Artificial General Intelligence (AGI) has taken place, with participants exchanging thoughts and resources, including a Perplexity AI explanation that sheds light on this frontier.


LangChain AI Discord

ChatGPT's Wavering Convictions: Engineers noted that ChatGPT now sometimes contradicts itself, diverging from its former consistency in responses. Concerns were raised about the tool's reliability in maintaining a steady line of reasoning.

LangChain Troubleshooting Continues: Engineers have moved to from langchain_community.chat_models import ChatOpenAI after the LLMChain deprecation, but face new challenges with streaming and sequential chains. The slow invocation time for LangChain agents, especially with large inputs, has led to discussions on the potential for parallel processing to reduce processing times.

AI/ML GitHub Repos Get Spotlight: Favorite AI/ML GitHub repositories were exchanged, with projects like llama.cpp and deepspeed receiving mentions amongst the community.

Socket.IO Joins the Fray: An engineer contributed a guide on using python-socketio to stream LLM responses in realtime, demonstrating client-server communication to handle streaming and acknowledgments.
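The stream-then-acknowledge pattern that guide describes can be sketched without python-socketio using only the standard library; this toy version fakes the LLM with an uppercasing generator (all names and the fake model are our assumptions, not from the guide):

```python
import asyncio

async def llm_stream(prompt):
    """Stand-in for an LLM streaming tokens (a real server would read from an API)."""
    for token in prompt.upper().split():
        await asyncio.sleep(0)        # yield control, as a network read would
        yield token

async def serve(prompt, send):
    """Server side: push each chunk to the client, then wait for its acknowledgment."""
    async for chunk in llm_stream(prompt):
        acked = await send(chunk)     # send() resolves True once the client confirms receipt
        assert acked

async def main():
    received = []
    async def send(chunk):
        received.append(chunk)        # client callback: record the chunk...
        return True                   # ...and acknowledge it
    await serve("hello streaming world", send)
    return received

print(asyncio.run(main()))  # ['HELLO', 'STREAMING', 'WORLD']
```

In the real guide the `send`/ack round-trip is handled by Socket.IO events and callbacks; the control flow is the same.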

Show and Tell with AI Flair: Shared projects included a Medium article on Plug-and-Plai integrations, a multimodal chat app utilizing Streamlit and GPT-4o, a production-scaling query for a RAG application with ChromaDB, and a Snowflake cost monitoring and optimizer tool in development.

Chat Empowers Blog Interaction: A post discussing how to enable active conversations on blog content using Retrieval Augmented Generation (RAG) was shared, further fueling interest in integrating advanced AI chat features on websites.


OpenAccess AI Collective (axolotl) Discord

Blogging Platform Face-Off: Users debated the merits of Substack versus Bluesky for blogging needs, concluding that while Bluesky can support threads, it lacks comprehensive blogging features.

Reducing AI Compute Consumption: There's a focus on minimizing AI compute usage, with links shared to initiatives like Based and FlashAttention-2 that are paving the way to more efficient AI operations.

Dependency Dilemmas: Members are vexed by outdated dependencies, including peft 0.10.0 and others, and are adjusting them manually for compatibility, with a reluctant call for pull requests issued to rectify the situation.

CUDA Quandaries: A report surfaced about a member facing CUDA errors in an 8xH100 GPU environment, which was later mitigated by switching to a community axolotl cloud image.

QLoRA Model Mergers and Training Continuation: Queries and discussions arose about integrating QLoRA with base models without compromising precision. Additionally, conversations centered on the mechanics of resuming training from checkpoints using ReLoRACallback, as documented in the OpenAccess-AI-Collective axolotl repository.
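Folding a LoRA adapter back into a base weight is, at its core, W' = W + (alpha/r)·B·A; with QLoRA the quantized base must be dequantized first, which is where the precision concerns above come from. A pure-Python toy sketch (shapes and values are hypothetical):

```python
def matmul(A, B):
    """Naive matrix multiply for small illustrative matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def merge_lora(W, A, B, alpha, r):
    """W' = W + (alpha / r) * B @ A  -- fold the low-rank update into the base weight."""
    scale = alpha / r
    BA = matmul(B, A)
    return [[w + scale * d for w, d in zip(wrow, drow)] for wrow, drow in zip(W, BA)]

# toy 2x2 base weight with a rank r=1 adapter:
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]          # r x d_in
B = [[0.5], [0.25]]       # d_out x r
print(merge_lora(W, A, B, alpha=2, r=1))  # [[2.0, 2.0], [0.5, 2.0]]
```

Real merges (e.g. PEFT's merge utilities) do exactly this per target matrix, just with tensors instead of nested lists.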


Datasette - LLM (@SimonW) Discord

Voice Assistant Not All Giggles: The technical community is puzzled by the choice of a voice assistant's giggling feature, considering it inappropriate and distracting for professional use. Workarounds like rephrasing commands could tame this quirk.

Mixed Reviews on GPT-4o's Book Recognition Task: GPT-4o's ability to enumerate books displayed on a shelf drew mixed reviews, achieving only 50% accuracy, which leaves room for improvement despite its commendable speed and competitive pricing.

AGI Hype Debated: Skepticism prevails over imminent Artificial General Intelligence (AGI), as diminishing returns are observed in the leap from GPT-3 to GPT-4, while GPT-5 buzz overshadows current model refinements.

Long-Term GPT-4 Impact Still Foggy: Long-term predictions for impacts of GPT-4 and its iterations remain speculative, with the engineering community still exploring their full spectrum of capabilities.

Simon Tweets LLM Insights: Simon W's Twitter update could be a potent catalyst for conversation about the latest developments and challenges in large language models.


tinygrad (George Hotz) Discord


DiscoResearch Discord


Cohere Discord

Community Awaits Support: Users in the Cohere guild reported delays in receiving support responses, with one user reaching out in <#1168411509542637578> and <#1216947664504098877> to voice this issue. A response promised active support staff, requesting more details to assist.

Command R RAG Grabs Limelight: An engineer was "extremely impressed" by Command R's RAG (Retrieval-Augmented Generation) capabilities, touting its cost-effectiveness, precision, and fidelity even with lengthy source materials.

Collaboration Call in Project Sharing: The #project-sharing channel saw a member, Vedang, express interest in teaming up with another engineer, Asher, on a similar project, underlining the community's collaborative spirit.

Members Spread Their Medium Influence: Amit circulated a Medium article that dives into using RAG via the Unstructured API, aimed at structuring content extractions from PDFs—potentially useful for engineers working with document processing.

Emoji Greetings Dismissed as Noise: Casual exchanges of greetings and emojis like "<:hammy:981331896577441812>" were deemed non-essential and omitted from the professional engineering discourse of the guild.


LLM Perf Enthusiasts AI Discord


Skunkworks AI Discord


Alignment Lab AI Discord


AI Stack Devs (Yoko Li) Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The YAIG (a16z Infra) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

OpenAI ▷ #annnouncements (2 messages):


OpenAI ▷ #ai-discussions (1085 messages🔥🔥🔥):

Links mentioned:


OpenAI ▷ #gpt-4-discussions (261 messages🔥🔥):


OpenAI ▷ #prompt-engineering (51 messages🔥):


OpenAI ▷ #api-discussions (51 messages🔥):


OpenAI ▷ #api-projects (2 messages):


Perplexity AI ▷ #general (993 messages🔥🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (9 messages🔥):


Perplexity AI ▷ #pplx-api (4 messages):


Unsloth AI (Daniel Han) ▷ #general (622 messages🔥🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #random (37 messages🔥):

Link mentioned: Ah Shit Here We Go Again Gta GIF - Ah Shit Here We Go Again Gta Gta Sa - Discover & Share GIFs: Click to view the GIF


Unsloth AI (Daniel Han) ▷ #help (283 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (1 messages):

Link mentioned: Artificial Intelligence in the Name of Cthulhu – Rasmus Rasmussen dot com: no description found


Latent Space ▷ #ai-general-chat (114 messages🔥🔥):

Links mentioned:


Latent Space ▷ #llm-paper-club-west (710 messages🔥🔥🔥):

Links mentioned:


Nous Research AI ▷ #ctx-length-research (1 messages):

king.of.kings_: i am struggling to get llama 3 70b to be coherent over 8k tokens lol


Nous Research AI ▷ #off-topic (27 messages🔥):

Link mentioned: Hello GPT-4o Openai's latest and best model: We will take a look at announcing GPT-4o, open ai's new flagship model that can reason across audio, vision, and text in real time.https://openai.com/index/h...


Nous Research AI ▷ #interesting-links (3 messages):

Links mentioned:


Nous Research AI ▷ #general (726 messages🔥🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (15 messages🔥):

Links mentioned:


Nous Research AI ▷ #bittensor-finetune-subnet (2 messages):


Nous Research AI ▷ #rag-dataset (2 messages):

Link mentioned: InstructLab: InstructLab has 10 repositories available. Follow their code on GitHub.


Nous Research AI ▷ #world-sim (22 messages🔥):


Stability.ai (Stable Diffusion) ▷ #general-chat (450 messages🔥🔥🔥):

Links mentioned:


LM Studio ▷ #💬-general (205 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (62 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🧠-feedback (6 messages):


LM Studio ▷ #🎛-hardware-discussion (12 messages🔥):


LM Studio ▷ #🧪-beta-releases-chat (2 messages):


LM Studio ▷ #amd-rocm-tech-preview (1 messages):


LM Studio ▷ #🛠-dev-chat (17 messages🔥):


HuggingFace ▷ #general (235 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #today-im-learning (4 messages):

Links mentioned:


HuggingFace ▷ #cool-finds (8 messages🔥):

Links mentioned:


HuggingFace ▷ #i-made-this (7 messages):

Links mentioned:


HuggingFace ▷ #reading-group (6 messages):

Links mentioned:


HuggingFace ▷ #computer-vision (28 messages🔥):

Links mentioned:


HuggingFace ▷ #NLP (1 messages):

Link mentioned: Building a new tokenizer: Learn how to use the 🤗 Tokenizers library to build your own tokenizer, train it, then how to use it in the 🤗 Transformers library.This video is part of the...


HuggingFace ▷ #diffusion-discussions (16 messages🔥):

Links mentioned:


OpenRouter (Alex Atallah) ▷ #announcements (3 messages):

Links mentioned:


OpenRouter (Alex Atallah) ▷ #app-showcase (1 message):

Link mentioned: Rubik's AI - AI research assistant & Search Engine: no description found


OpenRouter (Alex Atallah) ▷ #general (278 messages🔥🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (178 messages🔥🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-questions (3 messages):

Link mentioned: PPO / Reinforce Trainers by vwxyzjn · Pull Request #1540 · huggingface/trl: This PR supports the REINFORCE RLOO trainers in https://arxiv.org/pdf/2402.14740.pdf. Note that REINFORCE's loss is a special case of PPO, as shown below it matches the REINFORCE loss presented i...
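For context on the PR's claim that REINFORCE's loss is a special case of PPO, here is a one-line sketch in standard notation (ours, not taken from the PR). PPO's clipped surrogate objective is

```latex
L^{\mathrm{PPO}}(\theta) = \mathbb{E}_t\!\left[\min\!\big(r_t(\theta)\,\hat{A}_t,\ \mathrm{clip}(r_t(\theta),\,1-\epsilon,\,1+\epsilon)\,\hat{A}_t\big)\right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}.
```

With a single gradient step per batch, \(\theta = \theta_{\mathrm{old}}\) at the point of differentiation, so \(r_t(\theta) = 1\), the clip is inactive, and the gradient reduces to the REINFORCE policy-gradient estimator \(\mathbb{E}_t\big[\hat{A}_t \nabla_\theta \log \pi_\theta(a_t \mid s_t)\big]\).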


Interconnects (Nathan Lambert) ▷ #random (20 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #reads (5 messages):


Eleuther ▷ #general (30 messages🔥):

Links mentioned:


Eleuther ▷ #research (36 messages🔥):

Links mentioned:


Eleuther ▷ #scaling-laws (119 messages🔥🔥):

Links mentioned:


Eleuther ▷ #interpretability-general (4 messages):


Eleuther ▷ #gpt-neox-dev (1 message):

oleksandr07173: Hello


Modular (Mojo 🔥) ▷ #general (29 messages🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #💬︱twitter (2 messages):


Modular (Mojo 🔥) ▷ #📺︱youtube (4 messages):

Links mentioned:


Modular (Mojo 🔥) ▷ #announcements (1 message):

Links mentioned:


Modular (Mojo 🔥) ▷ #🔥mojo (77 messages🔥🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #nightly (27 messages🔥):

Link mentioned: [CI] Add timeouts to workflows by JoeLoser · Pull Request #2644 · modularml/mojo: On Ubuntu tests, we're seeing some non-deterministic timeouts due to a code bug (either in compiler or library) from a recent nightly release. Instead of relying on the default GitHub timeout of ...


CUDA MODE ▷ #triton (13 messages🔥):

Link mentioned: [TUTORIALS] tune flash attention block sizes (#3892) · triton-lang/triton@702215e: no description found


CUDA MODE ▷ #cuda (7 messages):

Link mentioned: cccl/.clangd at main · NVIDIA/cccl: CUDA C++ Core Libraries. Contribute to NVIDIA/cccl development by creating an account on GitHub.


CUDA MODE ▷ #torch (10 messages🔥):

Link mentioned: whisper/whisper/model.py at main · openai/whisper: Robust Speech Recognition via Large-Scale Weak Supervision - openai/whisper


CUDA MODE ▷ #beginner (6 messages):

Link mentioned: Lecture 3: Getting Started With CUDA for Python Programmers: Recording on Jeremy's YouTube https://www.youtube.com/watch?v=nOxKexn3iBoSupplementary Content: https://github.com/cuda-mode/lecture2/tree/main/lecture3Speak...


CUDA MODE ▷ #pmpp-book (6 messages):

Link mentioned: Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.


CUDA MODE ▷ #off-topic (1 message):

shikhar_7985: found an old one from the internet's basement


CUDA MODE ▷ #triton-puzzles (2 messages):


CUDA MODE ▷ #llmdotc (89 messages🔥🔥):

Links mentioned:


CUDA MODE ▷ #lecture-qa (2 messages):


LlamaIndex ▷ #blog (6 messages):

Links mentioned:


LlamaIndex ▷ #general (104 messages🔥🔥):

- **Metadata in `query` method leaves users confused**: A member questioned if metadata must be passed during the `query` method after embedding it in `TextNode`. Clarifications revealed that **metadata filtering** can be handled internally by LlamaIndex, but any specific usage like URLs must be added manually.
- **Unexpected token error in frontend response**: A user faced an issue where the frontend stops outputting the AI's response mid-message, displaying `Unexpected token U`. It was suggested to inspect the actual response in the network tab or manually `console.log` the response before parsing.
- **Error handling with Qdrant vectors and postprocessors**: A user's attempt to create a new postprocessor with the Qdrant vector store was met with a `ValidationError`: a `BaseDocumentStore` was expected. The solution involved correctly identifying and passing the vector store within the proper context.
- **Confusion about LlamaIndex implementation updates**: Members discussed updating the sec-insights repo and LlamaIndex from 0.9.7 to newer versions, suggesting the work may mostly involve updating imports; one member offered to assist with the version upgrade changes.
- **Job search assistant using LlamaIndex**: An article on building a job search assistant with LlamaIndex and MongoDB was shared, offering a detailed tutorial and project repository. The project aims to enhance the job search experience using AI-driven chatbots and **Retrieval-Augmented Generation**.
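On the `Unexpected token U` bullet above: that message is typically `JSON.parse` choking on a non-JSON body, often a plain-text error starting with "U" (e.g. "Unauthorized"). A minimal sketch of the suggested debugging step, with a hypothetical helper name of our choosing:

```javascript
// Hypothetical helper (name ours, not from the thread): log the raw body
// before JSON.parse so a non-JSON reply -- e.g. a plain-text "Unauthorized"
// error page, which yields "Unexpected token U" -- is visible instead of
// silently breaking the frontend's response handling.
function parseResponseSafely(rawBody) {
  console.log("raw response:", rawBody); // same idea as checking the network tab
  try {
    return JSON.parse(rawBody);
  } catch (err) {
    // Re-throw with a slice of the offending payload, not just the parse error.
    throw new Error(`Non-JSON response (${err.message}): ${rawBody.slice(0, 80)}`);
  }
}
```

Logging or inspecting the raw response first usually makes the real failure (an auth error, an HTML error page, a truncated stream) obvious.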

Links mentioned:


LAION ▷ #general (101 messages🔥🔥):

Links mentioned:


LAION ▷ #research (3 messages):

Link mentioned: Veo: Veo is our most capable video generation model to date. It generates high-quality, 1080p resolution videos that can go beyond a minute, in a wide range of cinematic and visual styles.


OpenInterpreter ▷ #general (52 messages🔥):


OpenInterpreter ▷ #O1 (18 messages🔥):


LangChain AI ▷ #general (47 messages🔥):

Link mentioned: Issues · langchain-ai/langchain: 🦜🔗 Build context-aware reasoning applications. Contribute to langchain-ai/langchain development by creating an account on GitHub.


LangChain AI ▷ #langchain-templates (1 message):


LangChain AI ▷ #share-your-work (5 messages):

Links mentioned:


LangChain AI ▷ #tutorials (2 messages):

Link mentioned: Build a RAG pipeline for your blog with LangChain, OpenAI and Pinecone: You can chat with my writing and ask me questions I've already answered even when I'm not around


OpenAccess AI Collective (axolotl) ▷ #general (24 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (8 messages🔥):


OpenAccess AI Collective (axolotl) ▷ #general-help (2 messages):


OpenAccess AI Collective (axolotl) ▷ #runpod-help (1 message):


OpenAccess AI Collective (axolotl) ▷ #axolotl-help-bot (1 message):


OpenAccess AI Collective (axolotl) ▷ #axolotl-phorm-bot (7 messages):

Link mentioned: OpenAccess-AI-Collective/axolotl | Phorm AI Code Search: Understand code, faster.


Datasette - LLM (@SimonW) ▷ #ai (29 messages🔥):


Datasette - LLM (@SimonW) ▷ #llm (1 message):

simonw: https://twitter.com/simonw/status/1790121870399782987


tinygrad (George Hotz) ▷ #learn-tinygrad (24 messages🔥):

Links mentioned:


DiscoResearch ▷ #general (17 messages🔥):

Links mentioned:


Cohere ▷ #general (8 messages🔥):


Cohere ▷ #project-sharing (2 messages):


LLM Perf Enthusiasts AI ▷ #general (6 messages):


LLM Perf Enthusiasts AI ▷ #gpt4 (3 messages):

Link mentioned: Introducing GPT-4o: OpenAI Spring Update – streamed live on Monday, May 13, 2024. Introducing GPT-4o, updates to ChatGPT, and more.


Skunkworks AI ▷ #announcements (1 message):


Skunkworks AI ▷ #off-topic (1 message):

pradeep1148: https://www.youtube.com/watch?v=9pHyH4XDAYk


Alignment Lab AI ▷ #fasteval-dev (1 message):


AI Stack Devs (Yoko Li) ▷ #paper-spam (1 message):

angry.penguin: nice, AK is back