Frozen AI News archive

Cohere Command R+, Anthropic Claude Tool Use, OpenAI Finetuning

**Cohere** launched **Command R+**, a **104B dense model** with **128k context length** focused on **RAG**, **tool use**, and **multilingual** capabilities across **10 key languages**. It supports **multi-step tool use** and offers open weights for research. **Anthropic** introduced **tool use in beta** for **Claude**, supporting over **250 tools**, with new cookbooks for practical applications. **OpenAI** enhanced its fine-tuning API with new upgrades and case studies from Indeed, SK Telecom, and Harvey, promoting DIY fine-tuning and custom model training. **Microsoft** achieved a quantum computing breakthrough with an **800x error rate improvement** and the most usable qubits to date. **Stability AI** released **Stable Audio 2.0**, improving audio generation quality and control. The **Opera browser** added local inference support for large language models such as **Meta's Llama**, **Google's Gemma**, and **Vicuna**. Discussions on Reddit highlighted **Gemini's large context window**, analysis of **GPT-3.5-Turbo**'s model size, and a battle simulation between **Claude 3** and **ChatGPT** using local 7B models like **Mistral** and **Gemma**.


Busy day today.

  1. The at least $500m richer Cohere launched a fast-follow to last month's Command R with Command R+ (official blog, weights). It's a 104B dense model with 128k context length focused on RAG, tool-use, and multilingual ("10 key languages") use cases. Weights are open for research, but Aidan says "just reach out" if you want to license it (instead of paying their $3/$15 per mtok pricing). It now supports multi-step tool use.
  2. The $2.75B richer Anthropic launched tool use in beta as previously promised (official docs). The extensive docs come with several notable features, most notably the ability to handle over 250 tools, which enables a very different function calling architecture than before, presumably thanks to context-length and recall improvements over the past year. For more details see their 3 new cookbooks:
  3. OpenAI, which hasn't raised anything in the last month (that we know of), added a bunch of very welcome upgrades to the very MVPish finetuning experience, together with 3 case studies from Indeed, SK Telecom, and Harvey that basically say "you can now DIY better, but also we are open for business to finetune and train your stuff".
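The multi-step tool use both Cohere and Anthropic are shipping ultimately rests on a name-to-function registry on the host side: the app advertises tool schemas with each request and routes the model's tool calls back to local code. A minimal, hypothetical Python sketch (the tool name, schema, and `dispatch` helper are all made up for illustration and are not any vendor's SDK):

```python
from typing import Callable

# Hypothetical local tool registry: the host app keeps a name -> callable
# lookup plus a JSON-schema spec to send along with each model request.
TOOLS: dict[str, dict] = {}

def register_tool(name: str, description: str, schema: dict):
    def wrap(fn: Callable):
        TOOLS[name] = {"fn": fn, "spec": {
            "name": name,
            "description": description,
            "input_schema": schema,
        }}
        return fn
    return wrap

@register_tool("get_weather", "Look up current weather for a city",
               {"type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"]})
def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stub result for illustration

def tool_specs() -> list[dict]:
    """The list of tool definitions sent with every model request."""
    return [t["spec"] for t in TOOLS.values()]

def dispatch(tool_use: dict) -> str:
    """Route a model-emitted tool-use block to the matching local function."""
    tool = TOOLS[tool_use["name"]]
    return tool["fn"](**tool_use["input"])

print(dispatch({"name": "get_weather", "input": {"city": "Paris"}}))
```

Note that a registry like this scales to the 250+ tools Anthropic advertises without changing the dispatch loop; the hard part the vendors solved is the model reliably picking the right name out of that many schemas.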



Table of Contents

[TOC]


AI Reddit Recap

Across r/LocalLlama, r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence. Comment crawling still not implemented but coming soon.

AI Technology Advancements

Model Capabilities & Comparisons

AI Research & Education

AI Tools & Applications

AI Memes & Humor

AI Twitter Recap

All recaps done by Claude 3 Opus, best of 4 runs. We are working on clustering and flow engineering with Haiku.

Cohere Command R+ Release

DALL-E 3 Inpainting Release

Mixture-of-Depths for Efficient Transformers

RAG and Agent Developments

Open-Source Models and Frameworks

Memes and Humor


AI Discord Recap

A summary of Summaries of Summaries

  1. LLM Advancements and Integrations:

    • Cohere unveils Command R+, a 104B parameter multilingual LLM optimized for enterprise use with advanced Retrieval Augmented Generation (RAG) and multi-step tool capabilities, sparking interest in its performance compared to other models.
    • JetMoE-8B represents an affordable milestone, trained for under $0.1 million while surpassing Meta AI's LLaMA2 performance using only 2.2B active parameters.
    • Discussions around integrating quantization methods like HQQ with gpt-fast, exploring 4/3-bit quantization approaches such as the quantized Mixtral-8x7B-Instruct model.
  2. Optimizing LLM Inference and Training:

  3. LLM Evaluation and Benchmarking:

  4. Open-Source AI Frameworks and Tools:

    • LlamaIndex unveils cookbooks guiding RAG system building with MistralAI, including routing and query decomposition.
    • Koyeb enables effortless global scaling of LLM apps by connecting GitHub repos to deploy serverless apps.
    • SaladCloud offers a managed container service for AI/ML workloads to avoid high cloud costs.
    • The transformer-heads GitHub repo provides tools for extending LLM capabilities by attaching new model heads.
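The 4/3-bit quantization mentioned in the recap above can be sketched as plain uniform round-to-grid quantization. This is not HQQ's actual algorithm (which additionally optimizes zero-points and scales per weight group); it only illustrates the basic low-bit idea and its bounded rounding error:

```python
# Minimal sketch of uniform 4-bit weight quantization: map each float
# weight onto one of 16 grid points, store the 4-bit index plus a scale
# and offset, and reconstruct approximately on the way back.
def quantize_4bit(weights: list[float]) -> tuple[list[int], float, float]:
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 15  # 4 bits -> 16 levels (0..15)
    q = [round((w - lo) / scale) for w in weights]
    return q, scale, lo

def dequantize(q: list[int], scale: float, zero: float) -> list[float]:
    return [v * scale + zero for v in q]

w = [-0.8, -0.1, 0.0, 0.3, 0.75]
q, scale, zero = quantize_4bit(w)
w_hat = dequantize(q, scale, zero)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
assert max_err <= scale / 2  # rounding error bounded by half a grid step
```

Real schemes quantize per small group of weights so the scale tracks local ranges; that grouping, plus smarter zero-point fitting, is where methods like HQQ differ from this toy.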

PART 1: High level Discord summaries

Stability.ai (Stable Diffusion) Discord


Perplexity AI Discord

API Excitement Meets Payment Puzzles: A perplexing payment issue cropped up for a Perplexity API user whose transaction was stuck as "Pending" without updating the account balance. Meanwhile, discussions revolved around the potential of APIs and the choice between a Pro Subscription and pay-as-you-go API, with opinions favoring the subscription for initial business ideation due to cost predictability.

Model Mashing Madness: Users dived into model preferences, favoring a balance between a larger message count and an adequate context window. They also tackled the challenge of model limitations with complex programming languages like Rust and custom "metaprompt" strategies for structured output.

Content Sharing Caveat: A note was made to ensure threads are set to shareable when posting content on Discord, facilitating wider community engagement.

Thirst for Source Links in Sonar Model: Inquiries were made concerning the sonar-medium-online model's ability to return source links with data, but a definitive timeline on the feature's implementation remains elusive.

LLM Leaderboard Quirks and Queries: The LLM Leaderboard sparked an analytical discourse on model rankings with a dash of humor over model name mishaps, pointing to the significance of clarity in system prompts for better AI performance.


OpenAI Discord


Unsloth AI (Daniel Han) Discord

CmdR Set to Join the Unsloth Ranks: The addition of CmdR support to Unsloth AI is in progress, with the community eagerly awaiting its integration post current task completions. The anticipation ties into plans for an open-source CPU optimizer, slated for reveal on April 22, to enhance AI model accessibility for those with limited GPU resources.

Interfacing Innovation with Continue's Autocomplete: A new tab autocomplete feature is in experimental pre-release for the Continue extension, designed to streamline coding in VS Code and JetBrains by consulting language models directly within the dev environment.

Error Extermination and Optimization Dialogues: AI engineers shared solutions to naming-related tokenizer errors and discussed the model.save_pretrained_merged and model.push_to_hub_merged functions for seamless model saving and sharing on Hugging Face. Users encountering an AttributeError in GemmaForCausalLM were directed to update Unsloth for a fix.

Stumbling Blocks in Saving and Server-Side Setup: Users navigated challenges with GGUF conversions and Docker setups, tackling issues like python3.10-dev dependencies and workaround strategies for memory errors during finetuning on different platforms.

Diving into Unsloth Studio's Next Iteration Soon: An update on Unsloth Studio's release push is set for mid next month due to current bug fixes, ensuring ongoing compatibility with Google Colab alongside improvements for developers leveraging the Studio's capabilities.


Latent Space Discord

Stable Audio Hits a New High: Stability AI launched Stable Audio 2.0, enabling creation of lengthy, high-quality music tracks from a single prompt. Visit stableaudio.com to test the model and find further details in their blog post.

AssemblyAI Outperforms Whisper: AssemblyAI announced Universal-1, a speech recognition model surpassing Whisper-3 by achieving 13.5% better accuracy and demonstrating up to 30% decrease in hallucinations. The model processes an hour of audio in a mere 38 seconds and is available for trial at AssemblyAI's playground.

Enhance Your Images with ChatGPT Plus: Users of ChatGPT Plus now possess the ability to modify DALL-E-generated images and prompts, available on both web and iOS platforms. Full guidance on usage is provided in their help article.

AI Agents as Scalable Microservices: Discussions focused on utilizing event-driven architecture to build scalable AI agents, with the Actor Model cited as an inspiration, and a Golang framework presented for collaborative feedback.
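The actor-style agent architecture described above can be sketched in a few lines (the discussion centered on a Golang framework; this is a language-agnostic toy in Python with all names hypothetical): each agent owns a mailbox and a worker thread, and agents interact only by message passing, which is what makes them scale like microservices.

```python
import queue
import threading

# Each "agent" is an actor: a mailbox (queue) plus a worker thread that
# reacts to messages and may forward messages to other agents.
class Agent:
    def __init__(self, name, handler):
        self.name, self.handler = name, handler
        self.inbox = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, msg):
        self.inbox.put(msg)

    def _run(self):
        while True:
            msg = self.inbox.get()
            if msg is None:  # poison pill: shut the actor down
                break
            self.handler(msg)

results = queue.Queue()
summarizer = Agent("summarizer", lambda m: results.put(m["text"].upper()))
router = Agent("router", lambda m: summarizer.send(m))

router.send({"text": "hello agents"})
print(results.get(timeout=2))  # -> HELLO AGENTS
```

Because state lives inside each actor and only messages cross boundaries, the same topology maps cleanly onto an event bus or separate services in production.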

Opera One Downloads AI Directly: Opera integrates the ability for users to run large language models (LLMs) locally, beginning with Opera One on the developer stream, harnessing the Ollama framework, as detailed by TechCrunch.

DSPy Steals the Spotlight: Members evaluated DSPy's performance in optimizing prompts for foundation models, focusing on model migration and optimization while being cautious of API rate limits. A detailed study of Devin surfaced numerous AI project opportunities, with keen interest in diverse applications ranging from voice-integrated iOS apps to documentation overhaul initiatives.


Nous Research AI Discord


Modular (Mojo 🔥) Discord

Mojo on the Move: Engineers shared that Mojo now runs on Android devices like Snapdragon 685 CPUs and discussed integrating Mojo with ROS 2, accentuating Mojo's memory safety over Python, particularly in robotics where Python’s GIL limits Nvidia Jetson hardware performance.

Performance Breakthroughs and Best Practices: Significant library performance improvements were noted, dropping execution times to minutes and beating previous Golang benchmarks. Methods such as pre-setting dictionary capacities were advised for optimization, and designers of specialized string-sorting algorithms are encouraged to keep pace with Mojo's latest versions, as seen at mzaks/mojo-sort.

From Parser to FASTQ: BlazeSeq🔥, a new feature-complete FASTQ parser, has been introduced, providing a CLI-compatible parser that conforms to BioJava and Biopython benchmarks. Enhanced file handling is promised by the buffered line iterator they implemented, indicating a move to a robust future standard for file interactions, showcased on GitHub.

Mojo Merger Madness: Innovative ideas on model merging and conditional conformance in Mojo used @conditional annotations for optional trait implementations, while merchandise ideas like Mojo-themed plushies stirred community excitement. Memory management optimizations were considered, examining potential changes to how Optional returns values in the nightly version of Mojo's standard library.

Modular Updates Galore: Max⚡ and Mojo🔥 24.2 release brings open-sourced standard libraries and nightly builds with community contribution. Docker build issues in version 24.3 are addressed, while continued development discussions recommend conditional conformance and error handling strategies for future roadmap considerations.


LM Studio Discord

Bold Boosts with ROCm: AMD hardware sees a massive increase in speed from 13 to 65 tokens/second when engineered with the ROCm preview, highlighting the significant potential of the right software interface for AMD GPUs.

Mixtral, Not a Mistral Mistake: Mixtral's distinct identity as a MoE model, combining eight 7B experts into a nominal 56B model (roughly 47B actual parameters, since non-expert layers are shared), reflects a strategic approach unlike the standard Mistral 7B. Meanwhile, running Mixtral 8x7b on a 24GB VRAM NVIDIA 3090 GPU may hit speed snafus, yet it's a viable venture.
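The MoE design is what keeps Mixtral fast despite its size. A toy sketch of the top-2 gating at the heart of such a layer, with made-up scalar "experts" standing in for full feed-forward blocks:

```python
import math
import random

# Sketch of top-2 expert gating: the router scores every expert, only the
# two best run, and their outputs are blended by softmaxed router weights.
random.seed(0)
NUM_EXPERTS = 8
experts = [lambda x, k=k: x * (k + 1) for k in range(NUM_EXPERTS)]

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def moe_layer(x: float, router_logits: list[float]) -> float:
    # pick the two experts with the highest router logits
    top2 = sorted(range(NUM_EXPERTS), key=lambda i: router_logits[i])[-2:]
    gates = softmax([router_logits[i] for i in top2])
    return sum(g * experts[i](x) for g, i in zip(gates, top2))

logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
y = moe_layer(2.0, logits)
```

Only two of the eight experts execute per token, which is why Mixtral's active parameter count, and hence its latency, is closer to a 13B model than to its total size.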

LM Studio 0.2.19 Courts Embeddings: The fresh-out-of-the-lab LM Studio version 0.2.19 Preview 1 now supports local embedding models, opening up new possibilities for AI practitioners. Despite lacking ROCm support in its current preview, Windows, Linux, and Mac users can grab their respective builds from the provided links.

Engineers Tackle Odd Model Behavior: Discourse on an AI model dishing out bizarre, task-unrelated responses uncovers potential mishaps in the model's training, signaling a programming predicament in need of debugging prowess.

CrewAI Collision with JSONDecodeError: Encountering a JSONDecodeError using CrewAI suggests a potential misstep in JSON formatting, a puzzle piece that AI engineers must properly place to avoid jeopardizing data parsing processes.


Eleuther Discord

Transformers Takeover at Stanford: The Stanford CS25 seminar on Transformers is open to the public for live audits and recorded sessions, with industry experts leading the discussions on LLM architectures and applications. Interested individuals can participate via Zoom, access the course website, or watch recordings on YouTube.

Skeptical About Efficiency Claims: The community voiced skepticism about the Free-pipeline Fast Inner Product (FFIP) algorithm's performance claims, noted in a journal publication, which promises efficiency by halving multiplications in AI hardware architectures.

CUDA Conundrums and Code Conflicts: A member troubleshooting a RuntimeError with CUDA identified apex as the issue when using the LM eval harness on H100 GPUs, recommending upgrades to CUDA 11.8 and other adjustments for stability.

Next-Gen AI Training Techniques Touted: An arXiv paper introduces dynamic FLOP allocation in transformers, potentially optimizing performance by diverging from uniform allocation across the sequence. Additionally, cloud services like AWS and Azure support advanced training schemes, with AWS's Gemini mentioned explicitly.
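The dynamic FLOP allocation idea (Mixture-of-Depths) can be sketched as capacity-limited routing: a router scores every token, only the top-k pass through the expensive block, and the rest ride the residual stream unchanged. A toy sketch under the assumption of per-token router scores, with a stand-in block function:

```python
# Toy Mixture-of-Depths routing: process only the `capacity` highest-scoring
# tokens through `block`; all other tokens skip it via the residual path.
def mod_layer(tokens, scores, capacity, block):
    chosen = set(sorted(range(len(tokens)),
                        key=lambda i: scores[i])[-capacity:])
    return [block(t) if i in chosen else t
            for i, t in enumerate(tokens)]

tokens = [1.0, 2.0, 3.0, 4.0]
scores = [0.9, 0.1, 0.8, 0.2]   # hypothetical router scores
out = mod_layer(tokens, scores, capacity=2, block=lambda t: t * 10)
# tokens 0 and 2 are processed; tokens 1 and 3 skip the block
```

The compute saving is the point: with capacity k out of n tokens, the block's cost per layer drops by roughly k/n while the sequence length stays intact.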

Elastic and Fault-Tolerant How-To: Details on establishing fault-tolerant and elastic job launches with PyTorch were shared, with documentation available at the PyTorch elastic training quickstart guide.


LAION Discord

In the research domain:


OpenAccess AI Collective (axolotl) Discord

Patch Perfect: A noteworthy GitHub bug was swiftly eradicated in the OpenAccess AI Collective's axolotl repository, with the commit history accessible via GitHub Commit 5760099. Meanwhile, a README Table of Contents mismatch was flagged, prompting a cleanup.

Datasets and Model Dialogues: Queries about optimal datasets for training Mistral 7B models led to a recommendation for the OpenOrca dataset, while debates on fine-tuning practices leaned towards the strategy of prioritizing 'completion' before 'instructions'. Discussions spotlighted the potency of supervised fine-tuning (SFT) over continual pre-training (CPT) when armed with high-quality instructional samples.

Bot-tled Service: The Axolotl help bot hit a snag, going offline and sparking a wave of mirthful member reactions, yet specifics behind the incident weren't disclosed. The bot was previously offering guidance on the integration of Qwen2 with Qlora and addressing challenges related to dataset streaming and multi-node fine-tuning within Docker environments.

AI Dialogues: The Collective's general channel buzzed with tech talk—from rapid model feedback services like Chaiverse to the novel resources for adding heads to Transformer models found in the GitHub repository for transformer-heads. CohereForAI unveiled a behemoth 104 billion parameter C4AI Command R+ model with specialized capabilities revealed on Hugging Face, stirring conversations about the financial implications of running massive models.

Infrastructure Innovations: SaladCloud's recent launch of a fully-managed container service for AI/ML workloads was recognized as a notable entrance, giving developers an edge against sky-high cloud costs and GPU shortages with affordable rates for inference at scale.


LlamaIndex Discord

AI Spellcheck Gets Real: Node.js code shared by a member for correcting spelling mistakes with the LlamaIndex Ollama package showed an AI model named 'mistral' fixing user errors, like "bkie" to "bike"; it runs locally against localhost:11434 without third-party services.
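The same localhost workflow can be sketched in Python against Ollama's HTTP API (`POST /api/generate` on port 11434 is Ollama's documented endpoint; the model name and prompt here are illustrative). The request is only constructed, not sent, so the sketch runs without a local server:

```python
import json
from urllib import request

# Build a request for Ollama's local /api/generate endpoint, the same
# localhost:11434 server the Node.js snippet used.
payload = {
    "model": "mistral",
    "prompt": "Correct the spelling; reply with only the fixed word: bkie",
    "stream": False,
}
req = request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

# With a local Ollama server running, uncomment to get the correction:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["response"])  # expect something like "bike"
```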

Llama's Culinary Code-Loaded Cookbook: A new culinary-themed guidebook series is unveiled for AI enthusiasts, demonstrating how to build RAG, agentic RAG, and agent-based systems with MistralAI, including routing and query decomposition. Grab your AI recipes here.

Exploration and Confusion in LlamaIndex: Discussions in the community raised concerns about issues from lacking knowledgegraphs pipeline support to unclear graphindex and graphdb integrations, and several members struggled with querying OpenSearch and implementing ReAct agents in llama_index.

AI Discussion Evolves Beyond Text: Engaging talks emerged about the potential of enhancing image processing with Retrieval-Augmented Generation (RAG) techniques, discussing applications ranging from CAPTCHA solutions to ensuring continuity in visual narratives like comics.

Scaling AI Deployment Made Convenient: Koyeb's platform was highlighted for effortlessly scaling LLM applications, directly connecting your GitHub repo to deploy serverless apps globally without managing infrastructure. Check out the service here.


HuggingFace Discord

Bold Repo Visibility Choices: HuggingFace has introduced settings for default repository visibility with options for public, private, or private-by-default for enterprises. The functionality is described in this tweet by Julien Chaumond.

Custom Quarto Publishing: HuggingFace now supports publishing with Quarto, as detailed in a tweet by Gordon Shotwell, with more information available on LinkedIn.

Summarization Struggles and Strategies: Users across channels discussed summarization challenges with GPT-2 and Hugging Face's pipeline, including ineffective length penalties and the search for prompt crafting that maximizes efficiency and result quality, even in CPU-only environments.

Innovations and Interactions in AI Circles: Excitement was shared for projects including Octopus 2, a model capable of function calls, and advancements in image processing with the new multi-subject image node pack from Salt. The community also highlighted academic discussions and resources, such as the potential of RAG for interviews and latency-reasoning trade-offs in production prompts, shared in Siddish's tweet.

Diffusion Model Dialogue Deliberates Depth: AI engineers explored creative implementations for diffusion models, discussing DiT with cross-attention for various data conditions, and considering Stable Diffusion modifications for tasks like stereo to depth map conversion, referring to the DiT paper and resources like Dino v2 GitHub and SD-Forge-LayerDiffuse GitHub.


tinygrad (George Hotz) Discord

Fishing for Compliments or Functionality?: Discord's switch from the whimsical fish logo to a more polished design sparked debate among members, with talk of matching the banner to the new aesthetic. George Hotz's logo changes seem to have left some nostalgic for the old one.

Sharding Optimizations In-Depth: George Hotz and community members explored optimization techniques and cross-GPU communications, facing challenges with launch latencies and data transfers. They examined the use of cudagraphs, peer-to-peer limitations, and the role of NV drivers.

Tinygrad Performance Milestone: Sharing performance benchmarks, it was revealed that Tinygrad achieved 53.4 tokens per second on a single 4090 GPU, marking 83% efficiency compared to gpt-fast. George Hotz indicated goals to further boost Tinygrad's performance.

Intel Hardware On The Horizon: Discussions on Intel GPU and NPU kernel drivers scrutinized various available drivers like 'gpu/drm/i915' and 'gpu/drm/xe', with anticipation for the performance and power efficiency that NPUs may bring when paired with CPUs.

Helpful Neural Net Education Hustle: The community found the Tinygrad tutorials to be a valuable starting point for neural network newbies and also recommended the JAX Autodidax tutorial, complete with a hands-on Colab notebook. Interest surged in adapting ColabFold or OmegaFold for Tinygrad, while also learning about PyTorch weight transfer methods.


OpenRouter (Alex Atallah) Discord


OpenInterpreter Discord


Interconnects (Nathan Lambert) Discord


Mozilla AI Discord


LangChain AI Discord

Crypto Chatbot Craze Calls for Coders: An individual is in search of developers with LLM training expertise to create a chatbot simulating human conversation, utilizing real-time crypto market data. The aim is to enable nuanced discussions reflecting the latest market shifts.

Math Symbol Extraction Without MathpixPDFLoader: Alternatives to MathpixPDFLoader for extracting math symbols from PDFs are in demand, as users seek new methods to handle this specific task effectively.

LangChain LCEL Logic Lessons: A discussion clarified the use of the '|' operator in LangChain's Expression Language (LCEL), which chains components like prompts and LLM outputs into complex sequences. The intricacies are further explored in Getting Started with LCEL.
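The '|' chaining that LCEL provides can be demystified with a pure-Python stand-in. This is not LangChain's implementation, just the operator-overloading pattern behind `prompt | llm | parser`: each stage defines `__or__` so the pipeline composes into one callable.

```python
# Pure-Python sketch of LCEL-style composition: each stage implements
# __or__ to build a pipeline whose invoke() feeds one stage's output
# into the next. The "llm" here is a fake model call for illustration.
class Runnable:
    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        return Runnable(lambda x: other.fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)

prompt = Runnable(lambda topic: f"Tell me a joke about {topic}")
llm = Runnable(lambda text: {"content": text.upper()})  # stand-in model
parser = Runnable(lambda msg: msg["content"])

chain = prompt | llm | parser
print(chain.invoke("bears"))  # -> TELL ME A JOKE ABOUT BEARS
```

The payoff of the pattern is that the composed chain is itself a Runnable, so sub-chains nest and can be swapped without touching the call site.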

Voice Apps Vocalizing AI Capabilities: Newly launched voice applications such as CallStar are prompting discussions around their interactivity and setup, powered by technologies like RetellAI, with community support via Product Hunt and Reddit platforms.

LangChain Quickstart Walkthrough Woes: Sharing the LangChain Quickstart Guide, a user provided example code for integrating LangChain with OpenAI, yet faced a NotFoundError indicating a missing resource. The community's technical acumen is requested to troubleshoot this setback.


CUDA MODE Discord


Datasette - LLM (@SimonW) Discord

A New Approach to AI Dialogues: Reflecting on conversational AI terminology, a guild member suggested "turns" as a better descriptor than "responses" for the messages in a dialogue, a choice prompted by exploring a logs.db database and resulting in a serendipitous pun on the database term 'turns'.

AI Product Evaluations Get a Thumbs Up: Guild members rallied around the importance of Hamel Husain's post on AI evaluations, which outlines strategies for creating domain-specific evaluation systems for AI and is considered potentially groundbreaking for new ventures.

SQL Query Assistant Plugin Eyes Transparency and Control: There's a pitch for making the evaluations of the Datasette SQL query assistant plugin visible and editable, aiming to enhance user interaction and control over the evaluation process.

Perusing the Future of Prompt Management: A debate is brewing over the best practices for AI prompt management, with potential patterns including localization, middleware, and microservices, suggesting different methods for integrating AI into larger systems.

High-Resolution API Details Exemplified: The Cohere LLM search API’s detailed JSON responses were spotlighted, providing an example of the granularity that can benefit AI developers, as demonstrated in a shared GitHub comment.


DiscoResearch Discord


Skunkworks AI Discord


PART 2: Detailed by-Channel summaries and links

Stability.ai (Stable Diffusion) ▷ #general-chat (910 messages🔥🔥🔥):

Links mentioned:


Perplexity AI ▷ #general (756 messages🔥🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (15 messages🔥):


Perplexity AI ▷ #pplx-api (42 messages🔥):

Links mentioned:


OpenAI ▷ #annnouncements (2 messages):

Link mentioned: Introducing improvements to the fine-tuning API and expanding our custom models program: We’re adding new features to help developers have more control over fine-tuning and announcing new ways to build custom models with OpenAI.


OpenAI ▷ #ai-discussions (494 messages🔥🔥🔥):

Links mentioned:


OpenAI ▷ #gpt-4-discussions (46 messages🔥):


OpenAI ▷ #prompt-engineering (27 messages🔥):


OpenAI ▷ #api-discussions (27 messages🔥):


Unsloth AI (Daniel Han) ▷ #general (306 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #random (5 messages):

Link mentioned: Continue - Claude, CodeLlama, GPT-4, and more - Visual Studio Marketplace: Extension for Visual Studio Code - Open-source autopilot for software development - bring the power of ChatGPT to your IDE


Unsloth AI (Daniel Han) ▷ #help (248 messages🔥🔥):

Links mentioned:


Latent Space ▷ #ai-general-chat (86 messages🔥🔥):

Links mentioned:


Latent Space ▷ #llm-paper-club-west (356 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #ctx-length-research (2 messages):


Nous Research AI ▷ #off-topic (9 messages🔥):


Nous Research AI ▷ #interesting-links (10 messages🔥):

Links mentioned:


Nous Research AI ▷ #general (150 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (58 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #project-obsidian (2 messages):


Nous Research AI ▷ #bittensor-finetune-subnet (2 messages):


Nous Research AI ▷ #rag-dataset (34 messages🔥):

Links mentioned:


Nous Research AI ▷ #world-sim (96 messages🔥🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #general (16 messages🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #💬︱twitter (4 messages):


Modular (Mojo 🔥) ▷ #ai (4 messages):

Link mentioned: GitHub - ros2-rust/ros2_rust: Rust bindings for ROS 2: Rust bindings for ROS 2 . Contribute to ros2-rust/ros2_rust development by creating an account on GitHub.


Modular (Mojo 🔥) ▷ #tech-news (3 messages):


Modular (Mojo 🔥) ▷ #🔥mojo (277 messages🔥🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #community-projects (3 messages):

Link mentioned: GitHub - MoSafi2/BlazeSeq: Contribute to MoSafi2/BlazeSeq development by creating an account on GitHub.


Modular (Mojo 🔥) ▷ #performance-and-benchmarks (11 messages🔥):


Modular (Mojo 🔥) ▷ #📰︱newsletter (2 messages):

Link mentioned: Modverse Weekly - Issue 28: Welcome to issue 28 of the Modverse Newsletter covering Featured Stories, the Max Platform, Mojo, & Community Activity.


Modular (Mojo 🔥) ▷ #nightly (17 messages🔥):

Links mentioned:


LM Studio ▷ #💬-general (171 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (39 messages🔥):

Links mentioned:


LM Studio ▷ #🧠-feedback (8 messages🔥):


LM Studio ▷ #🎛-hardware-discussion (23 messages🔥):

Links mentioned:


LM Studio ▷ #🧪-beta-releases-chat (19 messages🔥):

Links mentioned:


LM Studio ▷ #autogen (1 messages):


LM Studio ▷ #langchain (1 messages):


LM Studio ▷ #amd-rocm-tech-preview (9 messages🔥):


LM Studio ▷ #crew-ai (24 messages🔥):


Eleuther ▷ #general (194 messages🔥🔥):

Links mentioned:


Eleuther ▷ #research (51 messages🔥):

Links mentioned:


Eleuther ▷ #interpretability-general (7 messages):

Links mentioned:


Eleuther ▷ #lm-thunderdome (17 messages🔥):

Link mentioned: Google Colaboratory: no description found


Eleuther ▷ #gpt-neox-dev (3 messages):

Link mentioned: Quickstart — PyTorch 2.2 documentation: no description found


LAION ▷ #general (158 messages🔥🔥):

Links mentioned:


LAION ▷ #research (10 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general (67 messages🔥🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (4 messages):

Link mentioned: fix toc · OpenAccess-AI-Collective/axolotl@5760099: no description found


OpenAccess AI Collective (axolotl) ▷ #general-help (11 messages🔥):


OpenAccess AI Collective (axolotl) ▷ #datasets (2 messages):


OpenAccess AI Collective (axolotl) ▷ #announcements (1 messages):


OpenAccess AI Collective (axolotl) ▷ #axolotl-help-bot (62 messages🔥🔥):

Links mentioned:


LlamaIndex ▷ #announcements (1 messages):

jerryjliu0: webinar is in 15 mins! ^^


LlamaIndex ▷ #blog (6 messages):

Link mentioned: IKI AI – Intelligent Knowledge Interface: Smart library and Knowledge Assistant for professionals and teams.


LlamaIndex ▷ #general (112 messages🔥🔥):

Links mentioned:


LlamaIndex ▷ #ai-discussion (6 messages):


HuggingFace ▷ #announcements (3 messages):


HuggingFace ▷ #general (48 messages🔥):

Links mentioned:


HuggingFace ▷ #today-im-learning (1 messages):

Link mentioned: Tweet from Siddish (@siddish_): stream with out reasoning -> dumb response 🥴 stream till reasoning -> slow response 😴 a small LLM hack: reason most likely scenarios proactively while user is taking their time


HuggingFace ▷ #cool-finds (5 messages):

Links mentioned:


HuggingFace ▷ #i-made-this (20 messages🔥):

Links mentioned:


HuggingFace ▷ #reading-group (5 messages):


HuggingFace ▷ #computer-vision (8 messages🔥):


HuggingFace ▷ #NLP (21 messages🔥):


HuggingFace ▷ #diffusion-discussions (11 messages🔥):


tinygrad (George Hotz) ▷ #general (86 messages🔥🔥):

Links mentioned:


tinygrad (George Hotz) ▷ #learn-tinygrad (23 messages🔥):

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (83 messages🔥🔥):

Links mentioned:


OpenInterpreter ▷ #general (17 messages🔥):

Links mentioned:


OpenInterpreter ▷ #O1 (55 messages🔥🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (37 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-drama (3 messages):


Interconnects (Nathan Lambert) ▷ #random (21 messages🔥):

Links mentioned:


Mozilla AI ▷ #llamafile (57 messages🔥🔥):

Links mentioned:


LangChain AI ▷ #general (36 messages🔥):

Links mentioned:


LangChain AI ▷ #langserve (2 messages):

Links mentioned:


LangChain AI ▷ #langchain-templates (1 messages):


LangChain AI ▷ #share-your-work (5 messages):

Links mentioned:


LangChain AI ▷ #tutorials (1 messages):

Link mentioned: Quickstart | 🦜️🔗 Langchain: In this quickstart we'll show you how to:


CUDA MODE ▷ #triton (3 messages):

Links mentioned:


CUDA MODE ▷ #torch (4 messages):

Link mentioned: keras-benchmarks/benchmark/torch_utils.py at main · haifeng-jin/keras-benchmarks: Contribute to haifeng-jin/keras-benchmarks development by creating an account on GitHub.


CUDA MODE ▷ #algorithms (1 messages):

iron_bound: : insert rant about repeatability in science here :


CUDA MODE ▷ #beginner (3 messages):

Link mentioned: CUDA MODE: A CUDA reading group and community https://discord.gg/cudamode Supplementary content here https://github.com/cuda-mode Created by Mark Saroufim and Andreas Köpf


CUDA MODE ▷ #ring-attention (1 messages):

Link mentioned: GitHub - OpenNLPLab/LASP: Linear Attention Sequence Parallelism (LASP): Linear Attention Sequence Parallelism (LASP). Contribute to OpenNLPLab/LASP development by creating an account on GitHub.


CUDA MODE ▷ #hqq (19 messages🔥):

Links mentioned:


CUDA MODE ▷ #triton-viz (13 messages🔥):


Datasette - LLM (@SimonW) ▷ #ai (17 messages🔥):

Links mentioned:


Datasette - LLM (@SimonW) ▷ #llm (1 messages):


DiscoResearch ▷ #benchmark_dev (10 messages🔥):

Links mentioned:


DiscoResearch ▷ #discolm_german (3 messages):

Links mentioned:


Skunkworks AI ▷ #general (2 messages):

Link mentioned: Mixture-of-Depths: Dynamically allocating compute in transformer-based language models: Transformer-based language models spread FLOPs uniformly across input sequences. In this work we demonstrate that transformers can instead learn to dynamically allocate FLOPs (or compute) to specific ...


Skunkworks AI ▷ #finetuning (1 messages):


Skunkworks AI ▷ #papers (1 messages):

carterl: https://arxiv.org/abs/2404.02684


Alignment Lab AI ▷ #general-chat (1 messages):

jinastico: <@748528982034612226>