Frozen AI News archive

Not much happened today

**RAGFlow**, a deep-document-understanding RAG engine, was open sourced with **16.3k context length** and natural-language instruction support. **Jamba v0.1**, a **52B parameter** MoE model from **AI21 Labs**, was released to mixed user feedback. **Command-R** from **Cohere** is now available in the Ollama library. An analysis of the **GPT-3.5-Turbo** architecture suggests roughly **7 billion parameters** and an embedding size of **4096**, comparable to OpenChat-3.5-0106 and Mixtral-8x7B. AI chatbots, including **GPT-4**, outperformed humans at persuasion in debates, while **Mistral-7B** made amusing mistakes on a math riddle. Hardware highlights include a discounted **HGX H100 640GB** machine with 8 H100 GPUs bought for $58k, and CPU comparisons between the **Epyc 9374F** and **Threadripper 1950X** for LLM inference. GPU recommendations for local LLMs center on VRAM and inference speed, with users testing the **Midnight-miqu-70b-v1.0.q5_k_s** model on a **4090**. Stable Diffusion is influencing gaming habits, and an AI-art evaluation shows bias favoring human-labeled art.
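The parameter estimate above can be sanity-checked with the usual dense-transformer rule of thumb of roughly 12·L·d² non-embedding parameters. A sketch; the 32-layer depth and 32k vocabulary below are illustrative assumptions, not figures from the analysis:

```python
def approx_transformer_params(n_layers: int, d_model: int, vocab_size: int = 32_000) -> int:
    """Rough dense-transformer parameter count: ~12 * L * d^2 for the
    attention and MLP blocks, plus the token-embedding matrix."""
    return 12 * n_layers * d_model**2 + vocab_size * d_model

# With d_model = 4096 and a typical 32-layer stack, the estimate lands
# near 7B, consistent with the GPT-3.5-Turbo speculation above.
estimate = approx_transformer_params(n_layers=32, d_model=4096)
```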

And congrats to Logan on joining Google.


Table of Contents

[TOC]


AI Reddit Recap

Across r/LocalLlama, r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence. Comment crawling still not implemented but coming soon.

Open Source Models and Libraries

Model Performance and Capabilities

Hardware and Performance

Stable Diffusion and Image Generation

Miscellaneous

Memes and Humor

AI Twitter Recap

all recaps done by Claude 3 Opus, best of 4 runs. We are working on clustering and flow engineering with Haiku.

AI Models and Architectures

Retrieval Augmented Generation (RAG)

Tooling and Infrastructure

Research and Techniques

Memes and Humor


AI Discord Recap

A summary of Summaries of Summaries


PART 1: High level Discord summaries

Perplexity AI Discord


LAION Discord

Gecko Climbs to New Heights in Text Embedding: The new Gecko model demonstrates robust performance on the Massive Text Embedding Benchmark (MTEB) and may accelerate diffusion model training, as detailed in its Hugging Face paper and arXiv abstract. Interest in Gecko's practical application is reflected in queries about the availability of its weights.

Aurora-M Lights Up Multilingual LLM Space: The Aurora-M model, with 15.5B parameters, is geared towards multilingual tasks while adhering to guidelines set by the White House EO and is celebrated for processing over 2 trillion training tokens, as highlighted on Twitter and arXiv.

Hugging Face's Diffusers Under the Spotlight: Contributions to Hugging Face's Diffusers stirred debates around efficiency, with a focus on a PR regarding autocast for CUDA in Diffusers and incomplete unification in pipelines, as seen in discussion #551 and PR #7530.

PyTorch Gears Up with 2.6 Stirring Curiosity: Discussions around updates in PyTorch versioning sparked interest, especially regarding the silent addition of bfloat16 support in PyTorch 2.3, and anticipation for new features in the upcoming PyTorch 2.6. Noteworthy contributions include a critique of autocast performance with details in a GitHub thread.
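For context on what bfloat16 changes: it keeps fp32's sign bit and 8 exponent bits but only 7 mantissa bits, so conversion amounts to dropping the low 16 bits of the fp32 encoding. A dependency-free sketch (truncation only; real converters round to nearest even):

```python
import struct

def to_bfloat16_bits(x: float) -> int:
    # reinterpret the fp32 bit pattern and keep the top 16 bits
    # (sign + 8 exponent bits + 7 mantissa bits)
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 16

def from_bfloat16_bits(b: int) -> float:
    # pad the low 16 mantissa bits with zeros to recover an fp32 value
    return struct.unpack(">f", struct.pack(">I", b << 16))[0]
```

Round-tripping 1.0 or 1.5 is exact, while something like 3.14159 comes back as a nearby coarser value, which is the precision trade bfloat16 makes in exchange for keeping fp32's full exponent range.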

LangChain Event Hooks in AI Engineers With Harrison Chase: Harrison Chase, CEO of LangChain, will speak at an online event on using LangSmith to go from prototype to production on April 17 at 6:30 PM, with registration available here. His company focuses on using LLMs for context-aware reasoning applications.


Unsloth AI (Daniel Han) Discord

Model Might on a Budget: Guild members actively debated cost versus quality in AI modeling, with estimates ranging from $50K to several million dollars for pre-training, depending on dataset size. A strong emphasis was placed on balancing resource efficiency with high-quality outputs.

Scam Shield Tightens: Concerned with an increase in malicious bots and scams, the engineer community underscored the need for robust detection systems to thwart AI misuse and protect Discord servers.

Precision in Saving Space: Tips were shared on conserving space when saving finetuned models on platforms like Google Colab, with one user suggesting a method that saves 8GB of space but warned of a slight loss in accuracy.
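The exact savings depend on the model, but the arithmetic behind space-saving tips like this is simple: checkpoint size scales with bytes per parameter. A back-of-envelope sketch using a hypothetical 7B-parameter model:

```python
def checkpoint_size_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate on-disk size of a dense model checkpoint."""
    return n_params * bytes_per_param / 1e9

n = 7e9                                  # hypothetical 7B-parameter model
fp32 = checkpoint_size_gb(n, 4)          # 28 GB at full precision
fp16 = checkpoint_size_gb(n, 2)          # 14 GB at half precision
int4 = checkpoint_size_gb(n, 0.5)        # 3.5 GB at 4-bit quantization
```

Dropping from fp16 to 4-bit on a model of this size saves on the order of 10 GB, which is the regime the "saves 8GB with a slight loss in accuracy" suggestion is operating in.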

Training Tactics Tussle: The optimal way to split datasets and the choice between supervised fine-tuning (SFT) and quantization methods were hot topics, with insights into the trade-offs between performance and cost-effectiveness highly sought after.

Integration Enthusiasm for DeepSeek: A user proposed integrating the DeepSeek model into Unsloth 4-bit, showcasing the community's push for model diversity and efficiency improvements, with an accompanying Hugging Face repository and a Google Colab notebook ready for implementation.


Stability.ai (Stable Diffusion) Discord

Cyberrealistic vs. EpicRealism XL: Debate is ongoing about the performance of two Stable Diffusion models: while Cyberrealistic demands precise prompts, EpicRealism XL outshines with broader prompt tolerance for realistic imagery.

SD3 Is Coming: The community is buzzing over the anticipated 4-6 week release timeline for Stable Diffusion 3 (SD3), with some doubt about the timing but evident excitement for improved features, notably fixed text rendering.

Fixing Faces and Hands: The Stable Diffusion aficionados are tackling challenges with rendering facial and hand details, recommending tools such as Adetailer and various embeddings to enhance image quality without sacrificing processing speed.

CHKPT Model Confusion: In the sea of CHKPT models, users seek guidance for best use cases, pointing towards models like ponyxl, dreamshaperxl, juggernautxl, and zavychroma as part of a suggested checkpoint "starter pack" for Stable Diffusion.

Ethics and Performance in Model Development: Discussions touch on the rapid pace of AI development, ethical questions around using professional artwork for AI training, and speculated memory demands for future Stable Diffusion versions, all peppered with light-hearted community banter.


Nous Research AI Discord

DBRX Revealed: A new open-source language model titled DBRX is making waves, claiming top performance on established benchmarks. Watch the introduction of DBRX.

Whisper Models Under the Microscope: WhisperX might replace BetterTransformer given concerns over the latter's high error rates. The community is also mulling over Transforming the Web and Apple's latest paper on reference resolution.

Speed Meets Precision in LLM Operations: LlamaFile boasts 1.3x - 5x improved speed over llama.cpp on CPU for specific tasks, potentially altering future local operations. A configuration file for Hercules fine-tuning resulted in decreased accuracy, stirring debates over settings like lora_r and lora_alpha.
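For readers following the lora_r/lora_alpha debate: LoRA adds a low-rank update ΔW = (alpha/r)·BA to a frozen weight matrix, so lora_r caps the rank of the update and lora_alpha scales its magnitude, which is why retuning one without the other can shift accuracy. A dependency-free toy sketch:

```python
def matmul(X, Y):
    # naive matrix multiply, fine for tiny illustrative matrices
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_delta(B, A, lora_r, lora_alpha):
    # LoRA's weight update: delta_W = (lora_alpha / lora_r) * B @ A
    scale = lora_alpha / lora_r
    return [[scale * v for v in row] for row in matmul(B, A)]

# toy 4x2 @ 2x4 example; real models use thousands of dims and r of 8-64
B = [[0.0, 0.0] for _ in range(4)]   # B starts at zero, so delta_W is zero at init
A = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]
delta = lora_delta(B, A, lora_r=2, lora_alpha=4)
```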

Hugging Face Misstep Halts Upload: ModelInfo loading issues caused by safetensors.sharded metadata from Hugging Face are preventing uploads to the chain, driving discussions for fixes.

Brainstorming for WorldSim: WorldSim enthusiasts propose a "LLM Coliseum" with competitive benchmarks, file uploads facilitating pre-written scripts, and speculation on future developments like competitive leaderboards and AI battles.

Traffic Signal Dataset Signals Opportunity: A traffic signal image dataset surfaced, promising to aid vision models despite Hugging Face's viewer compatibility issues.


tinygrad (George Hotz) Discord

Trouble in GPU Paradise: AMD GPUs are causing major headaches for tinygrad users, with system crashes and memory leak errors like "amdgpu: failed to allocate BO for amdkfd". Users share workarounds involving PCIe power cycling but remain unimpressed by AMD's perceived lack of commitment to addressing these bugs.

A Virtual Side-Eye to AMD's Program: An invitation to AMD's Vanguard program drew skepticism from George Hotz and others, sparking a debate over the effectiveness of such initiatives and the need for open-source solutions and better software practices at AMD.

Learning Curve for Linear uOps: A detailed write-up explaining linear uops was shared in the #learn-tinygrad channel, aiming to demystify the intermediate representation in tinygrad, complemented by a tutorial on the new command queue following a significant merge.

Tinygrad Pull Requests Under Scrutiny: Pull Request #4034 addressed confusion around unit test code and backend checks. A focus on maintaining proper test environments for various backends like CLANG and OpenCL is emphasized.

Jigsaw Puzzle of Jitted Functions: A knowledge gap regarding why jitted functions don't show up in command queue logs led to discussions about the execution of jitted versus scheduled operations within tinygrad’s infrastructure.


LM Studio Discord

LM Studio Tangles with Model Troubles: Engage with caution: LM Studio is throwing unknown exceptions, particularly with estopian maid 13B q4 models on RTX 3060 GPUs, and users report crashes during prolonged inferencing. There's a growing need for text-to-speech and speech-to-text functionality, but for now one must tether tools like whisper.cpp for voice capabilities.

In Quest for Localized Privacy: While the quest for privacy in local LLMs continues, one suggestion is to pair LM Studio with AnythingLLM for a confidential setup, though LM Studio itself does not have built-in document support. Meanwhile, Autogen is producing a mere 2 tokens at a time, leaving users to wonder about optimal configurations.

GPU Discussions Heat Up: SLI isn't necessary for multi-GPU setups; what matters for running models is available VRAM. A dual Tesla P40 setup delivers 3-4 tokens/sec on 70B models, while those on a budget admire the P40's VRAM, weighing it against the prowess of the 4090.
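The reported P40 numbers line up with a simple bandwidth-bound view of single-stream decoding, in which each generated token must stream all model weights through GPU memory once. All figures below (a ~40 GB quantized 70B model, ~346 GB/s of P40 memory bandwidth, a 0.5 efficiency factor) are loose illustrative assumptions:

```python
def est_decode_tokens_per_sec(model_size_gb: float,
                              mem_bandwidth_gb_s: float,
                              efficiency: float = 0.5) -> float:
    """Bandwidth-bound decode estimate: tokens/sec is roughly effective
    memory bandwidth divided by the bytes of weights read per token."""
    return mem_bandwidth_gb_s * efficiency / model_size_gb

# a 70B model quantized to ~40 GB on a Tesla P40 (~346 GB/s): ~4 tok/s,
# in the ballpark of the 3-4 tokens/sec users report
p40_70b = est_decode_tokens_per_sec(40, 346)
```

The same formula explains why VRAM capacity and memory bandwidth, rather than raw compute, dominate local-LLM GPU recommendations.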

Top Models for Anonymous Needs: For the discreet engineer, the Nous-Hermes 2 Mistral DPO and Nous-Hermes-2-SOLAR-10.7B models come recommended, particularly for handling NSFW content. Tech hiccups with model downloads and execution have left some discontented, with missing proxy support suspected as the culprit.

Desiring Previous-Generation Functionality: Users miss the convenience of splitting text on each new generation; current LM Studio updates overwrite existing output, prompting requests to restore the previous behavior.


Eleuther Discord

Google Packs the Web in RAM: Engineers noted Google's robust search performance may be due to embedding the web in RAM using a distributed version of FAISS and refined indexing strategies like inverted indexes. The discussion delved into Google's infrastructure choices, hinting at methods for handling complex and precise search queries.
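An inverted index, one of the indexing strategies mentioned, is just a map from each term to the documents containing it; multi-term queries are answered by intersecting the posting lists. A toy sketch:

```python
from collections import defaultdict

def build_inverted_index(docs):
    # map each term to the set of document ids that contain it
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def and_query(index, *terms):
    # AND query: intersect posting lists, smallest first for efficiency
    postings = sorted((index.get(t, set()) for t in terms), key=len)
    if not postings:
        return set()
    result = set(postings[0])
    for p in postings[1:]:
        result &= p
    return result

docs = {1: "fast vector search", 2: "vector databases in RAM", 3: "search in RAM"}
index = build_inverted_index(docs)
```

Systems like the one described pair this kind of exact term index with approximate-nearest-neighbor structures (e.g. FAISS) for embedding search.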

Sleuthing Google's Programming Paradigms: Participants dissected Google's use of programming strategies that include otherwise shunned constructs like global variables and goto, illustrating a pragmatic approach to problem-solving and efficiency in their systems.

Sparse Autoencoders Reveal Their Secrets: A new visualization library for Sparse Autoencoders (SAEs) has been released, shedding light on their feature structures. Mixed reactions to categorizing SAE features in AI models reflect both the detailed complexities and the abstract challenges of AI interpretability.

New Horizons in Music AI: A paper examining GANs and transformers in music composition was discussed, hinting at potential future directions in music AI, including text-to-music conversion metrics. Meanwhile, gaps in lm-eval-harness benchmarks for Anthropic Claude models suggest a growing interest in comprehensive model evaluation frameworks.

Batch Size Trade-offs in GPT-NeoX: Tuning GPT-NeoX for uneven batch sizes may introduce computational bottlenecks due to load imbalances, as larger batches hold up processing speed.
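The load-imbalance point can be illustrated with a toy model of a synchronous data-parallel step, where the gradient all-reduce forces every rank to wait for the slowest one (a sketch, not GPT-NeoX code):

```python
def sync_step_time(batch_sizes, time_per_sample=1.0):
    # in synchronous data parallelism, every rank blocks at the gradient
    # all-reduce until the rank with the largest batch finishes
    return max(b * time_per_sample for b in batch_sizes)

even = [8, 8, 8, 8]
uneven = [14, 6, 6, 6]  # identical total work, worse wall-clock time
```

With the same total number of samples per step, the uneven split nearly doubles step time, which is exactly the bottleneck uneven batch sizes introduce.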

Bonus Bullet for AI Sportsmanship: Suggestions were made for EleutherAI community participation in the Kaggle AI Mathematical Olympiad competition, with compute grants potentially supporting such "AI in science" initiatives.


OpenAI Discord


LlamaIndex Discord


LangChain AI Discord


HuggingFace Discord


Modular (Mojo 🔥) Discord

Mojo Gets Mighty with MAX Engine: The imminent introduction of the MAX Engine and C/C++ interop in Mojo aims to streamline RL Python training, potentially allowing Python environments to be speedily re-implemented in Mojo, as detailed in the Mojo Roadmap. Meanwhile, Mojo 24.2 has excited developers with its focus on Python-friendly features, whose depth is explored in the MAX 24.2 announcement and the blog post on Mojo open-source.

Tune in to Modular's Frequencies: Modular's busy Twitter activity appears to be part of an outreach or announcement series; those interested in their updates or campaigns can follow along on Modular's Twitter.

Tensors, Tests, and Top-level Code Talk: Open dialogue about the quirks and features of Mojo continued with insights like the need for improved Tensor performance, which was tackled by reducing copy initialization inefficiencies. Engineers also raised issues around top-level code and SIMD implementations, highlighting challenges like Swift-style concurrency and intrinsic function translations, with some guidance available in the Swift Concurrency Manifesto.

Unwrapping the CLI with Prism: The Prism CLI library's overhaul brings new capabilities like shorthand flags and nested command structures, harmonizing with Mojo's 24.2 update. Enhancements include command-specific argument validators, with the development journey and usability of References being a point of focus, as seen on thatstoasty's Prism on GitHub.

Deploy with MAX While Anticipating GPU Support: Questions about using MAX as a Triton backend alternative point to MAX Serving's utility, though currently lacking GPU support; documentation can guide trials via local Docker, found in the MAX Serving docs. Ongoing support and clarifications for prospective MAX adopters are discussed, emphasizing that ONNX models could fit smoothly into the MAX framework.

Nightly Mojo Moves and Documentation: Dedicated Mojo users were alerted about the nightly build updates and directed to use modular update commands, with changes listed in the nightly build changelog. Additionally, valuable guidelines for local Mojo stdlib development and best testing practices are documented, suggesting testing module use over FileCheck and pointing to stdlib development guide.


OpenInterpreter Discord


OpenRouter (Alex Atallah) Discord

Chatbot Prefix Quirk in OpenRouter: The undi95/remm-slerp-l2-13b:extended model is unexpectedly prefixing responses with {bot_name}: in OpenRouter during roleplay chats; however, recent prompt templating changes were ruled out as the cause. The usage of the name field in this scenario is under investigation.

SSL Connection Mystery: A connection attempt to OpenRouter was thwarted by an SSL error described as EOF occurred in violation of protocol, yet the community did not reach a consensus on a solution.

New Book Alert: Architecting with AI in Mind: Obie Fernandez has launched an early release of his book, Patterns of AI-Driven Application Architecture, spotlighting OpenRouter applications. The book is accessible here.

Nitro Model Discussion Heats Up: Despite concerns over availability, it's been affirmed that nitro models remain accessible with more forthcoming. Confusion around the performance of different AI models points to strong interest in optimizing speed and efficiency.

Model Troubleshooting & Logit Bias: Users encountered issues with models like NOUS-HERMES-2-MIXTRAL-8X7B-DPO and debated alternatives such as Nous Capybara 34B for specific tasks, noting its 30k context window for improved performance. Clarifications were made regarding OpenRouter logit bias application, which is currently limited to OpenAI's models only.
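On the mechanics of logit bias: it simply shifts the chosen tokens' logits before the softmax, so large negative values effectively ban tokens and large positive values force them. A dependency-free sketch:

```python
import math

def apply_logit_bias(logits, bias):
    # add per-token biases before the softmax, as OpenAI's logit_bias does;
    # around -100 effectively bans a token, around +100 effectively forces it
    return {tok: l + bias.get(tok, 0.0) for tok, l in logits.items()}

def softmax(logits):
    # numerically stable softmax over a token -> logit dict
    m = max(logits.values())
    exps = {tok: math.exp(l - m) for tok, l in logits.items()}
    z = sum(exps.values())
    return {tok: e / z for tok, e in exps.items()}

logits = {"yes": 2.0, "no": 1.5, "maybe": 0.5}
banned = softmax(apply_logit_bias(logits, {"maybe": -100.0}))
```

Since the bias is applied to provider-side logits, it only works where the API exposes it, which is why OpenRouter's support is currently limited to OpenAI's models.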


Mozilla AI Discord


OpenAccess AI Collective (axolotl) Discord

15 Billion Reasons to Consider AMD: MDEL's successful training of a 15B model on AMD GPUs suggests that AMD may be a viable option in the hardware landscape for large-scale AI models.

The Mystery of the Training Freeze: Post-epoch training hangs were reported without the apparent use of val_set_size or eval_table, with hints suggesting the cause could be due to insufficient storage or yet-unidentified bugs in certain models or configurations.

Axolotl Development Continues Amid Pranks: The Axolotl Dev team approved a PR merge for lisa, added a YAML example for testing, and jovially proposed an April Fool's partnership with OpenAI. However, there are issues with missing documentation and out of memory errors potentially related to DeepSpeed or FSDP training attempts.

Unified Data Dilemma: There's a significant effort to combine 15 datasets into a unified format, with members tackling hurdles from data volume to misaligned translations.

Rigorous Runpod Reviews Requested: Members expressed interest in RunPod's serverless offerings for very large language models and sought insights from community experience.


Latent Space Discord

FastLLM Blasts into the AI Scene: Qdrant announced FastLLM (FLLM), a language model boasting a 1 billion token context window for Retrieval Augmented Generation, though skeptics suggest the timing of its announcement on April 1 may signal a jest.

Visualization for Understanding GPTs: A visual introduction to Transformers and GPTs by popular YouTube channel 3Blue1Brown has garnered attention among AI professionals looking for a clearer conceptual understanding of these architectures.

Engineers Build Open Source LLM Answer Engine: An open source "llm-answer-engine" project unveiled on GitHub has intrigued the community with its use of Next.js, Groq, Mixtral, Langchain, and OpenAI to create a Perplexity-Inspired Answer Engine.

Structured Outputs from LLMs Become Simpler: The engineering crowd noted the release of instructor 1.0.0, a tool aimed at ensuring Large Language Models (LLMs) produce structured outputs that conform to user-defined Pydantic models, assisting in seamless integration into broader systems.
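The core idea behind such tools is to parse the model's raw output against a declared schema and reject mismatches. A dependency-free sketch of that validation step (the person schema is a made-up example; instructor itself validates against full Pydantic models and can re-prompt on failure):

```python
import json

def parse_person(raw: str) -> dict:
    """Parse an LLM's JSON output and enforce a tiny hand-written schema,
    raising ValueError on any field-name or type mismatch."""
    obj = json.loads(raw)
    if not isinstance(obj.get("name"), str):
        raise ValueError("name must be a string")
    age = obj.get("age")
    if isinstance(age, bool) or not isinstance(age, int):
        raise ValueError("age must be an integer")
    return obj

person = parse_person('{"name": "Ada", "age": 36}')
```

A raised validation error is the point at which a tool like instructor would feed the failure back to the model and retry.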

Google Powers Up AI Division: In a pivot to bolster its AI offerings, Google has tapped Logan Kilpatrick to lead AI Studio and advance the Gemini API, signaling the tech giant's intensified commitment to becoming the hub for AI developers.


CUDA MODE Discord


Interconnects (Nathan Lambert) Discord


AI21 Labs (Jamba) Discord

Jamba's Speed Insight: Engineers scrutinized how Jamba's end-to-end throughput efficiency improves with more tokens during the decoding process. Some members questioned the increase, given decoding is sequential, but the consensus highlighted that throughput gains exist even as the context size increases, impacting decoding speed.

Decoding Efficiency Puzzler: A pivotal discussion unfolded around a graph showing Jamba's decoding step becoming more efficient with a larger number of tokens. Confusion was addressed, and it was elucidated that the higher throughput per token affects decoding phase efficiency, countering initial misconceptions.
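One way a per-token throughput gain can arise even though decoding is sequential is amortization: the fixed cost of streaming model weights for a decode step is shared across all tokens processed in that step. A toy cost model (illustrative numbers only, not Jamba's actual figures):

```python
def decode_throughput(batch_size, weight_read_ms=10.0, per_token_ms=0.5):
    """Toy cost model of one decode step: a fixed weight-streaming cost
    amortized over the batch, plus per-token compute. Throughput per
    token rises with the number of tokens processed together."""
    step_ms = weight_read_ms + batch_size * per_token_ms
    return batch_size / step_ms * 1000.0  # tokens per second

single = decode_throughput(1)    # ~95 tok/s
batched = decode_throughput(16)  # ~889 tok/s
```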


PART 2: Detailed by-Channel summaries and links

Perplexity AI ▷ #general (888 messages🔥🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (19 messages🔥):


Perplexity AI ▷ #pplx-api (16 messages🔥):

Links mentioned:


LAION ▷ #general (525 messages🔥🔥🔥):

Links mentioned:


LAION ▷ #research (9 messages🔥):

Links mentioned:


LAION ▷ #learning-ml (1 messages):

Link mentioned: Meetup #3 LangChain and LLM: Using LangSmith to go from prototype to production, Wed. Apr. 17, 2024, 6:30 PM | Meetup: We are delighted to welcome Harrison Chase, Co-Founder and CEO of LangChain, for our third LangChain and LLM France Meetup! Don't miss this opportunity...


Unsloth AI (Daniel Han) ▷ #general (212 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #help (311 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #suggestions (4 messages):

Links mentioned:


Stability.ai (Stable Diffusion) ▷ #general-chat (377 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #off-topic (4 messages):

Link mentioned: DBRX: A New State-of-the-Art Open LLM: Introducing DBRX, an open, general-purpose LLM created by Databricks. Across a range of standard benchmarks, DBRX sets a new state-of-the-art for established...


Nous Research AI ▷ #interesting-links (18 messages🔥):

Links mentioned:


Nous Research AI ▷ #general (104 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (37 messages🔥):

Links mentioned:


Nous Research AI ▷ #project-obsidian (4 messages):

Link mentioned: Sayali9141/traffic_signal_images · Datasets at Hugging Face: no description found


Nous Research AI ▷ #bittensor-finetune-subnet (2 messages):


Nous Research AI ▷ #rag-dataset (5 messages):


Nous Research AI ▷ #world-sim (110 messages🔥🔥):

Links mentioned:


tinygrad (George Hotz) ▷ #general (244 messages🔥🔥):

Links mentioned:


tinygrad (George Hotz) ▷ #learn-tinygrad (31 messages🔥):

Links mentioned:


LM Studio ▷ #💬-general (89 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (44 messages🔥):

Links mentioned:


LM Studio ▷ #🧠-feedback (1 messages):


LM Studio ▷ #🎛-hardware-discussion (80 messages🔥🔥):

Links mentioned:


LM Studio ▷ #autogen (1 messages):


Eleuther ▷ #general (54 messages🔥):

Link mentioned: AI Mathematical Olympiad - Progress Prize 1 | Kaggle: no description found


Eleuther ▷ #research (111 messages🔥🔥):

Links mentioned:


Eleuther ▷ #interpretability-general (4 messages):

Links mentioned:


Eleuther ▷ #lm-thunderdome (18 messages🔥):

Links mentioned:


Eleuther ▷ #multimodal-general (3 messages):


Eleuther ▷ #gpt-neox-dev (2 messages):


OpenAI ▷ #annnouncements (1 messages):

Link mentioned: Start using ChatGPT instantly: We’re making it easier for people to experience the benefits of AI without needing to sign up


OpenAI ▷ #ai-discussions (95 messages🔥🔥):

Link mentioned: Skill-Mix: a Flexible and Expandable Family of Evaluations for AI models: With LLMs shifting their role from statistical modeling of language to serving as general-purpose AI agents, how should LLM evaluations change? Arguably, a key ability of an AI agent is to flexibly co...


OpenAI ▷ #gpt-4-discussions (38 messages🔥):


OpenAI ▷ #prompt-engineering (7 messages):


OpenAI ▷ #api-discussions (7 messages):


LlamaIndex ▷ #announcements (1 messages):

Links mentioned:


LlamaIndex ▷ #blog (4 messages):


LlamaIndex ▷ #general (118 messages🔥🔥):

Links mentioned:


LlamaIndex ▷ #ai-discussion (4 messages):

Link mentioned: How to build a RAG app using Gemini Pro, LlamaIndex (v0.10+), and Pinecone: Let's talk about building a simple RAG app using LlamaIndex (v0.10+) Pinecone, and Google's Gemini Pro model. A step-by-step tutorial if you're just getting ...


LangChain AI ▷ #general (109 messages🔥🔥):

Links mentioned:


LangChain AI ▷ #share-your-work (5 messages):

Links mentioned:


HuggingFace ▷ #general (79 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #today-im-learning (1 messages):

docphaedrus: https://youtu.be/7na-VCB8gxw?si=azqUL6dGSMCYbgdg


HuggingFace ▷ #cool-finds (5 messages):

Link mentioned: Introducing FastLLM: Qdrant’s Revolutionary LLM - Qdrant: Lightweight and open-source. Custom made for RAG and completely integrated with Qdrant.


HuggingFace ▷ #i-made-this (12 messages🔥):

**Stream of Bot Conscience**: Introducing **LLMinator**, a context-aware streaming chatbot that enables running LLMs locally with Langchain and Gradio, compatible with both CPU and CUDA, from HuggingFace. Check it out on [GitHub](https://github.com/Aesthisia/LLMinator).

**Data Management Made Easier**: DagsHub launches a new integration for Colab with DagsHub Storage Buckets, promising a better data management experience akin to a scalable Google Drive for ML. An example notebook is available on [Google Colab](https://colab.research.google.com/#fileId=https%3a%2f%2fdagshub.com%2fDagsHub%2fDagsHubxColab%2fraw%2fmain%2fDagsHub_x_Colab-DagsHub_Storage.ipynb).

**Python's New Rival, Mojo**: Speculation arises about the Mojo programming language surpassing Python in performance, as discussed in a YouTube video titled "Mojo Programming Language killed Python." Watch the full explanation [here](https://youtu.be/vDyonow9iLo).

**Robotics Showcase**: A member built an advanced line-follower and wall-follower robot with a colour sensor, demonstrated in a YouTube video by SUST_BlackAnt. Find the full presentation [here](https://www.youtube.com/watch?v=9YmcekQUJPs).

**Launch SaaS with OneMix**: The new SaaS boilerplate OneMix claims to accelerate project launches by providing essentials like a landing page, payments, and authentication setup. More details are available at [saask.ing](https://saask.ing) and a demo on [YouTube](https://www.youtube.com/watch?v=NUfAtIY85GU&t=8s&ab_channel=AdityaKumarSaroj).

Links mentioned:


HuggingFace ▷ #reading-group (1 messages):

grimsqueaker: yay! thanks!


HuggingFace ▷ #computer-vision (7 messages):


HuggingFace ▷ #diffusion-discussions (8 messages🔥):

Links mentioned:


HuggingFace ▷ #gradio-announcements (1 messages):


Modular (Mojo 🔥) ▷ #general (8 messages🔥):

Link mentioned: Mojo🔥 roadmap & sharp edges | Modular Docs: A summary of our Mojo plans, including upcoming features and things we need to fix.


Modular (Mojo 🔥) ▷ #💬︱twitter (10 messages🔥):


Modular (Mojo 🔥) ▷ #✍︱blog (1 messages):

Link mentioned: Modular: What’s new in Mojo 24.2: Mojo Nightly, Enhanced Python Interop, OSS stdlib and more: We are building a next-generation AI developer platform for the world. Check out our latest post: What’s new in Mojo 24.2: Mojo Nightly, Enhanced Python Interop, OSS stdlib and more


Modular (Mojo 🔥) ▷ #🔥mojo (47 messages🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #community-projects (2 messages):

Link mentioned: GitHub - thatstoasty/prism: Mojo CLI Library modeled after Cobra.: Mojo CLI Library modeled after Cobra. Contribute to thatstoasty/prism development by creating an account on GitHub.


Modular (Mojo 🔥) ▷ #performance-and-benchmarks (4 messages):


Modular (Mojo 🔥) ▷ #⚡serving (7 messages):

Link mentioned: Get started with MAX Serving | Modular Docs: A walkthrough showing how to try MAX Serving on your local system.


Modular (Mojo 🔥) ▷ #nightly (11 messages🔥):

Links mentioned:


OpenInterpreter ▷ #general (17 messages🔥):

Links mentioned:


OpenInterpreter ▷ #O1 (45 messages🔥):

Links mentioned:


OpenInterpreter ▷ #ai-content (7 messages):

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (66 messages🔥🔥):

Links mentioned:


Mozilla AI ▷ #llamafile (41 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general (6 messages):


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (18 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general-help (16 messages🔥):


Latent Space ▷ #ai-general-chat (29 messages🔥):

Links mentioned:


CUDA MODE ▷ #general (4 messages):

Links mentioned:


CUDA MODE ▷ #triton (6 messages):

Link mentioned: Accelerating Triton Dequantization Kernels for GPTQ: TL;DR


CUDA MODE ▷ #cuda (3 messages):


CUDA MODE ▷ #torch (2 messages):

Links mentioned:


CUDA MODE ▷ #off-topic (1 messages):

c_cholesky: Thank u 😊


Interconnects (Nathan Lambert) ▷ #news (1 messages):


Interconnects (Nathan Lambert) ▷ #random (2 messages):

Link mentioned: Tweet from Logan Kilpatrick (@OfficialLoganK): Excited to share I’ve joined @Google to lead product for AI Studio and support the Gemini API. Lots of hard work ahead, but we are going to make Google the best home for developers building with AI. ...


Interconnects (Nathan Lambert) ▷ #rl (5 messages):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #sp2024-history-of-open-alignment (1 messages):


AI21 Labs (Jamba) ▷ #jamba (8 messages🔥):