Frozen AI News archive

Kolmogorov-Arnold Networks: MLP killers or just spicy MLPs?

**Ziming Liu**, a grad student of **Max Tegmark**, published a paper on **Kolmogorov-Arnold Networks (KANs)**, claiming they outperform **MLPs** in interpretability, inductive bias injection, function approximation accuracy, and scaling; KANs are roughly 10x slower to train but claimed to be 100x more parameter efficient. KANs use learnable activation functions, modeled by B-splines, on edges rather than fixed activations on nodes. However, it was later shown that KANs can be mathematically rearranged back into MLPs with similar parameter counts, sparking debate over their interpretability and novelty. Meanwhile, on AI Twitter, there is speculation about a potential **GPT-5** release with mixed impressions, OpenAI's adoption of the **C2PA metadata standard** alongside a classifier that reportedly detects **DALL-E 3** images with high accuracy, and **Microsoft** training a large 500B-parameter model called **MAI-1**, potentially previewed at the Build conference and signaling increased competition with OpenAI. *"OpenAI's safety testing for GPT-4.5 couldn't finish in time for Google I/O launch"* was also noted.

Canonical issue URL

AI News for 5/6/2024-5/7/2024. We checked 7 subreddits and 373 Twitters and 28 Discords (419 channels, and 3749 messages) for you. Estimated reading time saved (at 200wpm): 414 minutes.

Theory papers are usually above our paygrade, but there is enough drama (and not enough else going on today) that we have the space to write about it. A week ago, Max Tegmark's grad student Ziming Liu published his very well written paper on KANs (complete with a fully documented library), claiming them as almost universally equal or superior to MLPs on many important dimensions, like interpretability/inductive bias injection, function approximation accuracy, and scaling (though KANs are acknowledged to be ~10x slower to train on current hardware at the same parameter count, they are also claimed to be ~100x more parameter efficient).


While MLPs have fixed activation functions on nodes ("neurons"), KANs have learnable activation functions on edges ("weights").

Instead of layering preset activations like ReLU, KANs model "learnable activation functions" using B-splines (i.e., no linear weights, just curves) and simple addition. People got excited, rewriting GPTs with KANs.
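Concretely, each KAN "edge" is a learned linear combination of fixed B-spline basis functions. A minimal pure-Python sketch (knot placement and coefficient shapes here are illustrative, not the paper's exact parameterization):

```python
def bspline_basis(i, k, t, knots):
    """Cox-de Boor recursion: value of the i-th B-spline basis of degree k at t."""
    if k == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = 0.0
    denom = knots[i + k] - knots[i]
    if denom > 0:
        left = (t - knots[i]) / denom * bspline_basis(i, k - 1, t, knots)
    right = 0.0
    denom = knots[i + k + 1] - knots[i + 1]
    if denom > 0:
        right = (knots[i + k + 1] - t) / denom * bspline_basis(i + 1, k - 1, t, knots)
    return left + right

def kan_edge(t, coeffs, knots, degree=3):
    """One KAN 'edge': a learnable activation = sum_i c_i * B_i(t)."""
    return sum(c * bspline_basis(i, degree, t, knots) for i, c in enumerate(coeffs))

# Uniform knots on [0, 1]; 11 knots with degree-3 splines -> 11 - 3 - 1 = 7 basis functions
knots = [i / 10 for i in range(11)]
coeffs = [0.1, -0.3, 0.8, 0.2, -0.5, 0.4, 0.9]  # "learned" coefficients (illustrative)
y = kan_edge(0.42, coeffs, knots)
```

Training a KAN then means optimizing the `coeffs` on every edge (plus knot/grid refinement in the actual library) instead of scalar weight matrices.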

One week on, it now turns out that you can rearrange the KAN terms to arrive back at MLPs with the ~same number of params (twitter):


It shouldn't surprise anyone that you can rewrite one universal approximator as another, but following this very simple publication, many are defending KANs as more interpretable... which is also being rightfully challenged.
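The equivalence is easiest to see for degree-1 splines: one "hat" basis function is exactly a three-ReLU sum, so a learnable edge activation can be re-expressed as ordinary MLP weights. A toy numeric check (my illustration, not the exact construction from the tweet):

```python
def relu(x):
    return max(0.0, x)

def hat(x, a, b, c):
    """Degree-1 B-spline (hat) with knots a < b < c, written as a sum of ReLUs:
    slope changes by +1/(b-a) at a, by -(1/(b-a)+1/(c-b)) at b, by +1/(c-b) at c."""
    return (relu(x - a) / (b - a)
            - relu(x - b) * (1 / (b - a) + 1 / (c - b))
            + relu(x - c) / (c - b))

# The hat peaks at 1 over knot b and is 0 outside [a, c] -- three fixed-activation
# "neurons" reproduce one spline basis function exactly.
print(hat(1.0, 0, 1, 2))  # 1.0
print(hat(3.0, 0, 1, 2))  # 0.0
```

Stacking such sums recovers an MLP layer with a comparable parameter count, which is the crux of the rearrangement argument.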

Have we seen the full rise and fall of a new theory paper in a single week? Is this the preprint system working?


Table of Contents

[TOC]


AI Twitter Recap

all recaps done by Claude 3 Opus, best of 4 runs. We are working on clustering and flow engineering with Haiku.

OpenAI and GPT Models

Microsoft AI Developments

Other LLM Developments

AI Benchmarks and Evaluations

Scaling Laws and Architectures


AI Reddit Recap

Across r/LocalLlama, r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, r/LLMDevs, r/Singularity. Comment crawling works now but has lots to improve!

AI Progress and Capabilities

AI Ethics and Societal Impact

Technical Developments

Stable Diffusion and Image Generation

Humor and Memes


AI Discord Recap

A summary of Summaries of Summaries

1. Model Performance Optimization and Benchmarking

2. Fine-tuning Challenges and Prompt Engineering Strategies

3. Open-Source AI Developments and Collaborations

4. Hardware Considerations for Efficient AI Workloads

5. Misc


PART 1: High level Discord summaries

Unsloth AI (Daniel Han) Discord


Nous Research AI Discord

Function Calling Face-off: Llama 3 70b shows better function-calling performance than Mixtral 8x22b, revealing a gap despite the latter's touted capabilities; members discussed the utility and accuracy of function calling in AI chatbots.

A Battle of Speeds in AI Training: Comparisons of training times raised concerns, with reports of 500 seconds per step for LoRA tuning of Llama 3 8B on an A100 versus just 3 minutes for 1,000 iterations of Llama 2 7B using litgpt, showing wide variance in efficiency and raising questions about optimization practices.
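For scale, the two reports are orders of magnitude apart, assuming the units are even comparable (a "step" may cover many accumulated batches while an "iteration" may not):

```python
# Reported numbers from the discussion
lora_sec_per_step = 500.0               # LoRA Llama 3 8B on an A100
litgpt_sec_per_iter = 3 * 60 / 1000     # Llama 2 7B with litgpt: 3 min / 1,000 iters

ratio = lora_sec_per_step / litgpt_sec_per_iter
print(f"{litgpt_sec_per_iter:.2f} s/iter vs {lora_sec_per_step:.0f} s/step "
      f"(~{ratio:.0f}x apart)")  # 0.18 s/iter vs 500 s/step (~2778x apart)
```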

Impatience for Improvements: Users express disappointment over inaccessible features such as worldsim.nousresearch.com, and latency in critical updates for networks like Bittensor, highlighting real-time challenges faced by developers in AI and the ripple effects of stalled updates on productivity.

Quantization Leaps Forward: The AQLM project advances with models like Llama-3-70b and Command-R+, demonstrating progress with running Large Language Models on individual GPUs and touching upon the community's push for greater model accessibility and performance.

Chasing Trustworthy AI: Invetech's "Deterministic Quoting" approach to combating hallucinations signals a strong community desire for reliable AI, particularly in sensitive sectors like healthcare, aiming to marry veracity with the innovative potential of Large Language Models.
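The core guarantee behind "Deterministic Quoting" is that displayed quotes come verbatim from the source document rather than from the model's output. A minimal sketch of that check (my illustration of the idea, not Invetech's implementation; the `<quote>` tag convention is hypothetical):

```python
import re

def verify_quotes(answer: str, source: str) -> bool:
    """Accept an answer only if every quoted span appears verbatim in the source."""
    for span in re.findall(r"<quote>(.*?)</quote>", answer, flags=re.DOTALL):
        if span not in source:
            return False  # hallucinated or paraphrased quote -> reject
    return True

source = "Aspirin is contraindicated in patients with active peptic ulcer disease."
good = "Per the document: <quote>Aspirin is contraindicated</quote> in this case."
bad = "Per the document: <quote>Aspirin is always safe</quote>."
print(verify_quotes(good, source), verify_quotes(bad, source))  # True False
```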


Stability.ai (Stable Diffusion) Discord


LM Studio Discord

Private Life for Your Code: Users call for a server logging off feature in LM Studio for privacy during development, with genuine concerns about server logs being collected through the GUI.

A Day in the CLI: There's interest in using LM Studio in headless mode and leveraging the lms CLI to start servers via the command line. Users also shared updates on tokenizer complications for Command R and Command R+ after a llama toolkit update and issued guidance for downloading updated quantizations from Hugging Face's model repo.

Memory Lapses in Linux: A peculiar case of Linux misreporting memory in LM Studio version 0.2.22 stirred some discussions, with suggestions offered to resolve GPU offloading troubles for running models like Meta Llama 3 Instruct 8B.

Prompts Lost in Thought: Users tackled issues around LM Studio erroneously responding to deleted content and scoped document access, sparking a debate about LLMs' handling and retention of data.

Model Malfunctions: Troubles with several models in LM Studio were flagged, including llava-phi-3-mini misrecognizing images and models like Mixtral and Wizard LM fumbling Dungeons & Dragons data persistence despite AnythingLLM database use.

Power-play Considerations: Hardware aficionados in the guild grapple with GPU power consumption, server motherboards, and PCIe bandwidth, sharing successful runs of LM Studio in VMs with virtual GPUs and weighing in on practical hardware setups for AI endeavors.

Beta-testing Blues: Discussions mentioned crashes in 7B models on 8GB GPUs and unloading issues post-crash, with beta users seeking solutions for recurring errors.

SDK Advent: Announcement of new lmstudiojs SDK signals upcoming langchain integrations for more streamlined tool development.

In the AI Trenches: Users provided a solution for dependency package installation on Linux, discussed LM Studio's compatibility on Ubuntu 22.04 vs. 24.04, and shared challenges with LM Studio's API integration and concurrent request handling.

Engineer Inquiry: Curiosity was piqued about GPT-Engineer setup with LM Studio and whether it involved custom prompting techniques.

Prompting the AIs: Some voiced the value of prompt engineering as a craft, citing it as central to garnering premium outputs from LLMs and sharing a win in Singapore’s GPT-4 Prompt Engineering Competition covered in Towards Data Science.

AutoGen Hiccups: There's a brief mention of a bug causing AutoGen Studio to send incomplete messages, with no further discussion on the resolution or cause.


HuggingFace Discord

ASR Fine-Tuning Takes Center Stage: Engineers discussed enhancing the openai/whisper-small ASR model, emphasizing dataset size and hyperparameter tuning. Tips included adjusting weight_decay and learning_rate to improve training, highlighted by community-shared resources on hyperparameters like gradient accumulation steps and learning rate adjustments.
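The knobs mentioned interact: gradient accumulation scales the effective batch size, and learning-rate warmup/decay shapes training stability. A generic sketch with made-up values, not the exact settings discussed:

```python
def effective_batch_size(per_device: int, accum_steps: int, n_gpus: int = 1) -> int:
    # Gradient accumulation multiplies the batch the optimizer actually sees
    return per_device * accum_steps * n_gpus

def lr_at(step: int, total: int, peak_lr: float, warmup: int) -> float:
    """Linear warmup to peak_lr, then linear decay to 0 (a common fine-tuning schedule)."""
    if step < warmup:
        return peak_lr * step / warmup
    return peak_lr * max(0.0, (total - step) / (total - warmup))

print(effective_batch_size(per_device=8, accum_steps=4))      # 32
print(lr_at(step=250, total=5000, peak_lr=1e-5, warmup=500))  # 5e-06 (halfway up warmup)
```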

Deep Dive into Quantum and AI Tools: Stealthy interest in seemingly nascent quantum virtual servers surfaced with Oqtant, while the AI toolkit included everything from an all-in-one assistant everything-ai capable of 50+ language support to the spaghetti-coded image-generating discord bot Sparky 2.

Debugging and Datasets: Chatbots designing PowerPoint slides, XLM-R getting a Flash Attention 2 upgrade, and multi-label image classification training woes took the stage, connecting community members across problems and sharing valuable insights. Meanwhile, the lost UA-DETRAC dataset incited a search for its much-needed annotations for traffic camera-based object detection.

Customization and Challenges in Model Training: From personalizing image models with Custom Diffusion—requiring minimal example images—to the struggles with fine-tuning Stable Diffusion 1.5 and BERT models, the community wrestled with and brainstormed solutions for various training hiccups. Device mismatches during multi-GPU and CPU offloading and the importance of optimization techniques for restricted resources were notable pain points.

Novel Approaches in Teaching Retrieval to LLMs: A newer technique encouraging LLMs to use a <RET> token for information retrieval to boost performance was discussed with reference to a recent paper, highlighting the importance of this method for elusive questions that evade the model's memory. This sits alongside observations on model billing methods via token counts, with practical insights shared on pricing strategies.
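The <RET> mechanic can be sketched as a generate-retrieve-continue loop: when the model emits the special token, decoding pauses, a retriever is called, and the results are appended to the context. A mock with stand-in model and retriever functions (all names illustrative):

```python
def generate_with_retrieval(model, retriever, prompt: str, max_rounds: int = 3) -> str:
    """If the model's draft contains '<RET>query</RET>', fetch evidence and retry."""
    context = prompt
    for _ in range(max_rounds):
        draft = model(context)
        if "<RET>" not in draft:
            return draft  # model answered from memory
        query = draft.split("<RET>")[1].split("</RET>")[0]
        context += f"\n[retrieved for '{query}']: {retriever(query)}"
    return model(context)

# Stand-in model: asks for retrieval until evidence appears in its context
def toy_model(ctx):
    return "Answer: 1889" if "retrieved" in ctx else "<RET>Eiffel Tower completion date</RET>"

def toy_retriever(q):
    return "The Eiffel Tower was completed in 1889."

print(generate_with_retrieval(toy_model, toy_retriever, "When was the Eiffel Tower built?"))
# Answer: 1889
```

The paper's contribution is teaching the model *when* to emit <RET>, i.e. for questions that evade its parametric memory.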


Perplexity AI Discord

Beta Bewilderment: Users experienced confusion with accessing Perplexity AI's beta version; one assumed clicking an icon would reveal a form, which didn't happen, and it was clarified that the beta is closed.

Performance Puzzles: Across different devices, Perplexity AI users reported technical issues such as unresponsive buttons and sluggish loading. Conversations revolved around limits of models like Claude 3 Opus and Sonar 32k affecting work, with calls to check Perplexity's FAQ for details.

AI Model Melee: Comparisons of AI models' capabilities, including GPT-4 Turbo, Sonar, and Opus, were discussed, focusing on tasks like essay writing and code refactoring. Clarity was sought on whether source limits in searches had increased, with GIFs used to illustrate responses.

API Angst and Insights: Discussions in the Perplexity API channel ranged from crafting JSON outputs to perplexities with the search features of Perplexity's online models. The documentation was updated (as highlighted in a link to docs), important for users dealing with issues like outdated search results and exploring model parameter counts.

Shared Discoveries through Perplexity: The community delved into Perplexity AI's offerings, addressing an array of topics from US Air Force insights to Microsoft's 500 billion parameter AI model. Users shared an aspiration for a standardized image creation UI along with links to features like Insanity by XDream and emphasized content's shareability.


CUDA MODE Discord

GPU Clock Speed Mix-Up: A conversation was sparked by confusion over the clock speed of H100 GPUs, with the initial statement of 1.8 MHz corrected to 1.8 GHz. This highlighted the need to distinguish MHz from GHz and the importance of accurate specifications in discussions on GPU performance.

Tuning CUDA: From Kernels to Libraries: Members shared insights on optimizing CUDA operations, emphasizing the efficiency of Triton in kernel design, the advantage of fused operations in element-wise computations, and the use of CUDA's Thrust library. A CUDA best practice is to use Thrust's for_each and transform for near-bandwidth-saturating performance.

PyTorch Dynamics: Various issues and improvements in PyTorch were discussed, including troubleshooting dynamic shapes with PyTorch Compile using TORCH_LOGS="+dynamic" and how to work with torch.compile for the Triton backend. An issue reported on PyTorch's GitHub relates to combining Compile with DDP & dynamic shapes, captured in pytorch/pytorch #125641.

Transformer Performance Innovations: Conversations revolved around techniques to boost the efficiency of transformers, with the introduction of Dynamic Memory Compression (DMC) by a community member, potentially improving throughput by up to 370% on H100 GPUs. Members also discussed whether quantization was involved in this method, with reference to the paper on the technique.

CUDA Discussions Heat Up in llm.c: The llm.c channel was bustling with activity, addressing issues such as multi-GPU training hangs on the master branch and optimization opportunities using NVIDIA Nsight™ Systems. A notable contribution is HuggingFace's release of the FineWeb dataset for LLM performance, documented in PR #369, with potential kernel optimizations for performance gains discussed in PR #307.


OpenAI Discord


Eleuther Discord


Modular (Mojo 🔥) Discord


OpenRouter (Alex Atallah) Discord


OpenInterpreter Discord

Python 3.10 Spells Success: Open Interpreter (OI) should be run with Python 3.10 to avoid compatibility issues; one user improved performance by switching to models like dolphin or mixtral. The GitHub repository for Open Interpreter was suggested for insights on skill persistence.

Conda Environments Save the Day: Engineers recommended using a Conda environment for a conflict-free installation of Open Interpreter on Mac, specifically with Python 3.10 to sidestep version clashes and related errors.

Jan Framework Enjoys Local Support: Jan can be utilized as a local model framework for the O1 device without hiccups, contingent on similar model serving methods as with Open Interpreter.

Globetrotters Inquire About O1: The 01 device works globally, but hosted services are assumed to be US-centric for now, with no international shipments confirmed.

Fine-Tuning Frustrations and Fixes: A call to understand and employ system messages effectively before fine-tuning models led to the suggestion of OpenPipe.ai, as members navigate optimal performance for various models with Open Interpreter. The conversation included benchmarking models and the poor performance of Phi-3-Mini-128k-Instruct when used with OI.


OpenAccess AI Collective (axolotl) Discord

Open Source Magic on the Rise: The community launched an open-source alternative to Sora, named StoryDiffusion, released under an MIT license on Github; its weights, however, are still pending release.

Memory Efficiency Through Unsloth Checkpointing: Implementing unsloth gradient checkpointing has led to a reported reduction in VRAM usage from 19,712MB to 17,427MB, highlighting Unsloth's effectiveness in memory optimization.
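In relative terms (simple arithmetic on the reported figures):

```python
before_mb, after_mb = 19_712, 17_427
saved = before_mb - after_mb
print(f"{saved} MB saved ({saved / before_mb:.1%})")  # 2285 MB saved (11.6%)
```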

Speculations on Lazy Model Layers: An oddity was observed where only specific slices of model layers were being trained, contrasting the full layer training seen in other models; theories posited include models potentially optimizing mainly the first and last layers when confronted with too easy datasets.

Prompt Design Proves Pivotal: AI enthusiasts emphasized that prompt design, particularly regarding the use of suitable templates and end-of-text tokens, is critical in influencing model performance during both fine-tuning and evaluation.

Expanded Axolotl Docs Unveil Weight Merging Insights: A new update to Axolotl documentation has been rolled out, enhancing insights on merging model weights, with an emphasis on extending these guidelines to cover inference strategies, as seen on the Continuum Training Platform.


LangChain AI Discord


LAION Discord


LlamaIndex Discord


tinygrad (George Hotz) Discord


Cohere Discord

SQL Database Harbor Found: The SQL database needed for tracking conversational history in the Cohere toolkit is set to operate on port 5432, but a precise location was not mentioned.

Google Bard Rivalry, School Edition: A high school student planning to create a Bard-like chatbot received guidance from Cohere about adhering to user agreements with the caveat of obtaining a production key, as elaborated in Cohere's documentation.

Chroma Hiccups Amidst Local Testing: There's an unresolved IndexError when using Cohere toolkit's Chroma for document retrieval, with a full log trace available at Pastebin and a recommendation to use the latest prebuilt container.

Retriever Confusion in Cohere Toolkit: An anomaly was observed where the Langchain retriever was selected by default despite an alternative being specified, per a user report – though the screenshot provided as evidence was not viewable.

Production Key Puzzle: A user faced an odd situation where a new production key behaved like a trial key in the Cohere toolkit. However, Cohere support clarified that it is expected behavior in Playground / Chat UI and correct functionality should prevail when used in the API.

Coral Melds Chatbot and ReRank Skills: Introducing Coral Chatbot, which merges capabilities like text generation, summarization, and ReRank into a unified tool available for feedback on its Streamlit page.

Python Decorators, a Quick Byte: A brief explainer titled "Python Decorators In 1 MINUTE" was shared for those seeking an expedited introduction to this pythonic concept - the video is accessible on YouTube.


Latent Space Discord


AI Stack Devs (Yoko Li) Discord


Mozilla AI Discord


Interconnects (Nathan Lambert) Discord

AI Benchmarks in Spotlight: Dr. Jim Fan's tweet spurred a debate on the overvaluation of specific benchmarks and of public-vote "democracy" in AI evaluation, with a member suggesting A/B testing as a more effective approach.

Benchmarking Across Industries: Drawing parallels to the database sector, one engineer underscored the significance of having standard benchmarks for AI, referencing the approach mentioned in Dr. Fan's tweet.

TPC Standards Explained: In response to inquiries, a member clarified TPC as the Transaction Processing Council, which standardizes database industry benchmarks, referencing specific benchmarks such as TPC-C and TPC-H.

GPT-2's Surprising Comeback: A light-hearted mention by Sam Altman prompted discussion about GPT-2’s return to the LMsys arena, with a tweet snapshot shared showing the humor involved.

Lingering Doubts Over LMsys Direction: Nathan Lambert voiced skepticism towards OpenAI possibly using LMsys for model evaluations and expressed concern about LMsys's resource limitations and potential reputation damage from the latest 'chatgpt2-chatbot' hype.


DiscoResearch Discord


LLM Perf Enthusiasts AI Discord

Anthropic AI's Prompt Tool Piques Interest: Engineers found a new prompt generator tool in the Anthropic console, sparking discussions on its potential and capabilities.

Politeness through AI Crafted: The tool demonstrated its value by successfully rephrasing statements more courteously, marking a thumbs-up for practical AI usage.

Unpacking the AI's Instruction Set: An engineer embarked on uncovering the tool's system prompt, specifically noting the heavy reliance on k-shot examples in its architecture.

Extracting the Full AI Prompt Faces Challenges: Despite hurdles in retrieving the complete prompt due to its considerable size, the enthusiasm in the discussions remained high.

Share and Care Amongst AI Aficionados: A pledge was made by a community member to share the fully extracted prompt with peers, ensuring collective progress in understanding and utilizing the new tool.


Alignment Lab AI Discord

No relevant discussion content to summarize for an AI Engineer audience this time.


Datasette - LLM (@SimonW) Discord


The Skunkworks AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

Unsloth AI (Daniel Han) ▷ #general (170 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #random (86 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #help (412 messages🔥🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (23 messages🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #suggestions (3 messages):

Link mentioned: moondream/notebooks/Finetuning.ipynb at main · vikhyat/moondream: tiny vision language model. Contribute to vikhyat/moondream development by creating an account on GitHub.


Nous Research AI ▷ #ctx-length-research (2 messages):


Nous Research AI ▷ #off-topic (6 messages):

Link mentioned: Recipic Demo: Ever felt confused about what to make for dinner or lunch? What if there was a website where you could just upload what ingredients you have and get recipes ...


Nous Research AI ▷ #interesting-links (7 messages):

Links mentioned:


Nous Research AI ▷ #general (527 messages🔥🔥🔥):

**AI Chatbot Comparison and Speculation**: Members discussed the performance of various AI models, with particular focus on function-calling capabilities. **Llama 3 70b** was deemed superior to **Mixtral 8x22b** for function calling, despite the latter's "superior function calling" marketing.

**The Return of GPT-2 in LMSYS**: There's buzz around the return of **GPT-2** to LMSYS with significant improvements, and speculation on whether it's a new model being A/B tested or something else, such as GPT-4Lite or a more cost-efficient GPT alternative.

**Testing of the Hermes 2 Pro Llama 3 8B Model**: A member requested testing of the **Hermes 2 Pro Llama 3 8B** model's function-calling ability up to the 32k token limit, but practical limitations due to time and resource constraints were mentioned.

**Chatbot Names, Open Source Hopes, and GPT Hype Debates**: The unique naming of chatbot models (like GPT-2 chatbot) led to discussions and jokes about their capabilities and the potential for an OpenAI model becoming open source. There were both skepticism and anticipation regarding the next big AI development and its release timeline.

**YAML vs. JSON in Model Input**: A brief mention was made of the preference for YAML over JSON for model inputs due to better human readability and token efficiency.
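The readability/token argument is easy to see side by side (character counts as a rough proxy for tokens; actual tokenizer counts vary by model):

```python
import json

record = {"name": "Hermes", "tools": ["search", "calculator"], "max_tokens": 32000}

as_json = json.dumps(record, indent=2)
# The same record written as YAML by hand (no external library needed for the comparison)
as_yaml = """name: Hermes
tools:
  - search
  - calculator
max_tokens: 32000"""

print(len(as_json), len(as_yaml))  # YAML drops the quotes, braces, and commas
assert len(as_yaml) < len(as_json)
```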

Links mentioned:


Nous Research AI ▷ #ask-about-llms (12 messages🔥):


Nous Research AI ▷ #bittensor-finetune-subnet (11 messages🔥):


Nous Research AI ▷ #world-sim (7 messages):

Link mentioned: worldsim: no description found


Stability.ai (Stable Diffusion) ▷ #general-chat (421 messages🔥🔥🔥):

Links mentioned:


LM Studio ▷ #💬-general (107 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (21 messages🔥):

Links mentioned:


LM Studio ▷ #🧠-feedback (11 messages🔥):


LM Studio ▷ #📝-prompts-discussion-chat (8 messages🔥):


LM Studio ▷ #⚙-configs-discussion (9 messages🔥):


LM Studio ▷ #🎛-hardware-discussion (25 messages🔥):

Links mentioned:


LM Studio ▷ #🧪-beta-releases-chat (28 messages🔥):


LM Studio ▷ #autogen (2 messages):


LM Studio ▷ #langchain (2 messages):


LM Studio ▷ #crew-ai (1 messages):


LM Studio ▷ #🛠-dev-chat (41 messages🔥):


HuggingFace ▷ #general (164 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #today-im-learning (6 messages):

Links mentioned:


HuggingFace ▷ #cool-finds (12 messages🔥):

Links mentioned:


HuggingFace ▷ #i-made-this (8 messages🔥):

Links mentioned:


HuggingFace ▷ #reading-group (3 messages):

Links mentioned:


HuggingFace ▷ #computer-vision (15 messages🔥):

Link mentioned: Transfer Learning and Fine-tuning Vision Transformers for Image Classification - Hugging Face Community Computer Vision Course: no description found


HuggingFace ▷ #NLP (9 messages🔥):


HuggingFace ▷ #diffusion-discussions (7 messages):

Links mentioned:


Perplexity AI ▷ #general (168 messages🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (19 messages🔥):


Perplexity AI ▷ #pplx-api (23 messages🔥):

Links mentioned:


CUDA MODE ▷ #triton (13 messages🔥):


CUDA MODE ▷ #cuda (20 messages🔥):

Links mentioned:


CUDA MODE ▷ #torch (2 messages):

Link mentioned: Issues · pytorch/pytorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration - Issues · pytorch/pytorch


CUDA MODE ▷ #algorithms (2 messages):

Link mentioned: Tweet from Piotr Nawrot (@p_nawrot): The memory in Transformers grows linearly with the sequence length at inference time. In SSMs it is constant, but often at the expense of performance. We introduce Dynamic Memory Compression (DMC) w...


CUDA MODE ▷ #beginner (7 messages):


CUDA MODE ▷ #pmpp-book (4 messages):

Link mentioned: An Efficient Matrix Transpose in CUDA C/C++ | NVIDIA Technical Blog: My last CUDA C++ post covered the mechanics of using shared memory, including static and dynamic allocation. In this post I will show some of the performance gains achievable using shared memory.


CUDA MODE ▷ #youtube-recordings (4 messages):


CUDA MODE ▷ #jax (1 messages):

Link mentioned: Multi chip performance in JAX: The larger the models we use get the more it becomes necessary to be able to perform training of machine learning models over multiple chips. In this blog post we will explain how to efficiently use G...


CUDA MODE ▷ #off-topic (8 messages🔥):

Link mentioned: GitHub - openai/triton: Development repository for the Triton language and compiler: Development repository for the Triton language and compiler - openai/triton


CUDA MODE ▷ #irl-meetup (1 messages):

glaxus_: Anyone going to be at MLSys?


CUDA MODE ▷ #llmdotc (133 messages🔥🔥):

Links mentioned:


CUDA MODE ▷ #oneapi (4 messages):

Links mentioned:

PyTorch: no description found

Add accelerators to quick start table by aradys · Pull Request #1596 · pytorch/pytorch.github.io: Create accelerators dropdown with following options and add it to quick start table: Huawei Ascend Intel Extension for PyTorch Intel Gaudi Add commands to previous versions section RFC: pytorc...


OpenAI ▷ #annnouncements (1 messages):


OpenAI ▷ #ai-discussions (87 messages🔥🔥):

Links mentioned:


OpenAI ▷ #gpt-4-discussions (15 messages🔥):


OpenAI ▷ #prompt-engineering (33 messages🔥):


OpenAI ▷ #api-discussions (33 messages🔥):


Eleuther ▷ #general (39 messages🔥):


Eleuther ▷ #research (77 messages🔥🔥):

Links mentioned:


Eleuther ▷ #scaling-laws (1 messages):

nullonesix: https://arxiv.org/abs/2102.01293


Eleuther ▷ #interpretability-general (34 messages🔥):

Links mentioned:


Eleuther ▷ #lm-thunderdome (11 messages🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #general (39 messages🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #💬︱twitter (2 messages):


Modular (Mojo 🔥) ▷ #📺︱youtube (1 messages):

Link mentioned: Modular Community Livestream - New in MAX 24.3: MAX 24.3 is now available! Join us on our upcoming livestream as we discuss what’s new in MAX Engine and Mojo🔥 - preview of MAX Engine Extensibility API for...


Modular (Mojo 🔥) ▷ #🔥mojo (53 messages🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #community-projects (14 messages🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #nightly (16 messages🔥):

Links mentioned:


OpenRouter (Alex Atallah) ▷ #announcements (1 messages):

Link mentioned: Lynn: Llama 3 Soliloquy 8B v2 by lynn | OpenRouter: Soliloquy-L3 v2 is a fast, highly capable roleplaying model designed for immersive, dynamic experiences. Trained on over 250 million tokens of roleplaying data, Soliloquy-L3 has a vast knowledge base,...


OpenRouter (Alex Atallah) ▷ #app-showcase (1 messages):

Link mentioned: Rubik's AI - AI research assistant & Search Engine: no description found


OpenRouter (Alex Atallah) ▷ #general (119 messages🔥🔥):

Links mentioned:


OpenInterpreter ▷ #general (39 messages🔥):

Links mentioned:


OpenInterpreter ▷ #O1 (80 messages🔥🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general (35 messages🔥):

Link mentioned: GitHub - HVision-NKU/StoryDiffusion: Create Magic Story!: Create Magic Story! Contribute to HVision-NKU/StoryDiffusion development by creating an account on GitHub.


OpenAccess AI Collective (axolotl) ▷ #other-llms (1 messages):

icecream102: Coincidence?


OpenAccess AI Collective (axolotl) ▷ #general-help (5 messages):


OpenAccess AI Collective (axolotl) ▷ #datasets (38 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #docs (2 messages):

Link mentioned: Introduction | Continuum Training Platform | Axolotl Training Platform: no description found


OpenAccess AI Collective (axolotl) ▷ #axolotl-phorm-bot (5 messages):

Link mentioned: OpenAccess-AI-Collective/axolotl | Phorm AI Code Search: Understand code, faster.


LangChain AI ▷ #general (43 messages🔥):

Links mentioned:


LangChain AI ▷ #langserve (13 messages🔥):

Links mentioned:


LangChain AI ▷ #share-your-work (5 messages):

Links mentioned:


LangChain AI ▷ #tutorials (1 messages):

mhadi91: https://youtu.be/WTfWgYsIspE?si=gEdyMrX4vJm2gC6E


LAION ▷ #general (61 messages🔥🔥):

Links mentioned:


LlamaIndex ▷ #announcements (1 messages):

Link mentioned: LlamaIndex Webinar: Build Open-Source Coding Assistant with OpenDevin · Zoom · Luma: OpenDevin is a fully open-source version of Devin from Cognition - an autonomous AI engineer able to autonomously execute complex engineering tasks and…


LlamaIndex ▷ #blog (4 messages):

Link mentioned: The AI Quality Conference: The world's first AI Quality Conference on June 25, 2024 in San Francisco, CA


LlamaIndex ▷ #general (50 messages🔥):

Links mentioned:


LlamaIndex ▷ #ai-discussion (4 messages):

Links mentioned:


tinygrad (George Hotz) ▷ #general (35 messages🔥):


tinygrad (George Hotz) ▷ #learn-tinygrad (20 messages🔥):

Links mentioned:


Cohere ▷ #general (35 messages🔥):

Links mentioned:


Cohere ▷ #project-sharing (2 messages):

Links mentioned:


Latent Space ▷ #ai-general-chat (35 messages🔥):

Links mentioned:


AI Stack Devs (Yoko Li) ▷ #ai-companion (6 messages):


AI Stack Devs (Yoko Li) ▷ #team-up (5 messages):


AI Stack Devs (Yoko Li) ▷ #ai-town-dev (11 messages🔥):

Link mentioned: llama farm: no description found


AI Stack Devs (Yoko Li) ▷ #paper-spam (1 messages):

Deforum Daily Papers: Papers will now be sent to <#1227492197541220394>


Mozilla AI ▷ #llamafile (18 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (5 messages):


Interconnects (Nathan Lambert) ▷ #random (11 messages🔥):

Link mentioned: Tweet from ハードはんぺん (@U8JDq51Thjo1IHM): I’m-also-a-good-gpt2-chatbot I’m-a-good-gpt2-chatbot ?? Quoting Jimmy Apples 🍎/acc (@apples_jimmy) @sama funny guy arnt you. Gpt2 back on lmsys arena.


DiscoResearch ▷ #mixtral_implementation (2 messages):


DiscoResearch ▷ #general (3 messages):


DiscoResearch ▷ #discolm_german (10 messages🔥):

Links mentioned:


LLM Perf Enthusiasts AI ▷ #prompting (7 messages):


Alignment Lab AI ▷ #general-chat (2 messages):

The only messages provided were greetings, so there is no substantive content to summarize in the requested format.


Datasette - LLM (@SimonW) ▷ #llm (2 messages):

Link mentioned: Design and implement parameterization mechanism · Issue #4 · simonw/llm-evals-plugin: Initial thoughts here: #1 (comment) I want a parameterization mechanism, so you can run the same eval against multiple examples at once. Those examples can be stored directly in the YAML or can be ...