Frozen AI News archive

Shazeer et al (2024): you are overpaying for inference >13x

**Noam Shazeer** explains how **Character.ai** serves **20% of Google Search Traffic** for LLM inference while reducing serving costs by a factor of **33** compared to late 2022, with leading commercial APIs costing at least **13.5X more**. Key memory-efficiency techniques include **MQA > GQA** reducing KV cache size by 8X, hybrid attention horizons, cross-layer KV-sharing, stateful caching with a 95% cache rate, and native int8 precision with custom kernels. **Anthropic** released **Claude 3.5 Sonnet**, which outperforms **Claude 3 Opus** at twice the speed and one-fifth the cost, passing **64%** of internal pull request tests and introducing new features like Artifacts for real-time doc and code generation. Discussions on LLM architecture highlight the dominance of transformers, challenges in scaling and overfitting, and the importance of architecture work for progress.

Canonical issue URL

AI News for 6/20/2024-6/21/2024. We checked 7 subreddits, 384 Twitters and 30 Discords (415 channels, and 2822 messages) for you. Estimated reading time saved (at 200wpm): 287 minutes. You can now tag @smol_ai for AINews discussions!

In a concise 962-word blogpost, Noam Shazeer returned to writing to explain how Character.ai serves 20% of Google Search Traffic for LLM inference, while reducing serving costs by a factor of 33 (compared to late 2022), estimating that leading commercial APIs would cost at least 13.5X more:

Memory-efficiency: "We use the following techniques to reduce KV cache size by more than 20X without regressing quality. With these techniques, GPU memory is no longer a bottleneck for serving large batch sizes."

  1. MQA > GQA: "reduces KV cache size by 8X compared to the Grouped-Query Attention adopted in most open source models." (Shazeer, 2019)
  2. Hybrid attention horizons: a 5:1 ratio of local (sliding window) attention layers to global ones (Beltagy et al 2020)
  3. Cross Layer KV-sharing: local attention layers share KV cache with 2-3 neighbors, global layers share cache across blocks. (Brandon et al 2024)
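
The three techniques compound multiplicatively. A back-of-the-envelope sketch in Python (the model shape below — layer count, head count, head dimension, context length, window size — is illustrative, not Character.AI's actual configuration):

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_param=2):
    # 2 tensors (K and V) per layer, per cached position, in fp16/bf16.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_param

# Illustrative 8B-class shape: 32 layers, 8 KV heads (GQA), head_dim 128, 16k context.
baseline = kv_cache_bytes(32, 8, 128, 16_384)

# 1. MQA: a single KV head instead of 8 -> 8X smaller cache.
after_mqa = kv_cache_bytes(32, 1, 128, 16_384)

# 2. Hybrid horizons: 5 of every 6 layers are local with a sliding window,
#    so those layers only cache positions inside the window.
local_layers = 32 * 5 // 6
global_layers = 32 - local_layers
window = 1_024
after_hybrid = (kv_cache_bytes(global_layers, 1, 128, 16_384)
                + kv_cache_bytes(local_layers, 1, 128, window))

# 3. Cross-layer sharing: roughly 3 adjacent layers share one KV cache.
after_sharing = after_hybrid / 3

print(f"{baseline / after_sharing:.0f}x reduction")  # comfortably over the 20X claimed
```

Even with these made-up numbers, the combined reduction lands well past 20X, which is why the post can claim GPU memory stops being the bottleneck for large batch sizes.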


Stateful Caching: "On Character.AI, the majority of chats are long dialogues; the average message has a dialogue history of 180 messages... To solve this problem, we developed an inter-turn caching system."

  1. Cached KV tensors are kept in an LRU cache with a tree structure (RadixAttention, Zheng et al., 2023). At a fleet level, sticky sessions route queries from the same dialogue to the same server. The system achieves a 95% cache rate.

Native int8 precision: as opposed to the more common "post-training quantization", models are trained natively in int8, requiring customized int8 kernels for matrix multiplications and attention. A future post on quantized training is promised.
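
The inter-turn caching idea can be sketched minimally. This is an illustrative stand-in, not Character.AI's implementation: RadixAttention uses a radix tree over token sequences, while this toy uses a flat LRU map over prefixes, which is enough to show the longest-prefix lookup that lets a new turn skip prefill for the shared dialogue history:

```python
from collections import OrderedDict

class PrefixKVCache:
    """Toy inter-turn KV cache: stores KV payloads (stand-ins here) keyed
    by token-id prefixes, evicting the least-recently-used entry first."""

    def __init__(self, max_entries=1000):
        self.entries = OrderedDict()   # prefix tuple -> cached KV payload
        self.max_entries = max_entries

    def put(self, token_ids, kv_payload):
        key = tuple(token_ids)
        self.entries[key] = kv_payload
        self.entries.move_to_end(key)          # mark most-recently-used
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)   # evict LRU entry

    def longest_prefix(self, token_ids):
        """Return (matched_len, payload) for the longest cached prefix,
        so only the un-cached suffix needs prefill."""
        ids = tuple(token_ids)
        for n in range(len(ids), 0, -1):
            if ids[:n] in self.entries:
                self.entries.move_to_end(ids[:n])   # refresh recency
                return n, self.entries[ids[:n]]
        return 0, None

cache = PrefixKVCache()
cache.put([1, 2, 3, 4], kv_payload="kv-for-1234")
hit_len, _ = cache.longest_prefix([1, 2, 3, 4, 5, 6])
print(hit_len)  # 4 tokens served from cache; only 2 need prefill
```

The fleet-level sticky sessions mentioned above are what make this pay off: routing every turn of a dialogue to the same server keeps its prefix entries warm in that server's cache.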

{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3 Opus, best of 4 runs. We are working on clustering and flow engineering with Haiku.

Claude 3.5 Sonnet Release by Anthropic

LLM Architecture and Scaling Discussions

Retrieval, RAG, and Context Length

Benchmarks, Evals, and Safety

Multimodal Models and Vision

Miscellaneous


AI Reddit Recap

Across r/LocalLlama, r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, r/LLMDevs, r/Singularity. Comment crawling works now but has lots to improve!

Claude 3.5 Sonnet Release

OpenAI and Competition

Other AI Model Releases and Benchmarks

Stable Diffusion and Image Generation

Memes and Humor


AI Discord Recap

SPECIAL NOTE: As we do for fresh frontier model launches, we are displaying the differences in output from running the same prompts across Claude 3 Sonnet, Claude 3.5 Sonnet, Claude 3 Opus, and GPT-4o.

Claude 3 Sonnet

1. LLM Advancements and Benchmarking

2. Optimizing LLM Inference and Training

3. Open-Source AI Frameworks and Community Efforts

4. Multimodal AI and Generative Modeling

Claude 3.5 Sonnet

1. AI Model Releases and Performance Comparisons

2. AI Development Tools and Infrastructure Challenges

3. Ethical Concerns in AI Industry Practices

Claude 3 Opus

1. Model Performance Optimization and Benchmarking

2. Fine-tuning Challenges and Prompt Engineering Strategies

3. Open-Source AI Developments and Collaborations

GPT4O (gpt-4o-2024-05-13)

  1. AI Model Performance and Training Techniques:

    • Gemini 1.5 excels with 1M tokens: Gemini 1.5 Pro impressed users by handling up to 1M tokens effectively, outperforming other models like Claude 3.5 and gaining positive feedback for long-context tasks. This model's ability to process extensive documents and transcripts was highlighted.

    • FP8 Flash Attention and GPTFast speed up inference: Discussions around INT8/FP8 kernels in flash attention and the recently introduced GPTFast indicated significant boosts in HF model inference speeds by up to 9x. Notable mentions included an open-source FP8 flash attention addition, set to receive official CUDA support in 12.5.

    • Null-shot prompting and DPO over RLHF: Community debates touched on the efficacy of null-shot prompting to exploit hallucinations in LLMs and the shift from Reinforcement Learning from Human Feedback (RLHF) to Direct Preference Optimization (DPO) for simplified training. Paper references covered the approach's advantages for LLM task performance.

  2. AI Ethics and Accessibility:

    • AI Ethics spark debate: A Nature article criticizing OpenAI's departure from open-source principles stirred discussions on AI transparency and accessibility. Concerns were raised over the increasing difficulty of accessing cutting-edge AI tools and code.

    • Avoiding insincere AI apologies: Users voiced frustration with AI-generated apologies, calling them insincere and unnecessary. This sentiment reflected broader expectations for more authentic and practical AI interactions rather than automated expressions of regret.

    • OpenAI and government collaboration concerns: Concerns mounted over OpenAI's early model access for government entities, highlighted in a tweet. The conversation pointed to potential regulatory implications and strategic shifts towards AGI safety.

  3. Open-Source AI Developments and Community Contributions:

    • Introducing Turbcat 8b: Announcements of the Turbcat 8b model included notable improvements like expanded datasets and added Chinese support. The model now boasts 5GB of data, with comparisons drawn against larger yet underdeveloped models.

    • Axolotl and Backgammon AI Tool: Collaboration efforts highlighted the open-sourced Backgammon AI tool, which simulates scenarios in backgammon for strategic enhancements. Discussions also included the Turbcat model and its functionalities for multilingual processing.

    • Dataset for computer vision from Stability.ai: Stability.ai released a dataset featuring 235,000 prompts and images from the Stable Diffusion community. This StableSemantics dataset aims to augment computer vision systems by providing extensive visual semantics data.

  4. Hardware and Deployment Challenges:

    • GPU usage challenges and optimizations: Engineers shared insights and solutions for optimizing GPU and CPU integrations in different setups, such as enabling the second GPU for LM Studio and discussing alternatives for running sophisticated models. Used 3090s were recommended for cost efficiency, anticipating performance comparisons between NVIDIA 4090 and 5090.

    • TinyGrad's tangles with clip_grad_norm_: Implementing clip_grad_norm_ in TinyGrad faced bottlenecks due to Metal's buffer size limitations, suggesting division into 31-tensor chunks as a workaround. The comparison between Metal and CUDA highlighted performance differences, specifically for gradient clipping operations.

    • Model deployment issues: Deployment challenges with models like Unsloth on platforms like Hugging Face created discussions around tokenizer compatibility and alternative deployment suggestions. Fine-tuning costs also varied dramatically between Together.ai and Unsloth's H100, raising questions about pricing errors.

  5. Event Discussions and Professional Opportunities:

    • Techstars and RecSys Virtual Meetups: Upcoming events like the Techstars Startup Weekend in SF from June 28-30 and the RecSys Learners Virtual Meetup on June 29 were highlighted as opportunities for AI professionals to network, learn, and present innovative ideas. Details and RSVP links were shared for participants' convenience.

    • Job hunting and skill showcasing: Python AI Engineers actively sought job opportunities, emphasizing their skills in NLP and LLMs. Conversations also included insights into companies' support frameworks, like the Modal team's assistance with large models and developer preferences for Slack over Discord.

    • Talks and announcements at AI events: LlamaIndex's founder Jerry Liu's talks at the World's Fair on the future of knowledge assistants were promoted, with mentions of forthcoming special announcements on Twitter.

These discussions provide a comprehensive glance at the innovative, ethical, and practical aspects actively shaping the AI community.


PART 1: High level Discord summaries

OpenAI Discord


CUDA MODE Discord


Stability.ai (Stable Diffusion) Discord


Perplexity AI Discord


HuggingFace Discord

Fancy OCR Work, Florence!: Engineers discussed the unexpectedly superior OCR capabilities of the Florence-2-base model over its larger counterparts; the findings elicited curiosity and calls for further verification. Surprisingly, the larger model struggled with the seemingly simpler task, indicating a need to measure model capabilities beyond mere scale.

Face-Plant at HuggingFace HQ: Users experienced interruptions with the Hugging Face website, encountering 504 errors that affected their workflow continuity. This hiccup in a critical development resource caused temporary setbacks for users depending on the platform's services.

Helping Hands for AI Projects: Open-source AI projects are seeking collaborative efforts: the Backgammon AI tool aims to simulate backgammon scenarios, while the Nijijourney dataset offers robust benchmarking despite access issues due to its local storage of images.

Play and Contribute: An innovative game, Milton is Trapped, was shared where the objective is to interact with a grumpy AI. Developers are encouraged to contribute to this playful AI endeavor via its GitHub repository.

Ethical Computing Crossroads: An engaging paper highlighting the compromising dialogue between fairness and environmental sustainability in NLP underlines the industry's delicate balancing act. It points out the necessity for a holistic view when advancing AI technologies, where an emphasis on one aspect can inadvertently impair another.


Unsloth AI (Daniel Han) Discord


Nous Research AI Discord


LM Studio Discord

Squeeze More Power from Your GPUs: Engineers found setting the main_gpu value to 1 enables the second GPU for LM Studio on Windows 10 systems. A user was also successful in running vision models solely on CPU by disabling GPU acceleration and using OpenCL, despite slower performance.

Integrating Ollama Models Just Got Easier: For incorporating Ollama models into LM Studio, contributors are shifting from the llamalink project to the updated gollama project, though different presets and flash attention have been proposed to mitigate model gibberish issues.

Advanced Models Challenge Hardware Capabilities: Discussions revealed frustrations with running high-end LLMs on current hardware setups, even with 96GB of VRAM and 256GB of RAM. The community is also exploring used 3090s for cost efficiency and eagerly anticipates performance comparisons between NVIDIA's 4090 and the upcoming 5090.

Optimizing AI Workflows in the Face of Error: In the wake of usability challenges post-beta updates, engineers recommend leveraging `nvidia-smi -l 1` to check whether models load into VRAM, and considering disabling GPU acceleration for stability in Docker environments.

Chroma and langchain Perfect Their Harmony: A BadRequestError with langchain and Chroma's integration was swiftly addressed with a fix in GitHub Issue #21318, proving the community's responsive problem-solving skills in maintaining seamless AI-operated workflows.


Modular (Mojo 🔥) Discord

Pixelated Meetings No More: Community members discussed resolution issues with community meetings streamed on YouTube, noting that while streams can reach 1440p, phone resolutions are often throttled to 360p, possibly due to internet speed restrictions.

MLIR Quest for 256-Bit Integers: In the quest to handle 256-bit operations for cryptography, one user attempted multiple MLIR dialects but faced hurdles, prompting them to seek advice internally, as syntactical support in MLIR or LLVM is not straightforward.

Kapa.ai Bot Glitches With Autocomplete: Users have been experiencing autocompletion inconsistencies with the Kapa.ai bot on Discord, suggesting that manual typing or dropdown selection might be more reliable until the erratic behavior is addressed.

Mojo's Winding Road to Exceptions: Conversations revealed pending implementation of exception handling in Mojo's standard library, with a roadmap document shedding light on future feature rollouts and current limitations (Mojo roadmap & sharp edges).

Navigating Nightly Build Turbulence: The nightly release of the Mojo compiler was disrupted due to branch protection rule changes, but a community member's commit to fix the compiler version mismatch helped stabilize the pipeline, leading to the successful roll-out of the new 2024.6.2115 release, as detailed in the changelog.


OpenRouter (Alex Atallah) Discord


Eleuther Discord


tinygrad (George Hotz) Discord

GPT's Weight Tying Woes: Discussions have surfaced around the proper implementation of weight tying in GPT architectures, noting conflicting methods that cause the supposedly tied weights to be optimized separately, and timeouts traced to a weight initialization method that interferes with weight tying.
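
The pitfall is easy to reproduce outside any framework. A minimal numpy sketch (illustrative, not tinygrad code): tying means the output head and the embedding are the *same* tensor, so an initializer that rebinds rather than writes in place silently breaks the tie:

```python
import numpy as np

vocab, d_model = 8, 4
wte = np.zeros((vocab, d_model))   # token embedding table
lm_head = wte                      # tied: same object, not a copy

# In-place init preserves the tie: both names still see one tensor.
wte += np.random.default_rng(0).normal(size=wte.shape)
assert lm_head is wte

# The bug: re-initializing by *rebinding* leaves two independent tensors,
# which then get optimized separately.
wte = np.random.default_rng(1).normal(size=(vocab, d_model))
print(lm_head is wte)  # False -- the "tied" head no longer tracks wte
```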

TinyGrad's Tangled clip_grad_norm_: Implementing clip_grad_norm_ in TinyGrad is generating performance bottlenecks, predominantly due to Metal's limitations in buffer sizes, suggesting a workaround of dividing the gradients into 31-tensor chunks for optimal efficiency.
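
The workaround amounts to accumulating the global norm over fixed-size groups of gradient tensors instead of one fused pass over all of them. A framework-agnostic sketch in plain Python (the 31-tensor chunk size mirrors the Metal limit mentioned above; everything else is illustrative):

```python
import math

def clip_grad_norm_(grads, max_norm, chunk_size=31):
    """Global-norm gradient clipping, accumulating the squared norm in
    chunks of at most `chunk_size` tensors to stay under backend limits
    on how many buffers a single fused kernel may touch."""
    sq_sum = 0.0
    for i in range(0, len(grads), chunk_size):
        chunk = grads[i:i + chunk_size]
        # In a real framework each chunk would be one fused kernel launch.
        sq_sum += sum(sum(x * x for x in g) for g in chunk)
    total_norm = math.sqrt(sq_sum)
    scale = min(1.0, max_norm / (total_norm + 1e-6))
    return [[x * scale for x in g] for g in grads], total_norm

grads = [[3.0], [4.0]]             # two 1-element "tensors", global norm = 5
clipped, norm = clip_grad_norm_(grads, max_norm=1.0)
print(norm)  # 5.0
```

Chunking changes only how many buffers each kernel sees, not the result: the squared norms are summed across chunks before the single global scale is applied.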

Juxtaposing Metal and CUDA: A comparison between Metal and CUDA revealed Metal's inferior handling of tensor operations, specifically gradient clipping. Proposed solutions for Metal involve internal scheduler enhancements to better manage resource constraints.

AMD Device Timeouts in the Hot Seat: Users are experiencing timeouts with AMD devices when running examples like YOLOv3 and ResNet, pointing towards synchronization errors and potential overloads on integrated GPUs such as the Radeon RX Vega 7.

Developer Toolkit Spotlight: A Weights & Biases logging link was shared for insights into TinyGrad's ML performance, showcasing the utility of developer tools in tracking and optimizing machine learning experiments. W&B Log for TinyGrad


LAION Discord


Interconnects (Nathan Lambert) Discord


OpenInterpreter Discord


LLM Finetuning (Hamel + Dan) Discord


LangChain AI Discord

Meet AI Innovators at Techstars SF: Engineers interested in startup development should consider attending the Techstars Startup Weekend from June 28-30 in San Francisco; keynotes and mentorships from industry leaders are on the roster. More about the event can be found here.

Reflexion Tutorial Confounds with Complexity: Concerns were raised about the use of PydanticToolsParser over simpler loops in the Reflexion tutorial, questioning the implications of validation failures — the tutorial can be referenced here.

AI Engineering Talent on the Market: An experienced AI engineer with proficiency in LangChain, OpenAI, and multi-modal LLMs is currently seeking full-time opportunities within the industry.

Streaming Headaches with LangChain: Difficulty streaming LangChain's/LangGraph's messages with Flask to a React application has prompted a user to seek community assistance, but a solution remains elusive.

Innovations and Interactions in AI: Two notable contributions include an article on Retrieval Augmentation with MLX, available here, and the introduction of 'Mark', a CLI tool enhancing the use of markdown with GPT models detailed here.


OpenAccess AI Collective (axolotl) Discord


Cohere Discord


LlamaIndex Discord


Latent Space Discord


AI Stack Devs (Yoko Li) Discord

Spam Strikes in AI Town: Community members have reported a user, <@937822421677912165>, for multiple spam incidents across different channels, prompting calls for moderator intervention. The situation escalated as members expressed frustration, with one stating "wtf what's wrong with u" and encouraging others to report the behavior to Discord.


MLOps @Chipro Discord


Torchtune Discord


The LLM Perf Enthusiasts AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Datasette - LLM (@SimonW) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The DiscoResearch Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The YAIG (a16z Infra) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

OpenAI ▷ #ai-discussions (597 messages🔥🔥🔥):

Links mentioned:


OpenAI ▷ #gpt-4-discussions (6 messages):

- **Windows Surface laptops suggested but development in Windows not preferred**: A member suggested that the closest hardware and aesthetics to what another user is looking for are the newest Surface laptops but noted, *"have fun developing in windows."* The member hinted at a preference for MacBook for development.
- **Budget-buying advice given**: For a $900 budget, a member suggested a *"refurbished MacBook Air maybe."* Another suggestion was to join a server like *buildapc* for more tailored advice.

OpenAI ▷ #prompt-engineering (11 messages🔥):


OpenAI ▷ #api-discussions (11 messages🔥):


CUDA MODE ▷ #general (9 messages🔥):


CUDA MODE ▷ #torch (6 messages):

Links mentioned:


CUDA MODE ▷ #cool-links (1 messages):


CUDA MODE ▷ #torchao (2 messages):


CUDA MODE ▷ #off-topic (8 messages🔥):

Links mentioned:


CUDA MODE ▷ #llmdotc (341 messages🔥🔥):

Links mentioned:


CUDA MODE ▷ #rocm (2 messages):


CUDA MODE ▷ #bitnet (5 messages):

Link mentioned: triton/python/triton/testing.py at main · triton-lang/triton: Development repository for the Triton language and compiler - triton-lang/triton


Stability.ai (Stable Diffusion) ▷ #general-chat (330 messages🔥🔥):

<ul>
  <li><strong>Discord Community Shocked by Removal of Channels</strong>: Members expressed frustration over the deletion of various low-activity channels and archives, with one noting *"the archives were just nice place to see pictures, nothing to moderate anymore"*. There is a sense of loss among users who used the archives for inspiration and community engagement.</li>
  <li><strong>Alternative Stable Diffusion Interfaces Discussed</strong>: Members discussed various interfaces, including ComfyUI, Invoke, and Swarm, with comparisons highlighting each tool's strengths and ease of use. A detailed guide was also shared to help new users get started with these interfaces.</li>
  <li><strong>ComfyUI vs. Other UIs</strong>: There's a debate over the efficiency and popularity of ComfyUI compared to other interfaces like A1111, with some users advocating for the simplicity of node-based workflows and others preferring traditional HTML-based fields.</li>
  <li><strong>Mystery Surrounding Channel Deletion Persists</strong>: <em>Fruit</em> explained the reason behind the channel deletions, stating that *"channels that collect dust...often accumulate bot spam"*. Yet, members remain confused about the necessity of removing the archives and sought clarity on potential restoration.</li>
  <li><strong>New Dataset Announcement</strong>: A dataset of 235,000 prompts and images, collected from the Stable Diffusion Discord, was announced by a member, sharing a link to [StableSemantics](https://arxiv.org/abs/2406.13735v1). This dataset is aimed to aid in understanding the semantics of visual scenes in computer vision.</li>
</ul>

Links mentioned:


Perplexity AI ▷ #general (291 messages🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (7 messages):

Links mentioned:


Perplexity AI ▷ #pplx-api (3 messages):

Link mentioned: Perplexity: Perplexity is a free AI-powered answer engine that provides accurate, trusted, and real-time answers to any question.


HuggingFace ▷ #announcements (2 messages):

Links mentioned:


HuggingFace ▷ #general (197 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #cool-finds (5 messages):

Links mentioned:


HuggingFace ▷ #i-made-this (2 messages):

Links mentioned:


HuggingFace ▷ #computer-vision (3 messages):


HuggingFace ▷ #NLP (1 messages):


Unsloth AI (Daniel Han) ▷ #general (134 messages🔥🔥):

- **EOT token confusion in Unsloth**: Members discussed the eos token issue in OpenChat 3.5, where `<|end_of_turn|>` and `</s>` tokens are causing confusion during different stages of training and inference. One said, *"unsloth uses `<|end_of_turn|>`, while llama.cpp uses `<|reserved_special_token_250|>` as the `PAD token`."* 
- **Ollama collaboration**: Discussions highlighted Ollama's compatibility and support with Unsloth. One member mentioned, *"I've just made a live session on Ollama with Daniel and Mike, where we were creating a fine-tuned model etc etc, and it works well."*
- **Null-shot prompting debate**: There was a skeptical discussion about the efficacy of null-shot prompting, with a paper on the topic from Arxiv mentioned. A member sarcastically summarized, *"Sounds like mysticism and praying to the machine spirits."*
- **Dry run suggestion for Unsloth**: A member proposed adding a dry-run feature for Unsloth to view steps before actual training. Another joked, *"are you washing clothes? last I checked, GPUs don't do good with water."*
- **Released YouTube video on emotion detection in AI**: The community was informed about the release of a relevant [YouTube video](https://youtu.be/ZJKglSWgD0w). It covers, "the creation of fine-tuning dataset for LLMs using Unsloth and Ollama."

Links mentioned:


Unsloth AI (Daniel Han) ▷ #random (10 messages🔥):

Link mentioned: Together Pricing | The Most Powerful Tools at the Best Value: Get detailed pricing for inference, fine-tuning, training and Together GPU Clusters.


Unsloth AI (Daniel Han) ▷ #help (63 messages🔥🔥):

- **Phi-3-mini shines over Mistral 7b**: *"I've had great luck with phi-3-mini"* as it demonstrated better reasoning consistently compared to Mistral 7b non-instruct. This user used 1k, 10k, and 50k example sets for training and structured the data within training as the only acceptable response.
- **Experimenting with domain adaptation**: *"I need to get other work done, but will check out domain adaptation when I get back to it,"* noted one user, expressing eagerness to explore further after successful training runs.
- **Finetuning model issues**: Numerous users experienced problems with saving and loading fine-tuned models, particularly when using `save_pretrained_merged()`. Recommendations included using simpler save methods and avoiding 16bit quantization, which seemed to cause issues.
- **Debating DPO vs RLHF**: Users discussed switching from Reinforcement Learning from Human Feedback (RLHF) to Direct Preference Optimization (DPO) for ease and efficacy, as DPO is supported by Unsloth and involves simpler training datasets. *"I was thinking to start with RFHF and then check DPO, but now after checking more about DPO, I think i wil switch to DPO first."*
- **Deployment challenges with Hugging Face**: Users shared issues deploying Unsloth-trained models via Hugging Face Inference endpoints due to tokenizer errors. One user sought advice on best deployment platforms using various credits, with responses pending further details.
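
Part of DPO's appeal in these discussions is that the objective collapses to a simple classification-style loss over preference pairs, with no reward model or PPO loop. A minimal sketch of the per-pair DPO loss from Rafailov et al. (2023), taking precomputed sequence log-probabilities as inputs (the numbers below are made up for illustration):

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair: -log sigmoid(beta * margin),
    where the margin measures how much more the policy (relative to the
    frozen reference model) favours the chosen over the rejected answer."""
    margin = ((policy_logp_chosen - ref_logp_chosen)
              - (policy_logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Policy already prefers the chosen answer more than the reference does,
# so the loss falls below log(2), the value at indifference.
loss = dpo_loss(-10.0, -14.0, -12.0, -13.0)
print(loss < math.log(2.0))  # True
```

Minimizing this pushes the policy's preference margin up while the reference term keeps it anchored, which is the whole training loop: no sampling, no reward model, just pairs.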

Links mentioned:


Nous Research AI ▷ #off-topic (2 messages):

Links mentioned:


Nous Research AI ▷ #interesting-links (1 messages):

spencerbot15: https://arxiv.org/abs/2406.14491


Nous Research AI ▷ #announcements (1 messages):

Links mentioned:


Nous Research AI ▷ #general (193 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (3 messages):

Link mentioned: NousResearch/Hermes-2-Theta-Llama-3-70B · Hugging Face: no description found


Nous Research AI ▷ #world-sim (4 messages):

Link mentioned: L'ENTOURLOOP - Lobster Shwarama Ft. Troy Berkley & Khoe Wa (Official Video): "Lobster Shwarama Feat Troy Berkley & Khoe Wa" taken from L'Entourloop "Chickens In Your Town" album, available 👉 https://smarturl.it/LNTRLPChickensIYT♦︎ V...


LM Studio ▷ #💬-general (42 messages🔥):

<ul>
  <li><strong>Fix GPU Utilization in LM Studio</strong>: A member found a solution to use their second GPU in LM Studio by setting the <code>main_gpu</code> value to <code>1</code>. This was helpful for users running multiple GPUs on Windows 10.</li>
  <li><strong>Running Vision Models on CPU</strong>: A member with an older laptop and unsupported AMD GPU successfully ran vision models by disabling GPU acceleration and leveraging OpenCL. However, the operation was notably slower.</li>
  <li><strong>Integrating Ollama Models with LM Studio</strong>: A possible workaround for integrating Ollama models with LM Studio is to use the <a href="https://github.com/sammcj/llamalink">llamalink GitHub project</a>. Another user updated this information by recommending the newer <a href="https://github.com/sammcj/gollama">gollama GitHub project</a>.</li>
  <li><strong>Presets and Model Issues</strong>: Members discussed problems with models generating gibberish and suggested trying different presets. Flash attention was mentioned as a practical fix for these issues.</li>
  <li><strong>Flash Attention Resolves Issues</strong>: Another issue was resolved by enabling flash attention, which normalized the model's responses after encountering issues with query formatting.</li>
</ul>

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (46 messages🔥):

Links mentioned:


LM Studio ▷ #🎛-hardware-discussion (89 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🧪-beta-releases-chat (4 messages):


LM Studio ▷ #langchain (2 messages):

Link mentioned: Local LLM with LM Studio Server: Error in POST payload when using langchain_openai.OpenAIEmbeddings for embedding API. · Issue #21318 · langchain-ai/langchain: Checked other resources I added a very descriptive title to this issue. I searched the LangChain documentation with the integrated search. I used the GitHub search to find a similar question and di...


Modular (Mojo 🔥) ▷ #general (47 messages🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #💬︱twitter (1 messages):

ModularBot: From Modular: https://twitter.com/Modular/status/1804190052060401850


Modular (Mojo 🔥) ▷ #🔥mojo (4 messages):


Modular (Mojo 🔥) ▷ #📰︱newsletter (1 messages):

Zapier: Modverse Weekly - Issue 37 https://www.modular.com/newsletters/modverse-weekly-37


Modular (Mojo 🔥) ▷ #nightly (17 messages🔥):

Links mentioned:


OpenRouter (Alex Atallah) ▷ #announcements (2 messages):

Link mentioned: Anthropic Status: no description found


OpenRouter (Alex Atallah) ▷ #app-showcase (2 messages):

Link mentioned: Not Found | OpenRouter: The page you are looking for does not exist


OpenRouter (Alex Atallah) ▷ #general (63 messages🔥🔥):

Links mentioned:


Eleuther ▷ #general (26 messages🔥):

Link mentioned: Data on the Trajectory of AI: Our public databases catalog over 1300 machine learning models. Explore data and graphs showing the growth and trajectory of AI from 1950 to today.


Eleuther ▷ #research (16 messages🔥):

Links mentioned:


Eleuther ▷ #interpretability-general (1 messages):



Eleuther ▷ #lm-thunderdome (5 messages):


tinygrad (George Hotz) ▷ #learn-tinygrad (28 messages🔥):

Link mentioned: chenyuxyz: Weights & Biases, developer tools for machine learning


LAION ▷ #general (10 messages🔥):

Links mentioned:


LAION ▷ #research (13 messages🔥):


LAION ▷ #resources (1 messages):

sajackie: 50$ from steam steamcommunity.com/gift/9178 @everyone


LAION ▷ #learning-ml (1 messages):



LAION ▷ #paper-discussion (1 messages):



Interconnects (Nathan Lambert) ▷ #news (5 messages):


Interconnects (Nathan Lambert) ▷ #ml-drama (10 messages🔥):


Interconnects (Nathan Lambert) ▷ #random (8 messages🔥):

Link mentioned: Wired: AI startup Perplexity is 'BS machine': Katie Drummond, Wired’s global editorial director, joins 'Squawk Box' to discuss the magazine's investigation into AI search startup Perplexity.


Interconnects (Nathan Lambert) ▷ #posts (1 messages):

natolambert: snail where you at dude wtf


OpenInterpreter ▷ #general (19 messages🔥):

Links mentioned:


OpenInterpreter ▷ #ai-content (1 messages):

Link mentioned: Tweet from killian (@hellokillian): i showed a fully local, computer-controlling AI a sticky note with my wifi password. it got online.


LLM Finetuning (Hamel + Dan) ▷ #general (12 messages🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #🟩-modal (1 messages):


LLM Finetuning (Hamel + Dan) ▷ #replicate (1 messages):

4.8.15.16.23.42_: I believe they mentioned somewhere - a year


LLM Finetuning (Hamel + Dan) ▷ #allaire_inspect_ai (1 messages):


LLM Finetuning (Hamel + Dan) ▷ #east-coast-usa (1 messages):

stevenmerrill: Similar question: anyone in Greater Boston?


LLM Finetuning (Hamel + Dan) ▷ #predibase (2 messages):


LLM Finetuning (Hamel + Dan) ▷ #openpipe (1 messages):

abhishek_54517: Seems to be 1 year


LangChain AI ▷ #general (14 messages🔥):

Links mentioned:


LangChain AI ▷ #share-your-work (4 messages):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general (8 messages🔥):

Link mentioned: turboderp/llama3-turbcat-instruct-8b · Hugging Face: no description found


OpenAccess AI Collective (axolotl) ▷ #general-help (7 messages):


OpenAccess AI Collective (axolotl) ▷ #datasets (1 messages):

ben44: Moved to <#1110594519226925137>


Cohere ▷ #general (15 messages🔥):

Link mentioned: Use Your Potions and Scrolls: I find that when I play RPG games, I often hoard single-use items like potions and scrolls, saving them for some future critical moment. I finish games like Skyrim with a backpack full of unspent res...


LlamaIndex ▷ #blog (1 messages):


LlamaIndex ▷ #general (13 messages🔥):

Links mentioned:


Latent Space ▷ #ai-general-chat (12 messages🔥):

Links mentioned:


AI Stack Devs (Yoko Li) ▷ #ai-town-discuss (2 messages):

- **Spam Alert Initiated**: A message highlights that user <@937822421677912165> is once again involved in spam activity. The user is tagged to alert moderators for intervention.
- **Repeated Spam by Same User**: Another alert calls out the same user, <@937822421677912165>, for repetitive spam occurrences. Moderators are being summoned to handle the situation.

AI Stack Devs (Yoko Li) ▷ #assets (3 messages):


MLOps @Chipro ▷ #events (4 messages):

Link mentioned: RecSys Learners Virtual Meetup · Luma: Join us for an exciting and informative RecSys Learner Virtual Meetup, designed for enthusiasts and professionals passionate about Recommendation Systems. This…


Torchtune ▷ #general (2 messages):







{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}