Frozen AI News archive

1 TRILLION token context, real time, on device?

**Cartesia**, a startup specializing in **state space models (SSMs)**, launched a low-latency voice model that outperforms its transformer-based equivalent with **20% lower perplexity**, **2x lower word error rate**, and **1 point higher NISQA quality**. This breakthrough highlights the potential for models that can continuously process and reason over massive streams of multimodal data (text, audio, video) with a **trillion-token context window**, on-device. The news also covers **Mistral's Codestral weights release**, the **Schedule-Free optimizers** paper release, and **Scale AI's** new Elo-style eval leaderboards. Additionally, a debate between **Yann LeCun** and **Elon Musk** on the importance of publishing AI research versus engineering achievements was noted, and the **Gemini 1.5 Pro/Advanced** models were mentioned for their strong performance.


AI News for 5/28/2024-5/29/2024. We checked 7 subreddits, 384 Twitters and 29 Discords (389 channels, and 5432 messages) for you. Estimated reading time saved (at 200wpm): 553 minutes.

Our prior candidates for today's headline story:

But today we give the W to Cartesia, the State Space Models startup founded by the other Mamba coauthor, who launched their rumored low-latency voice model today, handily beating its Transformer equivalent: 20% lower perplexity, 2x lower word error rate, and 1 point higher NISQA quality.


The advantage also shows up as a yawning gap in the loss charts.


This is the most recent in a growing crop of usable state space models, and the launch post discusses the vision unlocked by extremely efficient realtime models:

Not even the best models can continuously process and reason over a year-long stream of audio, video and text: 1B text tokens, 10B audio tokens and 1T video tokens — let alone do this on-device. Shouldn't everyone have access to cheap intelligence that doesn't require marshaling a data center?

as well as a preview of what super fast on-device TTS looks like.
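To put the quoted numbers in perspective, a quick back-of-envelope conversion (the per-second rates are our own arithmetic from the stated yearly totals, not Cartesia's):

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # about 3.15e7

# Year-long stream totals quoted in the launch post
totals = {"text": 1e9, "audio": 10e9, "video": 1e12}

for modality, tokens in totals.items():
    print(f"{modality}: {tokens:.0e} tokens/year ≈ {tokens / SECONDS_PER_YEAR:,.0f} tokens/sec")

# text ≈ 32/s, audio ≈ 317/s, video ≈ 31,710/s: sustaining this requires
# per-step cost that stays flat as context grows, the SSM selling point.
```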

It is highly encouraging to see usable SSMs in the wild now, feasibly challenging SOTA (we haven't yet seen any comparisons with ElevenLabs et al, but spot checks on the Cartesia Playground were very convincing to our ears as experienced ElevenLabs users).

But comparing SSMs with current SOTA misses the sheer ambition of what is mentioned in the quoted text above: what would you do differently if you KNEW that we may soon have models that can continuously process and reason over text/audio/video with a TRILLION token "context window"? On device?


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3 Opus, best of 4 runs. We are working on clustering and flow engineering with Haiku.

Yann LeCun and Elon Musk Debate on AI Research and Engineering

Advancements in Large Language Models (LLMs) and AI Capabilities

Research Papers and Techniques

Memes and Humor


AI Reddit Recap

Across r/LocalLlama, r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, r/LLMDevs, r/Singularity. Comment crawling works now but has lots to improve!

AI Model Development

AI Safety & Ethics

AI Tools & Applications

AI Hardware

AI Drama & Controversy

Memes & Humor


AI Discord Recap

A summary of Summaries of Summaries

  1. LLM Performance and Practical Applications:

    • Gemini 1.5 Pro/Advanced models from Google impressed with top leaderboard positions, outperforming models like Llama-3-70b, while Codestral 22B from MistralAI supports 80+ programming languages, targeting AI engineers.
  2. Fine-Tuning, Prompt Engineering, and Model Optimization:

    • Engineers discussed Gradient Accumulation and DPO training methods, emphasizing the role of ref_model in maintaining consistency during fine-tuning, and tackled quantization libraries for efficient use across different systems.

    • Techniques for solving prompt engineering challenges, like handling "RateLimit" errors using try/except structures and fine-tuning models for specific domains, were shared, underscoring practical solutions (example); a minimal retry sketch appears after this list.

    • Members debated the use of transformers versus MLPs, highlighting findings that MLPs may handle certain tasks better, and discussed model-specific issues like context length and optimizer configurations in ongoing fine-tuning efforts.

  3. Open-Source Contributions and AI Community Collaboration:

    • OpenAccess AI Collective tackled spam issues, proposed updates for gradient checkpointing in Unsloth, and saw community-led initiatives on fine-tuning LLMs for image and video content comprehension.

    • LlamaIndex contributed to open source with integrations into the Neo4j ecosystem, focusing on tools like PropertyGraphIndex for robust knowledge graph solutions.

    • Discussions emphasized community efforts around Llama3 model training, and issues filed on GitHub for libraries like axolotl and torchao indicated ongoing development and shared problem resolution.

  4. Model Deployment and Infrastructure Issues:

    • Engineers grappled with Google Colab disconnections, Docker setup for deployment issues, and the performance benefits of using Triton kernels on NVIDIA A6000 GPUs.

    • Lightning AI Studio was recommended for free GPU hours, while discussions on splitting GPU resources for large-model workloads and tackling hardware bottlenecks highlighted user challenges.

    • ROCm and NVIDIA compatibility setbacks were discussed, with practical suggestions to overcome them, like seeking deals on the 7900 XT for expanded-VRAM setups to support larger models, and transitions from macOS x86 to M1.

  5. Challenges, Reactions, and Innovations in AI:

    • Helen Toner's revelations on OpenAI’s management sparked debates about transparency, raising concerns about internal politics and ethical AI development (Podcast link).

    • Elon Musk's xAI securing $6 billion in funding triggered discussions on the implications for AI competitiveness and infrastructure investment, while community members debated model pricing strategies and their potential impact on long-term investments in technologies.

    • Cohere API sparked discussions around effective use for grounded generation and ensuring force-citation display, showing active community engagement in leveraging new models for practical use cases.
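As a concrete illustration of the retry pattern mentioned under item 2, here is a minimal sketch of exponential backoff with jitter around a rate-limited API call (RateLimitError is a placeholder; substitute your SDK's actual exception type):

```python
import random
import time

class RateLimitError(Exception):
    """Placeholder for your provider SDK's rate-limit exception."""

def call_with_retries(fn, max_retries=5, base_delay=1.0):
    """Retry fn() on rate-limit errors, doubling the delay each attempt."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error
            time.sleep(base_delay * 2 ** attempt + random.random())

# Usage: call_with_retries(lambda: client.chat(prompt))  # hypothetical client
```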

PART 1: High level Discord summaries


{% if medium == 'web' %}

Perplexity AI Discord


HuggingFace Discord


Unsloth AI (Daniel Han) Discord


LLM Finetuning (Hamel + Dan) Discord

Fine-Tuning Frustrations and Marketplace Musings: Engineers discussed fine-tuning challenges, with concerns over Google's Gemini 1.5 API price hike and difficulties serving fine-tuned models in production. A channel dedicated to LLM-related job opportunities was proposed, and the need for robust JSON/Parquet file handling tools was highlighted.

Ins and Outs of Technical Workshops: Participants exchanged insights on LLM fine-tuning strategies, with emphasis on personalized sales emails and legal document summarization. The practicality of multi-agent LLM collaboration and the optimization of prompts for Stable Diffusion were debated.

Exploring the AI Ecosystem: The community delved into a variety of AI topics, revealing Braintrust as a handy tool for evaluating non-deterministic systems and the O'Reilly Radar insights on the complexities of building with LLMs. Discussions also highlighted the potential of Autoevals for SQL query evaluations.

Toolshed for LLM Work: Engineers tackled practical issues like Modal's opaque failures and Axolotl preprocessing GPU support problems. Queries around using shared storage on Jarvislabs and insights into model quantization on Wing Axolotl were shared, with useful resources and tips sprinkled throughout the discussions.

Code, Craft, and Communities: The community vibe flourished with talk of LLM evaluator models, the desirability of Gradio's UI over Streamlit, and the convening of meet-ups from San Diego to NYC. The vibrant exchanges covered technical ground but also nurtured the social fabric of the AI engineering realm.


CUDA MODE Discord

GPGPU Programming Embraces lightning.ai: Engineers discussed lightning.ai as a commendable option for GPGPU programming, especially for those lacking access to NVIDIA hardware commonly used for CUDA and SYCL development.

Easing Triton Development: Developers found triton_util, a utility package simplifying Triton kernel writing, useful for abstracting repetitive tasks, promoting a more intuitive experience. Performance leaps using Triton on NVIDIA A6000 GPUs were observed, while tackling bugs became a focus when dealing with large tensors above 65GB.
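For context on what triton_util abstracts away, here is the canonical raw-Triton vector-add kernel (adapted from Triton's own tutorial pattern; it needs a CUDA GPU with torch and triton installed). Most of it is the offset/mask boilerplate that utilities like triton_util exist to hide:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                       # which block am I?
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                       # guard the ragged last block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    grid = (triton.cdiv(x.numel(), 1024),)            # one program per 1024 elements
    add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
    return out

# x = torch.randn(1 << 20, device="cuda"); y = torch.randn(1 << 20, device="cuda")
# assert torch.allclose(add(x, y), x + y)
```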

Nightly Torch Supports Python 3.12: The PyTorch community highlighted torch.compile issues on Python 3.12, with nightly builds providing some resolutions. Meanwhile, the deprecation of macOS x86 builds in Torch 2.3 sparked discussions about transitioning to M1 chips or Linux.

Tom Yeh Enhances AI Fundamentals: Prof Tom Yeh is gaining traction by sharing hand calculation exercises on AI concepts. His series comprises Dot Product, Matrix Multiplication, Linear Layer, and Activation workbooks.

Quantum Leaps in Quantization: Engineers are actively discussing and improving quantization processes with libraries like bitsandbytes and fbgemm_gpu, as well as participating in competitions such as NeurIPS. Efforts on Llama2-7B and the FP6-LLM repository updates were shared alongside appreciating the torchao community's supportive nature.

SYCL Debugging Tools Sought: A single inquiry about debugging SYCL code was shared, highlighting the need for tools that can analyze kernel code and step into kernels during debugging.

Turbocharge Development with bitnet PRs: Various technical issues were addressed in the bitnet channel, including ImportError challenges related to mismatches between PyTorch/dev versions and CUDA, and compilation woes on university servers resolved via a gcc 12.1 upgrade. Collaborative PR work on bit packing and CI improvements were discussed, with resources provided for bit-level operations and error resolution (BitBLAS on GitHub, ao GitHub issue).
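To make the "bit packing" work concrete, here is a hedged NumPy sketch of the general idea, packing four 2-bit values per byte (our own illustration of the technique, not the bitnet PR's code; BitNet b1.58 itself uses ternary weights):

```python
import numpy as np

def pack_2bit(vals: np.ndarray) -> np.ndarray:
    """Pack values in {0, 1, 2, 3}, four per byte (length must be divisible by 4)."""
    v = vals.astype(np.uint8).reshape(-1, 4)
    return v[:, 0] | (v[:, 1] << 2) | (v[:, 2] << 4) | (v[:, 3] << 6)

def unpack_2bit(packed: np.ndarray) -> np.ndarray:
    """Inverse of pack_2bit: recover the four 2-bit fields from each byte."""
    return np.stack([(packed >> s) & 0b11 for s in (0, 2, 4, 6)], axis=1).reshape(-1)

w = np.random.randint(0, 4, size=64)
assert (unpack_2bit(pack_2bit(w)) == w).all()  # round-trips, at 4x less memory
```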

Social and Techno Tales of Berlin and Seattle: Conversations in off-topic contrasted the social and weather landscapes of Seattle and Berlin. Berlin was touted for its techno scene and startup friendliness, moderated by its own share of gloomy weather.

Tokenizer Tales and Training Talk: An extensive dialog on self-implementing tokenizers and dataset handling ensued, considering compression and cloud storage options. Large-scale training on H100 GPUs remains cost-prohibitive, while granular discussions on GPU specs informed model optimization. Training experiments continue apace, with one reported to approach GPT-3-level performance.


Nous Research AI Discord

Playing with Big Contexts: An engineer suggested training a Large Language Model (LLM) with an extremely long context window, on the theory that, given sufficient context, an LLM can predict well even from a smaller dataset.

The Unbiased Evaluation Dilemma: Concerns were raised about Scale’s involvement with both supplying data for and evaluating machine learning models, highlighting a potential conflict of interest that could influence the impartiality of model assessments.

Understanding RAG Beyond the Basics: Technical discussions elucidated the complexities of Retrieval-Augmented Generation (RAG) systems, stressing that RAG is not just vector-similarity matching but involves a suite of other processes, like re-ranking and full-text search, as highlighted by discussions and resources like RAGAS.
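A toy sketch of that point, layering a keyword signal on top of vector similarity before a re-rank over the shortlist; pure Python, with a hashed bag-of-words standing in for a real embedding model:

```python
import math
from collections import Counter

def embed(text):
    """Stand-in embedding: normalized hashed bag of words (a real system uses a model)."""
    vec = Counter(hash(w) % 64 for w in text.lower().split())
    norm = math.sqrt(sum(c * c for c in vec.values()))
    return {k: v / norm for k, v in vec.items()}

def cosine(a, b):
    return sum(a[k] * b.get(k, 0.0) for k in a)

def keyword_score(query, doc):
    """Crude full-text signal: fraction of query terms present in the document."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q)

def retrieve(query, docs, alpha=0.5, k=2):
    qv = embed(query)
    scored = [(alpha * cosine(qv, embed(d)) + (1 - alpha) * keyword_score(query, d), d)
              for d in docs]
    return sorted(scored, reverse=True)[:k]  # a real pipeline re-ranks this shortlist

docs = ["RAG combines retrieval with generation",
        "Re-ranking improves retrieval quality",
        "Bananas are yellow"]
print(retrieve("how does RAG retrieval work", docs))
```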

Doubled Prices and Doubled Concerns: Google's decision to increase the price for Gemini 1.5 Flash output sparked a heated debate, with engineers calling out the unsustainable pricing strategy and questioning the reliability of the API’s cost structure.

Gradient Accumulation Scrutiny: A topic arose around avoiding gradient accumulation in model training, with engineers referring to Google's tuning playbook for insights, while also discussing the concept of ref_model in DPO training as per Hugging Face's documentation.
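For the mechanics behind the debate, a minimal PyTorch sketch of gradient accumulation (our own illustration; DPO's ref_model is a separate concern that trainers like TRL's DPOTrainer handle for you):

```python
import torch

model = torch.nn.Linear(16, 1)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
accum_steps = 4  # emulate a 4x larger batch on the same memory budget

data = [(torch.randn(8, 16), torch.randn(8, 1)) for _ in range(8)]
for step, (x, y) in enumerate(data):
    loss = torch.nn.functional.mse_loss(model(x), y)
    (loss / accum_steps).backward()   # scale so accumulated grads average, not sum
    if (step + 1) % accum_steps == 0:
        opt.step()                    # one optimizer update per accumulation window
        opt.zero_grad()
```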


LM Studio Discord


Modular (Mojo 🔥) Discord

Mojo Gets a Memory Lane: A blog post illuminated Mojo's approach to memory management with ownership as a central focus, advocating a safe yet high-performance programming model. Chris Lattner's video was highlighted as a resource for digging deeper into the ownership concept within Mojo's compiler systems. Read more about it in their blog entry.

Alignment Ascendancy: Engineers stressed the importance of 64-byte alignment in tables to utilize the full potency of AVX512 instructions and enhance caching efficiency. They also highlighted the necessity of alignment for the prefetcher to perform optimally, and the issue of false sharing in multithreaded contexts.
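The same concern translates outside Mojo; a small NumPy sketch (our own illustration, not from the discussion) that over-allocates and slices so the data pointer lands on a 64-byte boundary, i.e. one cache line and one 512-bit AVX512 register:

```python
import numpy as np

def aligned_zeros(shape, dtype=np.float32, alignment=64):
    """Allocate an array whose data pointer is aligned to `alignment` bytes."""
    dtype = np.dtype(dtype)
    nbytes = int(np.prod(shape)) * dtype.itemsize
    buf = np.zeros(nbytes + alignment, dtype=np.uint8)  # over-allocate by one boundary
    offset = (-buf.ctypes.data) % alignment             # distance to the next boundary
    return buf[offset:offset + nbytes].view(dtype).reshape(shape)

a = aligned_zeros((1024, 1024))
assert a.ctypes.data % 64 == 0  # one cache line / one AVX512 register
```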

Optional Dilemmas and Dict Puzzles in Mojo: In the nightly branch conversations, the use of Optional with the ref API sparked extensive discussion, with participants considering Rust's ? operator as a constructive comparison. A related GitHub issue also focused on a bug with InlineArray failing to invoke destructors of its elements.

The Prose of Proposals and Compilations: The merits of naming conventions within auto-dereferenced references were rigorously debated, with the idea floated to rename Reference to TrackedPointer and Pointer to UntrackedPointer. Additionally, the latest nightly Mojo compiler release 2024.5.2912 brought updates like async function borrow restrictions with a comprehensive changelog available.

AI Expands Horizons in Open-World Gaming: An assertion was raised that open-world games could reach new pinnacles if AI could craft worlds dynamically from a wide range of online models, responding to user interactions. This idea suggests a significant opportunity for AI's role in gaming advancements.


Eleuther Discord


OpenAI Discord


Interconnects (Nathan Lambert) Discord


Stability.ai (Stable Diffusion) Discord


LlamaIndex Discord


Latent Space Discord


OpenRouter (Alex Atallah) Discord


LAION Discord


LangChain AI Discord

Educational content on routing logic in agent flows using LangChain was disseminated via a YouTube tutorial, assisting community members in enhancing their automated agents' decision-making pathways.


OpenInterpreter Discord


OpenAccess AI Collective (axolotl) Discord

"Not Safe for Work" Spam Cleanup: Moderators in the OpenAccess AI Collective (axolotl) swiftly responded to an alert regarding NSFW Discord invite links being spammed across channels, with the spam promptly addressed.

Quest for Multi-Media Model Mastery: An inquiry about how to fine-tune large language models (LLMs) like LLava models for image and video comprehension was posed in the general channel, yet it remains unanswered.

Gradient Checkpointing for MoE: A member of the axolotl-dev channel proposed an update to Unsloth's gradient checkpointing to support MoE architecture, with a pull request (PR) upcoming after verification.

Bug Hunt for Bin Packing: A development update pointed to an improved bin packing algorithm, but highlighted an issue where training stalled post-evaluation, likely linked to the new sampler's missing _len_est implementation.
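For context, sample packing is a bin-packing problem; here is a generic first-fit-decreasing sketch (our own illustration, not axolotl's sampler), placing variable-length sequences into fixed-token bins to minimize padding:

```python
def pack_sequences(lengths, max_tokens):
    """First-fit decreasing: place each sequence in the first bin it fits."""
    bins = []  # each bin: [remaining_capacity, [sequence indices]]
    for idx in sorted(range(len(lengths)), key=lambda i: -lengths[i]):
        length = lengths[idx]
        for b in bins:
            if b[0] >= length:
                b[0] -= length
                b[1].append(idx)
                break
        else:
            bins.append([max_tokens - length, [idx]])
    return [indices for _, indices in bins]

print(pack_sequences([900, 300, 700, 100, 500], max_tokens=1024))
# -> [[0, 3], [2, 1], [4]]: three batches instead of five
```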

Sampler Reversion Pulls Interest: A PR was shared to revert multipack batch sampler changes due to flawed loss calculations, underscoring the importance of precise metric evaluation in model training.


Cohere Discord

Rethinking PDF Finetuning with RAG: A member proposed Retrieval Augmented Generation (RAG) as a smarter alternative to traditional JSONL finetuning for handling PDFs, claiming it can eliminate the finetuning step entirely.

API-Specific Grounded Generation Insights: API documentation was cited to show how to use the response.citations feature within the grounded generation framework, and an accompanying Hugging Face link was provided as a reference.
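A sketch of that flow, assuming the Cohere Python SDK's chat endpoint as it stood around this time (a documents list in the request, a citations field on the response); treat the field names as approximate and defer to the cited API docs:

```python
import cohere

co = cohere.Client("YOUR_API_KEY")

response = co.chat(
    message="What does the report say about Q1 revenue?",
    documents=[  # grounding documents the model may cite
        {"title": "Q1 report", "snippet": "Revenue grew 12% year over year in Q1."},
    ],
)

print(response.text)
for citation in response.citations or []:  # spans tying the answer back to sources
    print(citation.start, citation.end, citation.text, citation.document_ids)
```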

Local R+ Innovation with Forced Citations: An engineer shared a hands-on achievement in integrating a RAG pipeline with forced citation display within a local Command R+ setup, demonstrating a reliable way to maintain source attributions.

Cohere's Discord Bot Usage Underlines Segmented Discussions: Enthusiasm around a Discord bot powered by Cohere sparked a reminder to keep project talk within its dedicated channel to maintain order and focus within the community discussions.

Channel Etiquette Encourages Project Segregation: Recognition for a community-built Discord bot was followed by guidance to move detailed discussions to a specified project channel, ensuring adherence to the guild's organizational norms.


tinygrad (George Hotz) Discord

xAI Secures a Whopping $6 Billion: Elon Musk's xAI has successfully raised $6 billion, with notable investors such as Andreessen Horowitz and Sequoia Capital. The funds are aimed at bringing initial products to market, building out infrastructure, and advancing research and development of future technologies.

Skepticism Cast on Unnamed Analytical Tools: A guild member expressed skepticism about certain analytical tools, considering them to have "negligible usefulness," although they did not specify which tools were under scrutiny.

New Language Bend Gains Attention: The Bend programming language was acclaimed for its ability to "automatically multi-thread without any code," a feature that complements tinygrad's lazy execution strategy, as shown in a Fireship video.

tinybox Power Supply Query: A question arose about the power supply requirements for tinybox, inquiring whether it utilizes "two consumer power supplies or two server power supplies with a power distribution board," but no resolution was provided.

Link Spotlight: An article from The Verge on xAI’s funding notably asks what portion of that capital will be allocated to acquiring GPUs, a key concern for AI Engineers regarding compute infrastructure.


DiscoResearch Discord


Mozilla AI Discord

Windows Woes with Llamafile: An engineer encountered an issue while compiling llamafile on Windows, pointing out a problem with cosmoc++ where the build fails due to executables not launching without a .exe suffix. Despite the system reporting a missing file, the engineer confirmed its presence in the directory .cosmocc/3.3.8/bin, and faced the same issue using cosmo bash.


Datasette - LLM (@SimonW) Discord


MLOps @Chipro Discord

A Peek Into the Technical Exchange: A user briefly mentioned finding a paper relevant to their interests, thanked another member for sharing it, and expressed intent to review it. However, no details about the paper's content, title, or field of study were provided.


The LLM Perf Enthusiasts AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI Stack Devs (Yoko Li) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The YAIG (a16z Infra) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

Perplexity AI ▷ #general (1007 messages🔥🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (3 messages):


Perplexity AI ▷ #pplx-api (2 messages):


HuggingFace ▷ #general (951 messages🔥🔥🔥):

Links mentioned:


HuggingFace ▷ #today-im-learning (2 messages):


HuggingFace ▷ #cool-finds (14 messages🔥):

Links mentioned:


HuggingFace ▷ #i-made-this (8 messages🔥):

Links mentioned:


HuggingFace ▷ #reading-group (9 messages🔥):


HuggingFace ▷ #computer-vision (18 messages🔥):

Links mentioned:


HuggingFace ▷ #NLP (1 messages):


HuggingFace ▷ #diffusion-discussions (4 messages):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #general (656 messages🔥🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #random (3 messages):


Unsloth AI (Daniel Han) ▷ #help (59 messages🔥🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #general (74 messages🔥🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #workshop-1 (4 messages):


LLM Finetuning (Hamel + Dan) ▷ #asia-tz (3 messages):


LLM Finetuning (Hamel + Dan) ▷ #🟩-modal (21 messages🔥):

Link mentioned: Sign in: Welcome back to Modal! Sign in to your Modal account by selecting an identity provider below.


LLM Finetuning (Hamel + Dan) ▷ #learning-resources (6 messages):

Link mentioned: What We Learned from a Year of Building with LLMs (Part I)


LLM Finetuning (Hamel + Dan) ▷ #jarvis-labs (8 messages🔥):

Link mentioned: JLClient | Jarvislabs : JLClient is a Python API for Interacting with Jarvislabs.ai for the complete lifecycle of GPU instances.


LLM Finetuning (Hamel + Dan) ▷ #hugging-face (15 messages🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #ankurgoyal_textsql_llmevals (53 messages🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #berryman_prompt_workshop (141 messages🔥🔥):

- **Highly recommend John Berryman's book**: John Berryman's Prompt Engineering book on O'Reilly promises to be a comprehensive guide for developers, solidifying LLM principles and prompt engineering techniques useful for practical applications. Discover it [here](https://learning.oreilly.com/library/view/prompt-engineering-for/9781098156145/).
- **Exploring Prompt Engineering tools and frameworks**: Members shared numerous resources including links to [Hamel's notes](https://hamel.dev/notes/llm/openai/func_template.html), GoEx and reflection agent techniques via [Langchain blog](https://blog.langchain.dev/reflection-agents/), and JSON Schema details on [Notion](https://www.notion.so/matijagrcic/JSON-Schema-78055af9ce1242e8b9be27918056be2f?pvs=4).
- **Interesting insights about LLM behavior and tuning**: Members discussed how underlying principles of computation give rise to capabilities of LLMs, including references to chaining reasoning and action through frameworks like ReAct. Check the paper [ReAct: Synergizing Reasoning and Acting in Language Models](https://www.promptingguide.ai/techniques/react).
- **Copilot chatbot tips**: Several members shared experiences with AI-assisted coding tools like GitHub Copilot and Cursor, recommending examining workspace context and inline chat utilities. See [Copilot workspace context](https://code.visualstudio.com/docs/copilot/workspace-context#_tips-for-using-workspace) for optimizing workspace-based inquiries.
- **Function calling and evaluation techniques**: Discussions surfaced about leveraging frameworks/tools like [Anthropic's XML tags](https://docs.anthropic.com/en/docs/use-xml-tags) and dynamically selecting few-shot examples via libraries that compute Levenshtein distances or embeddings (a minimal selection sketch follows).
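The dynamic few-shot selection mentioned in the last bullet is straightforward to sketch in pure Python with edit distance (an embedding-based variant would swap levenshtein for cosine distance over embedded inputs; the "input" field is a hypothetical example-pool schema):

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]

def pick_few_shot(query, pool, k=3):
    """Return the k pool examples whose inputs are closest to the query."""
    return sorted(pool, key=lambda ex: levenshtein(query, ex["input"]))[:k]

pool = [{"input": "refund my order", "output": "..."},
        {"input": "cancel subscription", "output": "..."},
        {"input": "reset my password", "output": "..."}]
print(pick_few_shot("please refund this order", pool, k=2))
```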

Links mentioned:

- Tool Invocation – Demonstrating the Marvel of GPT's Flexibility · Thought Box
- Notion – The all-in-one workspace for your notes, tasks, wikis, and databases
- Use XML tags - Anthropic
- Hamel's Blog - Function prompts: How is OpenAI formatting its prompt for function calls?
- Relevant Search: demystifies relevance work, using Elasticsearch to teach you how to return engaging search results to your users
- Gorilla
- Prompt Engineering v2 (Compressed): Prompt Engineering, John Berryman
- Prompt Engineering for LLMs
- Language Models are Few-Shot Learners
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
- Large Language Models are Zero-Shot Reasoners
- Tweet from Eric Hartford (@erhartford): Cognitive Computations presents Dolphin-2.9.2-Mixtral-8x22b, trained with the new SystemChat 2.0 dataset, designed to teach Dolphin to obey the system prompt even over a long conversation.


LLM Finetuning (Hamel + Dan) ▷ #whitaker_napkin_math (1 messages):

computer_internet_man: 🧠🍿


LLM Finetuning (Hamel + Dan) ▷ #workshop-2 (5 messages):


LLM Finetuning (Hamel + Dan) ▷ #workshop-3 (199 messages🔥🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #yang_mistral_finetuning (1 messages):

init27_sanyam: We have more stuff to ask about 😄 https://mistral.ai/news/codestral/


LLM Finetuning (Hamel + Dan) ▷ #gradio (1 messages):


LLM Finetuning (Hamel + Dan) ▷ #axolotl (17 messages🔥):

Link mentioned: axolotl/src/axolotl/prompters.py at 8a20a7b711a62d7b04e742f3d6034b4ca8aa27d2 · OpenAccess-AI-Collective/axolotl: Go ahead and axolotl questions.


LLM Finetuning (Hamel + Dan) ▷ #wing-axolotl (25 messages🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #freddy-gradio (8 messages🔥):

Link mentioned: Limit oauth logins · Issue #8405 · gradio-app/gradio: I have searched to see if a similar issue already exists. Received this question about logging in with HF on discord. Posting here for visibility: Can you limit the list of allowed logins (username...


LLM Finetuning (Hamel + Dan) ▷ #allaire_inspect_ai (24 messages🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #credits-questions (46 messages🔥):

Links mentioned:


LLM Finetuning (Hamel + Dan) ▷ #eugeneyan_evaluator_model (3 messages):

- **Discussion Hub Redirect**: Members identified a primary channel for questions on finetuning, suggesting that most queries might be happening in [this channel](https://discord.com/channels/1238365980128706560/1245100755787186298).
- **Training Summarization Evaluator Models**: One member shared their appreciation for a recent talk on improving summarization models by first training on a larger set (USB) before fine-tuning on a smaller, targeted dataset (FIB). The takeaway is that this method significantly boosts the evaluator model's performance on the specific dataset they care about, highlighting how "training on an additional dataset followed by the dataset we care about drastically improves performance."

LLM Finetuning (Hamel + Dan) ▷ #fireworks (9 messages🔥):

Link mentioned: Fireworks Credits - Mastering LLMs : A Conference For Developers & Data Scientists: Please fill the below form to get $250 Fireworks credits! Join our discord for questions/help or more credits ;) https://discord.gg/fireworks


LLM Finetuning (Hamel + Dan) ▷ #braintrust (3 messages):

- **Greetings flood the channel**: Members exchanged greetings with each other. *"Hello all 👋,"* one member said, receiving a wave of *"👋🏽" and "hi"* in response.

LLM Finetuning (Hamel + Dan) ▷ #west-coast-usa (7 messages):

Link mentioned: An evening with three AI investors · Luma: Please join us on Thursday May 30th at Solaris AI for a panel discussion about investing in AI startups. Our panelists are: - Yoko Li - Josh Buckley - Lenny…


LLM Finetuning (Hamel + Dan) ▷ #east-coast-usa (14 messages🔥):


LLM Finetuning (Hamel + Dan) ▷ #europe-tz (25 messages🔥):


LLM Finetuning (Hamel + Dan) ▷ #announcements (3 messages):

Links mentioned:


CUDA MODE ▷ #general (3 messages):


CUDA MODE ▷ #triton (16 messages🔥):

Links mentioned:


CUDA MODE ▷ #torch (19 messages🔥):

Links mentioned:


CUDA MODE ▷ #cool-links (1 messages):

Links mentioned:


CUDA MODE ▷ #torchao (19 messages🔥):

Link mentioned: GitHub - microsoft/BitBLAS: a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.


CUDA MODE ▷ #off-topic (26 messages🔥):

- **Seattle disappoints due to gloomy weather**: A user shared their negative experience of living in Seattle, stating it to be "the least social city" due to the dark and rainy weather for about 9 months a year. They emphasized that while Seattle is beautiful in the summer, it can be quite isolating during the rest of the year due to weather conditions.

- **Berlin shines with hacker/startup community**: Another user pointed out that Berlin has a vibrant hacker/startup community and everyone speaks English, making it easier for newcomers. They specifically mentioned Berlin’s appeal to those interested in techno parties and local cuisine like kebabs.

- **Berlin weather reality check**: Contrary to the idyllic images of Berlin shared, users warned about the long gloomy winters, with temperatures dropping as low as -10 °C. However, they noted that the spring and summer periods in Berlin are very enjoyable.

- **Tech scene in Berlin and career advice**: Suggestions included working at small startups or companies like Amazon and Zalando if moving to Berlin. However, they advised gaining big tech experience in cities like SF or NYC for better future opportunities, such as raising funding for startups.

Link mentioned: Tweet from Isa Rus (@Isarusphoto): Berlin in February


CUDA MODE ▷ #llmdotc (215 messages🔥🔥):

Links mentioned:

- [eval_results.csv · HuggingFaceFW/fineweb at main](https://huggingface.co/datasets/HuggingFaceFW/fineweb/blob/main/eval_results.csv)
- [Amazon S3 Simple Storage Service Pricing - Amazon Web Services](https://aws.amazon.com/s3/pricing/?p=pm&c=s3&z=4)
- [Zenodo](https://zenodo.org/)
- [softmax_autoregressive_backward_kernel does not use share memory in the kernel by huoyushequ · Pull Request #487 · karpathy/llm.c](https://github.com/karpathy/llm.c/pull/487)
- [OpenWebText](https://zenodo.org/records/3834942): an open-source replication of the WebText dataset from OpenAI (https://skylion007.github.io/OpenWebTextCorpus/)
- [NVIDIA RTX A5500 Specs](https://www.techpowerup.com/gpu-specs/rtx-a5500.c3901): NVIDIA GA102, 1665 MHz, 10240 cores, 320 TMUs, 96 ROPs, 24576 MB GDDR6, 2000 MHz, 384-bit

CUDA MODE ▷ #oneapi (1 messages):

orion160: What are tools to debug SYCL code? In general stepping into kernel code....


CUDA MODE ▷ #bitnet (94 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #ctx-length-research (2 messages):


Nous Research AI ▷ #off-topic (12 messages🔥):


Nous Research AI ▷ #interesting-links (9 messages🔥):

Links mentioned:


Nous Research AI ▷ #general (256 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (16 messages🔥):

Links mentioned:


Nous Research AI ▷ #rag-dataset (15 messages🔥):

Links mentioned:


Nous Research AI ▷ #world-sim (6 messages):


LM Studio ▷ #💬-general (62 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (19 messages🔥):

Links mentioned:


LM Studio ▷ #📝-prompts-discussion-chat (5 messages):


LM Studio ▷ #⚙-configs-discussion (3 messages):


LM Studio ▷ #🎛-hardware-discussion (92 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🧪-beta-releases-chat (2 messages):


LM Studio ▷ #amd-rocm-tech-preview (9 messages🔥):

Link mentioned: Gigabyte AMD Radeon RX 7900 XT GAMING OC Graphics Card for Gaming - 20GB | Ebuyer.com


LM Studio ▷ #model-announcements (1 messages):


Modular (Mojo 🔥) ▷ #general (75 messages🔥🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #💬︱twitter (1 messages):

ModularBot: From Modular: https://twitter.com/Modular/status/1795883558608973828


Modular (Mojo 🔥) ▷ #✍︱blog (1 messages):

Link mentioned: Modular: What Ownership is Really About: A Mental Model Approach


Modular (Mojo 🔥) ▷ #tech-news (1 messages):


Modular (Mojo 🔥) ▷ #🔥mojo (35 messages🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #performance-and-benchmarks (7 messages):


Modular (Mojo 🔥) ▷ #nightly (53 messages🔥):

Links mentioned:


Eleuther ▷ #general (24 messages🔥):

Links mentioned:


Eleuther ▷ #research (43 messages🔥):

Links mentioned:


Eleuther ▷ #scaling-laws (90 messages🔥🔥):

Links mentioned:


Eleuther ▷ #lm-thunderdome (9 messages🔥):

Link mentioned: gist:0004bf39a3cec65262cf72f556c316c4 (GitHub Gist)


OpenAI ▷ #annnouncements (1 messages):


OpenAI ▷ #ai-discussions (100 messages🔥🔥):


OpenAI ▷ #gpt-4-discussions (30 messages🔥):


OpenAI ▷ #prompt-engineering (3 messages):


OpenAI ▷ #api-discussions (3 messages):


Interconnects (Nathan Lambert) ▷ #news (60 messages🔥🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-drama (30 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #random (4 messages):


Interconnects (Nathan Lambert) ▷ #memes (10 messages🔥):


Interconnects (Nathan Lambert) ▷ #rl (3 messages):

Link mentioned: GitHub - aalmuzairee/dmcgb2: Official release of the DMControl Generalization Benchmark 2 (DMC-GB2)


Interconnects (Nathan Lambert) ▷ #posts (7 messages):


Interconnects (Nathan Lambert) ▷ #retort-podcast (5 messages):

Link mentioned: Tweet from Andrew Carr (e/🤸) (@andrew_n_carr): cool new alignment research from OpenAI. they generate synthetic data that encourages "instruction hierarchy" where system prompts are treated as more important by the model. this then pre...


Stability.ai (Stable Diffusion) ▷ #general-chat (117 messages🔥🔥):

Links mentioned:


LlamaIndex ▷ #announcements (1 messages):

Link mentioned: Tweet from LlamaIndex 🦙 (@llama_index): We’re excited to launch a huge feature making @llama_index the framework for building knowledge graphs with LLMs: The Property Graph Index 💫 (There’s a lot of stuff to unpack here, let’s start from ...


LlamaIndex ▷ #blog (5 messages):

- **FinTextQA dataset converges on finance**: The FinTextQA dataset offers **1,262 high-quality, source-attributed question-answer pairs** and covers six different question types. It provides a robust context for document-based financial question answering [source](https://t.co/emhQYXY1S4).
- **PostgresML integrates with LlamaIndex**: If you're into Postgres and AI applications, check out [PostgresML](https://t.co/G7WTrSdt0B). It allows for **local embedding, model training, and fine-tuning** in Python and JavaScript.
- **LlamaIndex launches the Property Graph Index**: The Property Graph Index offers new tools for constructing and querying knowledge graphs with LLMs (**Large Language Models**). This new feature aims to position LlamaIndex as a comprehensive framework for building knowledge graphs [source](https://t.co/X9D3Wl0Hyv); a usage sketch follows this list.
- **Codestral code-gen model now available**: The new **Codestral** model from MistralAI supports over **80 programming languages** and can run locally. LlamaIndex offers **day 0 support** along with a detailed [notebook](https://t.co/k2nHDiMnwD) to demonstrate its usage.
- **Ollama enhances Codestral support**: As a bonus, the Codestral model is fully supported by [Ollama](https://t.co/gsPHHF4c0K), enabling users to run it locally with first-class support.
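As a quick orientation, here is a hedged usage sketch of the new index based on the announcement (exact import paths and defaults may differ; consult the LlamaIndex docs for the current API):

```python
# pip install llama-index  (assumed; API as of the launch announcement)
from llama_index.core import SimpleDirectoryReader, PropertyGraphIndex

documents = SimpleDirectoryReader("./data").load_data()

# Extracts a labeled property graph from the documents with an LLM,
# then lets you query it like any other LlamaIndex index.
index = PropertyGraphIndex.from_documents(documents)
query_engine = index.as_query_engine()
print(query_engine.query("How are the main entities related?"))
```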

LlamaIndex ▷ #general (107 messages🔥🔥):

Links mentioned:


Latent Space ▷ #ai-general-chat (72 messages🔥🔥):

- **Gemini 1.5 impresses with performance**: After the release of the Gemini 1.5 results, it was noted that **Gemini 1.5 Pro/Advanced** ranks second, closely trailing GPT-4o, and **Gemini 1.5 Flash** ranks ninth, outperforming models like Llama-3-70b. The comprehensive breakdown can be found on [LMSysOrg's Twitter](https://x.com/lmsysorg/status/1795512202465845686).

- **Insights from building with LLMs**: The article "[What We Learned from a Year of Building with LLMs](https://www.oreilly.com/radar/what-we-learned-from-a-year-of-building-with-llms-part-i/)" discusses the rapid advancement of LLMs and the challenges in building effective AI products beyond demos.

- **Excitement over SWE-agent's potential**: After Princeton researchers unveiled the **SWE-agent**, claims about its superior performance and its open-source nature sparked interest. More details were shared on [Gergely Orosz's Twitter](https://x.com/GergelyOrosz/status/1794743519954731331) and the [SWE-agent GitHub](https://github.com/princeton-nlp/SWE-agent).

- **New open-source VLM model - Llama3-V**: The **Llama3-V** model claims to outperform **LLaVA** and compete closely with models like GPT4-V, emphasizing its efficiency with a significantly smaller model size. Details and access links were provided on [Sidd Rsh's Twitter](https://x.com/siddrrsh/status/1795541002620727439).

- **Scale announces SEAL Leaderboards for LLM evaluations**: **Scale's SEAL Leaderboards** aims to offer private, expert evaluations to ensure robust and non-exploitable model assessments. The initiative was highlighted by [Alexandr Wang](https://x.com/alexandr_wang/status/1795857651592491281) and received commendation from [Andrej Karpathy](https://x.com/karpathy/status/1795873666481402010).

Links mentioned:


Latent Space ▷ #ai-announcements (1 messages):

Link mentioned: LLM Paper Club (AI Agent Architectures + Kolmogorov Arnold Networks) · Zoom · Luma: a 2-for-1! Eric Ness will cover https://arxiv.org/abs/2404.11584 (The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A…


Latent Space ▷ #llm-paper-club-west (2 messages):

There are no messages to summarize for the channel llm-paper-club-west.

OpenRouter (Alex Atallah) ▷ #announcements (2 messages):


OpenRouter (Alex Atallah) ▷ #general (51 messages🔥):

Links mentioned:


LAION ▷ #general (23 messages🔥):

Link mentioned: Reddit - Dive into anything


LAION ▷ #research (17 messages🔥):

Link mentioned: Phased Consistency Model: The consistency model (CM) has recently made significant progress in accelerating the generation of diffusion models. However, its application to high-resolution, text-conditioned image generation in ...


LangChain AI ▷ #general (26 messages🔥):

Links mentioned:


LangChain AI ▷ #langserve (1 messages):

Link mentioned: langserve/examples/chat_with_persistence_and_user/client.ipynb at main · langchain-ai/langserve


LangChain AI ▷ #share-your-work (1 messages):

Link mentioned: How to Route Logic in Your Agent Flows: Simple example of how to use routing logic in your agent flows with Visual Agents, built on LangChain. https://visualagents.ai https://langchain.ai


OpenInterpreter ▷ #general (18 messages🔥):

Link mentioned: All Settings - Open Interpreter


OpenInterpreter ▷ #O1 (3 messages):


OpenInterpreter ▷ #ai-content (1 messages):

mikebirdtech: https://www.youtube.com/watch?v=sqwtk18pw14


OpenAccess AI Collective (axolotl) ▷ #general (4 messages):


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (9 messages🔥):

Link mentioned: revert multipack batch sampler changes by winglian · Pull Request #1672 · OpenAccess-AI-Collective/axolotl: The loss isn't quite right w/ #1619, off by an order of magnitude.


Cohere ▷ #general (6 messages):

Link mentioned: CohereForAI/c4ai-command-r-plus · Hugging Face


tinygrad (George Hotz) ▷ #general (4 messages):

Link mentioned: Elon Musk’s xAI raises $6 billion to fund its race against ChatGPT and all the rest: How much of that money is going to be spent on GPUs?


DiscoResearch ▷ #general (4 messages):

Links mentioned:


Mozilla AI ▷ #llamafile (3 messages):


Datasette - LLM (@SimonW) ▷ #llm (2 messages):

- **Retrieval Augmented Generation can solve hallucination**: A member mentioned frequently using **LLMs** to answer documentation-related questions but facing issues with hallucinations and inaccuracies. They suggested that *pulling the docs, storing embeddings, and using similarity search ("Retrieval Augmented Generation")* could mitigate this and inquired about extending `llm` to create embeddings for a URL recursively.

MLOps @Chipro ▷ #general-ml (1 messages):

yellowturmeric: I haven't. thanks for sharing. I'll take a read of this paper.





{% else %}

Perplexity AI Discord


HuggingFace Discord


Unsloth AI (Daniel Han) Discord


LLM Finetuning (Hamel + Dan) Discord

Fine-Tuning Frustrations and Marketplace Musings: Engineers discussed fine-tuning challenges, with concerns over Google's Gemini 1.5 API price hike and difficulties serving fine-tuned models in production. A channel dedicated to LLM-related job opportunities was proposed, and the need for robust JSON/Parquet file handling tools was highlighted.

Ins and Outs of Technical Workshops: Participants exchanged insights on LLM fine-tuning strategies, with emphasis on personalized sales emails and legal document summarization. The practicality of multi-agent LLM collaboration and the optimization of prompts for Stable Diffusion were debated.

Exploring the AI Ecosystem: The community delved into a variety of AI topics, revealing Braintrust as a handy tool for evaluating non-deterministic systems and the O'Reilly Radar insights on the complexities of building with LLMs. Discussions also highlighted the potential of Autoevals for SQL query evaluations.

Toolshed for LLM Work: Engineers tackled practical issues like Modal's opaque failures and Axolotl preprocessing GPU support problems. Queries around using shared storage on Jarvislabs and insights into model quantization on Wing Axolotl were shared, with useful resources and tips sprinkled throughout the discussions.

Code, Craft, and Communities: The community vibe flourished with talk of LLM evaluator models, the desirability of Gradio's UI over Streamlit, and the convening of meet-ups from San Diego to NYC. The vibrant exchanges covered technical ground but also nurtured the social fabric of the AI engineering realm.


CUDA MODE Discord

GPGPU Programming Embraces lightning.ai: Engineers discussed lightning.ai as a commendable option for GPGPU programming, especially for those lacking access to NVIDIA hardware commonly used for CUDA and SYCL development.

Easing Triton Development: Developers found triton_util, a utility package simplifying Triton kernel writing, useful for abstracting repetitive tasks, promoting a more intuitive experience. Performance leaps using Triton on NVIDIA A6000 GPUs were observed, while tackling bugs became a focus when dealing with large tensors above 65GB.

Nightly Torch Supports Python 3.12: The PyTorch community highlighted torch.compile issues on Python 3.12, with nightly builds providing some resolutions. Meanwhile, the deprecation of macOS x86 builds in Torch 2.3 sparked discussions about transitioning to M1 chips or Linux.

Tom Yeh Enhances AI Fundamentals: Prof Tom Yeh is gaining traction by sharing hand calculation exercises on AI concepts. His series comprises Dot Product, Matrix Multiplication, Linear Layer, and Activation workbooks.

Quantum Leaps in Quantization: Engineers are actively discussing and improving quantization processes with libraries like bitsandbytes and fbgemm_gpu, as well as participating in competitions such as NeurIPS. Efforts on Llama2-7B and the FP6-LLM repository updates were shared alongside appreciating the torchao community's supportive nature.

SYCL Debugging Tools Sought: A single inquiry about debugging SYCL code was shared, highlighting the need for tools that can analyze kernel code and step into kernels during debugging.

Turbocharge Development with bitnet PRs: Various technical issues were addressed in the bitnet channel, including ImportError challenges related to mismatches between PyTorch/dev versions and CUDA, and compilation woes on university servers resolved via a gcc 12.1 upgrade. Collaborative PR work on bit packing and CI improvements were discussed, with resources provided for bit-level operations and error resolution (BitBLAS on GitHub, ao GitHub issue).

Social and Techno Tales of Berlin and Seattle: Conversations in off-topic contrasted the social and weather landscapes of Seattle and Berlin. Berlin was touted for its techno scene and startup friendliness, moderated by its own share of gloomy weather.

Tokenizer Tales and Training Talk: An extensive dialog on self-implementing tokenizers and dataset handling ensued, considering compression and cloud storage options. Large-scale training on H100 GPUs remains cost-prohibitive, while granular discussions on GPU specs informed model optimization. Training experiments continue apace, with one reported to approach GPT-3-level performance.


Nous Research AI Discord

Playing with Big Contexts: An engineer suggested training a Large Language Model (LLM) with an extremely long context window, on the theory that, given sufficient context, an LLM can predict well even from a smaller dataset.

The Unbiased Evaluation Dilemma: Concerns were raised about Scale’s involvement with both supplying data for and evaluating machine learning models, highlighting a potential conflict of interest that could influence the impartiality of model assessments.

Understanding RAG Beyond the Basics: Technical discussions elucidated the complexities of Retrieval-Augmented Generation (RAG) systems, stressing that RAG is not just vector-similarity matching but involves a suite of other processes, like re-ranking and full-text search, as highlighted by discussions and resources like RAGAS.

Doubled Prices and Doubled Concerns: Google's decision to increase the price for Gemini 1.5 Flash output sparked a heated debate, with engineers calling out the unsustainable pricing strategy and questioning the reliability of the API’s cost structure.

Gradient Accumulation Scrutiny: A topic arose around avoiding gradient accumulation in model training, with engineers referring to Google's tuning playbook for insights, while also discussing the concept of ref_model in DPO training as per Hugging Face's documentation.


LM Studio Discord


Modular (Mojo 🔥) Discord

Mojo Gets a Memory Lane: A blog post illuminated Mojo's approach to memory management with ownership as a central focus, advocating a safe yet high-performance programming model. Chris Lattner's video was highlighted as a resource for digging deeper into the ownership concept within Mojo's compiler systems. Read more about it in their blog entry.

Alignment Ascendancy: Engineers stressed the importance of 64-byte alignment in tables to utilize the full potency of AVX512 instructions and enhance caching efficiency. They also highlighted the necessity of alignment for the prefetcher to perform optimally, and the issue of false sharing in multithreaded contexts.

Optional Dilemmas and Dict Puzzles in Mojo: In the nightly branch conversations, the use of Optional with the ref API sparked extensive discussion, with participants considering Rust's ? operator as a constructive comparison. A related GitHub issue also focused on a bug with InlineArray failing to invoke destructors of its elements.

The Prose of Proposals and Compilations: The merits of naming conventions within auto-dereferenced references were rigorously debated, with the idea floated to rename Reference to TrackedPointer and Pointer to UntrackedPointer. Additionally, the latest nightly Mojo compiler release 2024.5.2912 brought updates like async function borrow restrictions with a comprehensive changelog available.

AI Expands Horizons in Open-World Gaming: An assertion was raised that open-world games could reach new pinnacles if AI could craft worlds dynamically from a wide range of online models, responding to user interactions. This idea suggests a significant opportunity for AI's role in gaming advancements.


Eleuther Discord


OpenAI Discord


Interconnects (Nathan Lambert) Discord


Stability.ai (Stable Diffusion) Discord


LlamaIndex Discord


Latent Space Discord


OpenRouter (Alex Atallah) Discord


LAION Discord


LangChain AI Discord

Educational content on routing logic in agent flows using LangChain was disseminated via a YouTube tutorial, assisting community members in enhancing their automated agents' decision-making pathways.


OpenInterpreter Discord


OpenAccess AI Collective (axolotl) Discord

"Not Safe for Work" Spam Cleanup: Moderators in the OpenAccess AI Collective (axolotl) swiftly responded to an alert regarding NSFW Discord invite links being spammed across channels, with the spam promptly addressed.

Quest for Multi-Media Model Mastery: An inquiry about how to fine-tune large language models (LLMs) like LLava models for image and video comprehension was posed in the general channel, yet it remains unanswered.

Gradient Checkpointing for MoE: A member of the axolotl-dev channel proposed an update to Unsloth's gradient checkpointing to support MoE architecture, with a pull request (PR) upcoming after verification.

Bug Hunt for Bin Packing: A development update pointed to an improved bin packing algorithm, but highlighted an issue where training stalled post-evaluation, likely linked to the new sampler's missing _len_est implementation.

Sampler Reversion Pulls Interest: A PR was shared to revert multipack batch sampler changes due to flawed loss calculations, underscoring the importance of precise metric evaluation in model training.


Cohere Discord

Rethinking PDF Finetuning with RAG: A member proposed Retrieval Augmented Generation (RAG) as a smarter alternative to traditional JSONL finetuning for handling PDFs, claiming it can eliminate the finetuning step entirely.

API-Specific Grounded Generation Insights: API documentation was cited to show how to use the response.citations feature within the grounded generation framework, and an accompanying Hugging Face link was provided as a reference.

Local R+ Innovation with Forced Citations: An engineer shared a hands-on achievement in integrating a RAG pipeline with forced citation display within a local Command R+ setup, demonstrating a reliable way to maintain source attributions.

Cohere's Discord Bot Usage Underlines Segmented Discussions: Enthusiasm around a Discord bot powered by Cohere sparked a reminder to keep project talk within its dedicated channel to maintain order and focus within the community discussions.

Channel Etiquette Encourages Project Segregation: Recognition for a community-built Discord bot was followed by guidance to move detailed discussions to a specified project channel, ensuring adherence to the guild's organizational norms.


tinygrad (George Hotz) Discord

xAI Secures a Whopping $6 Billion: Elon Musk's xAI has successfully raised $6 billion, with notable investors such as Andreessen Horowitz and Sequoia Capital. The funds are aimed at bringing initial products to market, building out infrastructure, and advancing research and development of future technologies.

Skepticism Cast on Unnamed Analytical Tools: A guild member expressed skepticism about certain analytical tools, considering them to have "negligible usefulness," although they did not specify which tools were under scrutiny.

New Language Bend Gains Attention: The Bend programming language was acclaimed for its ability to "automatically multi-thread without any code," a feature that complements tinygrad's lazy execution strategy, as shown in a Fireship video.

tinybox Power Supply Query: A question arose about the power supply requirements for tinybox, inquiring whether it utilizes "two consumer power supplies or two server power supplies with a power distribution board," but no resolution was provided.

Link Spotlight: An article from The Verge on xAI’s funding notably asks what portion of that capital will be allocated to acquiring GPUs, a key concern for AI Engineers regarding compute infrastructure.


DiscoResearch Discord


Mozilla AI Discord

Windows Woes with Llamafile: An engineer encountered an issue while compiling llamafile on Windows, pointing out a problem with cosmoc++ where the build fails due to executables not launching without a .exe suffix. Despite the system reporting a missing file, the engineer confirmed its presence in the directory .cosmocc/3.3.8/bin, and faced the same issue using cosmo bash.


Datasette - LLM (@SimonW) Discord


MLOps @Chipro Discord

A Peek Into the Technical Exchange: A user briefly mentioned finding a paper relevant to their interests, thanked another member for sharing it, and expressed intent to review it. However, no details about the paper's content, title, or field of study were provided.


The LLM Perf Enthusiasts AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI Stack Devs (Yoko Li) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The YAIG (a16z Infra) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}