Frozen AI News archive

Music's DALL-E moment

**Google's Griffin architecture** outperforms transformers with faster inference and lower memory usage on long contexts. **Command R+** climbs to 6th place on the LMSYS Chatbot Arena leaderboard, surpassing **GPT-4-0613** and **GPT-4-0314**. **Mistral AI** releases an open-source **8x22B model** with a 64K context window and around 140B total parameters. **Google** open-sources **CodeGemma** models with pre-quantized 4-bit versions for faster downloads. **ELLA weights** enhance Stable Diffusion 1.5 with an LLM for semantic alignment. **Unsloth** enables 4x longer context windows and 80% memory reduction for finetuning. **Andrej Karpathy** releases llm.c, LLM training implemented in pure C, for potential performance gains. **Command R+** runs in realtime on an M2 Max MacBook using iMat q1 quantization. **Cohere's Command R** model offers low API costs and strong leaderboard performance. **Gemini 1.5** impresses with audio capabilities, recognizing speech tone and identifying speakers from audio clips.


While people are still processing yesterday's big Gemini audio, GPT-4T, and Mixtral news, today was Udio's big launch.


You'll have to listen to the samples in the thread to compare it with Suno, which of course has its own fandom. Udio has leaked like a sieve over the last few days, so its launch is no surprise; more surprising was Sonauto, which also launched today and is going after the same music generation game, though far less polished. This feels like an idea whose time has finally come, though unlike with Latent Diffusion, it is unclear what breakthroughs enabled Suno, Udio, and Sonauto all around the same time. You can hear some hints on Suno's Latent Space pod, but that's all you'll get until we release the next music episode.


Table of Contents

[TOC]


AI Reddit Recap

Across r/LocalLlama, r/machinelearning, r/openai, r/stablediffusion, and r/ArtificialInteligence. Comment crawling is not yet implemented, but coming soon.

Here is a summary of the key themes and topics from these Reddit posts, organized into categories with the most relevant posts linked:

AI Models and Architectures

Open Source Efforts

Benchmarks and Comparisons

Multimodal AI


AI Twitter Recap

all recaps done by Claude 3 Opus, best of 4 runs. We are working on clustering and flow engineering with Haiku.

GPT-4 Turbo Model Improvements

Mistral AI's New 8x22B Model Release

Google's New Model Releases and Announcements

Anthropic's Research on Model Persuasiveness

Cohere's Command R+ Model Performance

Meta's New AI Infrastructure and Chip Announcements

Humor and Memes


AI Discord Recap

A summary of Summaries of Summaries

1) New and Upcoming AI Model Releases and Benchmarks

2) Quantization, Efficiency, and Hardware Considerations

3) Open-Source Developments and Community Engagement

4) Prompt Engineering, Instruction Tuning, and Benchmarking Debates


PART 1: High level Discord summaries

Stability.ai (Stable Diffusion) Discord

Super-Resolution Squads Deploy Techniques: Engineers discussed enhancing image quality from video screenshots using super-resolution. They referenced RealBasicVSR, with many looking forward to more advanced video upscalers.

Stirring Stable Diffusion Creativity: Newcomers inquired about creating original content with Stable Diffusion, receiving guidance toward tools and repositories on GitHub. Contributions of demo URLs from experienced users further supported these explorations.

Custom Control Debates Heat Up: Participants debated the customizations within Stable Diffusion, including specific dataset construction, project enhancements, and stylized 'loras' to reflect distinct art styles, indicating a trend toward highly personalized model outputs.

Navigating the AI Legal Labyrinth: Conversations also hinged on the legal and ethical implications of AI-generated content, addressing copyright concerns, lawful generation practices, and potential impacts of legislative developments on the field.

Eager Anticipation for Stable Diffusion 3: There was significant buzz around the anticipated release of Stable Diffusion 3, with special attention to its hand-generation abilities and the question of whether newer models will need negative prompts to avoid undesirable outputs.


LM Studio Discord


Unsloth AI (Daniel Han) Discord


Perplexity AI Discord

Perplexity Pro Stirs Debate: Community members are dissecting the pros and cons of Perplexity Pro, particularly for learning tools like Blender and Unreal Engine, yet some users note limitations in context length compared to other services, with Gemini 1.5 standing out due to its video and audio support.

Model Comparisons and Speculations: Conversations are buzzing around Mistral 8x22b, an open-source model believed to slot between GPT-4 and Sonnet, though its heavy compute requirements limit accessibility. There's also light-hearted banter about future models like "GPT-5" and "Gemini 2.0", paralleled with quips about the anticipated release of "GTA 6".

Tech Mashup: Raycast Meets Perplexity: An announced collaboration between Raycast and Perplexity AI aims to integrate knowledge access into the Mac user experience, as detailed in a tweet from Perplexity. Additionally, there's a mention of AI trumping traditional search engines for quick information retrieval.

Out of the Lab, Into the Code: A new Ruby client for the Perplexity API hit the scene, while users shared workarounds for pasting large texts and advice on model selection for data extraction, noting an upper limit of 199k tokens.

Perplexity API Evolves: Technical issues like API balance top-ups and payment submission bugs were swiftly navigated, with fixes in place and an invitation for DMs if problems persist. Additionally, there's talk of the Perplexity API's capabilities with live web responses and clarity that the Claude Opus model is not currently supported.


Nous Research AI Discord

A Chatbot Refined: StableLM 2 12B Chat is a 12 billion parameter AI optimized for chat via Direct Preference Optimization (DPO), with the user base evaluating its implications compared to other finetuning methods like SFT+KTO and DNO; concerns revolve around quality and ethical considerations of DPO. StableLM 2's model is accessible here.
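
The DPO objective these discussions revolve around is compact enough to sketch. Below is a minimal, self-contained illustration of the loss it optimizes; the tensor values are toy stand-ins for per-sequence log-probabilities, not real model outputs:

```python
import torch
import torch.nn.functional as F

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO pushes the policy's log-ratio (vs. a frozen reference) for the
    preferred completion above the log-ratio for the rejected one."""
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    return -F.logsigmoid(beta * margin).mean()

# toy per-example sequence log-probs under the policy and the reference
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.0]))
print(loss.item())
```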

Mixtral's Rise to the Top: Early benchmarks suggest the Mixtral 8x22b model rivals top-tier models like Command R+ in MMLU evaluations, sparking discussions on the importance of diverse finetuning datasets vs inherited base model capabilities. More details on Mixtral 8x22b.

The Quantum Leap in Model Quantization: Insights were shared on quantization methods, particularly in the context of OLMo-Bitnet-1B with a focus on Quantization Aware Training (QAT) and the use of the Straight-Through Estimator, highlighting an ongoing interest in model efficiency. Here's the paper on the Straight-Through Estimator.
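
For readers unfamiliar with the Straight-Through Estimator mentioned above, here is a minimal sketch using 1-bit weight binarization as in BitNet-style QAT; the clipping rule shown is one common variant, not necessarily OLMo-Bitnet-1B's exact recipe:

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)  # quantized (1-bit) weights in the forward pass

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        # straight-through: gradient flows as if forward were the identity,
        # zeroed outside [-1, 1] to keep weights from drifting
        return grad_out * (w.abs() <= 1).float()

w = torch.randn(4, 4, requires_grad=True)
BinarizeSTE.apply(w).sum().backward()
print(w.grad)  # nonzero despite sign() having zero gradient almost everywhere
```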

Synthesizing for Success: A paper introducing the concept of combining synthetic and real data during model training sparked debate over the potential 'inbreeding' of synthetic data, its impact on the diversity of models' knowledge, and the risk of model collapse. The paper can be found here.

Anticipating WorldSim Updates: The community showed excitement about the upcoming updates to WorldSim, with discussions about the platform's multilingual support and alternatives that can simulate similar experiences using models like Nous Hermes Mixtral. Current local hardware was also highlighted as insufficient for running such advanced models.


Eleuther Discord

RNN Advancements Unraveled: Researchers demonstrated that interpretability tools built for transformers apply to modern RNNs like Mamba and RWKV, sharing their methodology through both a research paper and a GitHub repository and encouraging collaborative RNN language model development.

Mysterious Claude 3 Opus' Size Spawns Speculation: The AI community is buzzing with questions about Claude 3 Opus' unrevealed model size, drawing stark contrasts with the transparency around the GPT-4 scale. Meanwhile, Google's Gemini project faces scrutiny for its conservative image generation policies and the controversial views of its project safety lead.

Benchmarking GPT-4 Turbo: Engineers are looking for reliable benchmarking information for OpenAI's latest models, particularly gpt-4-turbo. The absence of such data makes comparisons and progress evaluations challenging.

AI Governance Gets Legislative Attention: The Generative AI Copyright Disclosure Act, introduced by Congressman Adam Schiff, emerges as a focal legislative effort aimed at enhancing transparency in AI's use of copyrighted material, setting the stage for potential regulatory impacts on the industry.

Emergence of Text Embeddings via LLM: Fresh interest has surfaced around LLM2Vec, an effort that transforms decoder-only LLMs into text encoders with claimed performance boosts, evoking debate about the fairness of comparisons to other models and its practical utility.


OpenAI Discord


Latent Space Discord


HuggingFace Discord

Gemma 1.1 Instruct Outclasses Its Predecessor: Gemma 1.1 Instruct 7B shows promise over its previous version, now available on HuggingChat, and is prompting users to explore its capabilities. The model can be accessed here.

CodeGemma Steps into the Development Arena: A new tool for on-device code completion, CodeGemma, is introduced in 2B and 7B variants with an 8192-token context, and can be found alongside the recent non-transformer model RecurrentGemma here.
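
As a quick orientation, a minimal sketch of trying CodeGemma locally with the transformers library; this assumes the google/codegemma-2b checkpoint id on Hugging Face and that you have accepted the gated-weights license:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/codegemma-2b"  # assumed checkpoint id; weights are gated
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "def fibonacci(n):"
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))
```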

Cost-cutting Operations at HuggingFace: HuggingFace announces a 50% reduction in compute prices for Spaces and Inference Endpoints starting in April, making them more cost-effective than AWS EC2 on-demand instances.

Community Blog Makeover: A revamp of community blogs to "articles" with added features such as upvotes and enhanced visibility within HuggingFace is now in effect. Engage with the new articles format here.

Serverless GPUs Hit the Scenes with Bonus ML Content: Hugging Face showcases serverless GPU inference in collaboration with Cloudflare and furthers education with a new bonus unit on Classical AI in Games in its ML for Games Course. Investigate serverless GPU inference via this link, and explore the course's new content here.

Decoding Python for Debugging: Leverage eager execution in JAX or TensorFlow, use Python's breakpoint() function, and step through PyTorch implementations for effective debugging.
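
For the breakpoint() tip specifically, a minimal example of conditional debugging in a training step (toy numbers, no real model):

```python
def train_step(batch):
    pred = batch["x"] * 2.0            # stand-in for a model forward pass
    loss = (pred - batch["y"]) ** 2
    if loss > 100:                     # drop into pdb only on suspect values
        breakpoint()                   # at the prompt: p loss, p pred, u, c
    return loss

train_step({"x": 3.0, "y": 50.0})
```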

AI Watermark Eradicator Introduced: An AI tool designed to remove watermarks from images has been suggested, benefiting those with extensive batches of watermarked images. Review the tool on GitHub.

GPT-2's Summarization Struggles & Prompting Approach: A user's difficulty using GPT-2 for summarization hints at the importance of prompts matching the model's training era (GPT-2 was coaxed into summarizing via the "TL;DR:" pattern rather than instructions), suggesting a need for updated prompting or newer, instruction-tuned models better suited for summarization.

Navigating CPU & GPU Challenges: Techniques like gradient accumulation or activation checkpointing were discussed as workarounds for batch size limitations when using contrastive loss, acknowledging potential update issues with batchnorm. Tracking GPU usage via nvidia-smi became a point of interest for efficient resource management.
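
A minimal sketch of the accumulation workaround, with the caveats the discussion raised written in as comments (toy encoder and data; InfoNCE-style loss):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(128, 64)                        # stand-in encoder
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
micro_batches = [torch.randn(16, 128) for _ in range(32)]

accum_steps = 8                                   # emulate a 128-sample batch
opt.zero_grad()
for i, x in enumerate(micro_batches):
    z1 = F.normalize(model(x), dim=-1)
    z2 = F.normalize(model(x + 0.01 * torch.randn_like(x)), dim=-1)
    # caveat: negatives only span the micro-batch, so plain accumulation
    # shrinks the effective negative pool; batchnorm statistics (if any)
    # are likewise computed per micro-batch, not per full batch
    logits = z1 @ z2.T / 0.07
    loss = F.cross_entropy(logits, torch.arange(len(x)))
    (loss / accum_steps).backward()               # gradients accumulate
    if (i + 1) % accum_steps == 0:
        opt.step()
        opt.zero_grad()
```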

Diffuser Denoising Steps Illuminate Image Quality: Explorations into diffusers revealed that image quality fluctuates with changed denoising step counts. The ancestral sampler's role in quality variance was elaborated, and guidance for distributed multi-GPU inference was provided, particularly for handling significant memory requirements of models like MultiControlnet (SDXL).
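
A minimal sketch of the step-count experiment with diffusers, assuming the runwayml/stable-diffusion-v1-5 checkpoint and a CUDA GPU; with ancestral samplers, fresh noise is injected each step, so images differ across step counts rather than merely refining:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a watercolor fox in a misty forest"
for steps in (10, 25, 50):
    gen = torch.Generator("cuda").manual_seed(0)  # same seed isolates step count
    image = pipe(prompt, num_inference_steps=steps, generator=gen).images[0]
    image.save(f"fox_{steps}_steps.png")
```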


OpenRouter (Alex Atallah) Discord


CUDA MODE Discord

Meta Morphs to Mega Sponsor: Meta reinforced its commitment to AI research with a massive sponsorship of 4.2 million GPU hours for scaling laws research on Language Model (LM) knowledge capacity; at 8,760 hours per year, that is roughly 479 years, nearly half a millennium, of single-GPU compute. The full details can be found in the scaling laws study.

CUDA Takes Center Stage in LLM Training: A collaborative effort has been initiated to form a working group around CUDA-related projects, and enthusiasm for implementing algorithms in CUDA is growing, as seen with discussions on porting GPT-2 to CUDA in the llm.c repository.

Optimizing Matrix Multiplication: Performance gains in matrix multiplication are realized when matrix shapes respect memory alignment: a configuration such as A: M=2047, K=N=2048 leaves the leading dimension unaligned and runs slower than the fully aligned M=K=N=2048, as elaborated in the blog post titled "What Shapes Do Matrix Multiplications Like?".
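
A quick way to reproduce the shape effect is to time the two shapes directly (a minimal PyTorch sketch; absolute numbers vary by GPU, dtype, and library version):

```python
import time
import torch

def bench_matmul(M, K, N, iters=50):
    a = torch.randn(M, K, device="cuda", dtype=torch.float16)
    b = torch.randn(K, N, device="cuda", dtype=torch.float16)
    a @ b                                   # warmup: kernel selection
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        a @ b
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters

print(bench_matmul(2047, 2048, 2048))  # odd leading dim: unaligned accesses
print(bench_matmul(2048, 2048, 2048))  # fully aligned, typically faster
```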

Quantization Quandaries in AI Models: The community engaged in vigorous discussions around the implementation of Half-Quadratic Quantization (HQQ) and the Marlin kernel's modest performance for matrix multiplication. Concerns were raised about quantization techniques affecting model perplexity, with HQQLinear's tuning under scrutiny and comparisons being drawn against GPTQ results.
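
As background for the perplexity concern, a generic round-to-nearest affine quantizer makes the error source visible. This is a plain per-tensor sketch, not HQQ itself, which instead optimizes per-group scales and zero-points with a half-quadratic solver:

```python
import torch

def quantize_affine(w, bits=4):
    qmax = 2 ** bits - 1
    scale = (w.max() - w.min()) / qmax
    zero = torch.round(-w.min() / scale)
    q = torch.clamp(torch.round(w / scale + zero), 0, qmax)
    return q, scale, zero

def dequantize(q, scale, zero):
    return (q - zero) * scale

w = torch.randn(4096, 4096)
q, s, z = quantize_affine(w, bits=4)
err = (w - dequantize(q, s, z)).abs().mean()
print(f"mean abs weight error at 4-bit: {err:.5f}")  # drives perplexity loss
```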

Flash Attention and CUDA Expertise: Code for 'flash' versions of CUDA kernels underperformed initially but later experienced speed-ups through collaborative troubleshooting efforts to optimize execution. Meanwhile, the llm.c project emerged as a prime learning resource for those eager to strengthen their CUDA skills, with discussions touching on the utility of OpenMP and debugging of custom CUDA for performance gains.


LangChain AI Discord

Whisper’s Not Speaking, It's Listening: Whisper is clarified to be a speech-to-text model and is not inherently supported by Ollama, yet can be utilized locally or with alternate backends from the same developer.
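
For reference, running Whisper locally is a few lines with the open-source openai-whisper package (requires ffmpeg on the system path; model sizes range from tiny to large):

```python
import whisper  # pip install openai-whisper

model = whisper.load_model("base")        # downloads weights on first use
result = model.transcribe("meeting.mp3")  # any ffmpeg-readable audio file
print(result["text"])
```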

LangChain’s Limitations and Applications: LangChain may not offer significant benefits over OpenAI's API for simple AI assistant tasks but shines in scenarios requiring integrations beyond OpenAI's scope, with practical use cases like RAG performance evaluations.

TinderGPT Swipes Right on Automation: A new app, TinderGPT, has been created to automate Tinder conversations and secure dates, inviting contributions on its GitHub.

Comparing LLMs via Structured Output: An analysis was shared comparing structured output performance across a variety of large language models, both open and closed source, detailed on this GitHub page.

AI on the Fashion Frontline: A video demonstrating an AI agent that can simulate virtual clothing trials was shared, aiming to revolutionize the e-commerce space for fashion – catch the demo here.


LlamaIndex Discord


LAION Discord

Pixart Sigma's Speedy Rendering Meets Quality Quirks: Pixart Sigma demonstrated impressive prompt execution times of 8.26 seconds on a 3090 but faced criticism for "mangled" output images, hinting at issues with open models' quality control.

Mistral's Might Multiplying: The release of Mixtral 8x22B sparked excitement, with community interest in how its capabilities compare to mistral-large. A magnet link for downloading mixtral-8x22b was shared without further description.

Questioning the Echo Chamber in AI: A recent paper challenges the expected "zero-shot" generalization in multimodal models like CLIP and highlights the dependence of performance on data seen during pretraining.

Google's Griffin Grabs Attention: Google's introduction of the Griffin model architecture drew significant attention, reportedly outperforming transformer baselines with faster inference and lower memory use on long contexts, according to a Reddit discussion.

Direct Nash Optimization Outperforms RLHF: A new study proposes a sophisticated alternative to Reinforcement Learning from Human Feedback (RLHF) for large language models, employing "pair-wise" optimization and purportedly achieving notable results even with a 7-billion-parameter model.


OpenInterpreter Discord


Interconnects (Nathan Lambert) Discord

Google's RL Surprise: Google rolled out Griffin, a 2-billion-parameter recurrent linear attention model, released alongside its CodeGemma models. The Griffin architecture draws parallels with RWKV, as detailed in the research paper on arXiv.

Rethinking RLHF Efficacy: A new discussion focused on improving large language models post-training with iterative feedback, potentially rivaling traditional RLHF methods. Concern was raised regarding the effectiveness of Rejection Sampling and the emphasis on benchmarks during model optimization, reflecting a desire for more practical development approaches found in a recent paper.

The Forecast for LLMs: Revealing 12 scaling laws for LLMs, a new study backed by Meta dedicates 4,200,000 GPU hours to unpacking knowledge capacity. Intriguingly, int8 quantization maintains knowledge capacity effectively, a pivotal finding for both resource efficiency and the potential application of Mixture of Experts (MoE) models.

Buzz Around Mixtral: Mixtral, a fresh player in the model scene, stirs conversations with its differentiation from Mistral and Miqu. A surge in model releases, including anticipation for the likes of llama 3 smol and Cohere, suggests a competitive acceleration in AI development, as discussed in a Twitter thread here.

Benchmarks: A Temporary Yardstick: While there's consensus that optimizing for benchmarks such as alpacaeval may not correlate with true model superiority, they retain utility as an interim indicator of progress. Developers are advocating for post-equilibrium approaches with a focus on improving data and scaling rather than chasing scores.


tinygrad (George Hotz) Discord


OpenAccess AI Collective (axolotl) Discord

Mixtral 8x22B Raises Eyebrows: The community engaged in discussions on the new Mixtral 8x22B model, which has around 140 billion parameters and shows an unexpectedly low loss when finetuned at rank 32, though it's unclear yet whether the model is instruction-tuned or a base model. There was keen interest in quantization techniques to make larger models like Mixtral 8x22B manageable for developers, indicating a need to balance model size against resource constraints.

PiSSA Promises Precise Performance: A novel LoRA layer initialization technique known as PiSSA, which uses the SVD of the original weight matrix, has been shared for potential better fine-tuning outcomes, detailed in an arXiv abstract and a GitHub repository.
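
The core of PiSSA is small enough to sketch: initialize the LoRA factors from the top-r singular triplets of the pretrained weight and train only those, freezing the residual. This is a minimal illustration of the idea from the abstract, not the reference implementation:

```python
import torch

def pissa_init(W: torch.Tensor, r: int):
    """Split W into a trainable top-r part (A @ B) and a frozen residual."""
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    root = torch.sqrt(S[:r])
    A = U[:, :r] * root              # (out_features, r)
    B = root.unsqueeze(1) * Vh[:r]   # (r, in_features)
    W_residual = W - A @ B           # frozen during finetuning
    return A, B, W_residual

A, B, W_res = pissa_init(torch.randn(1024, 1024), r=16)
# the layer's forward pass then uses W_res + A @ B in place of W,
# so training starts from the directions that matter most in W
```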

Dataset Dilemma and Dedication: Members are actively seeking and sharing datasets, like the Agent-FLAN dataset, useful for function-calling and JSON parsing, to tune large language models effectively. Another member discussed pre-training a model with a Norwegian arts dataset to enhance its grammar capabilities and received advice on the representation format of the data.

Model Hosting Hurdle: A contributor quickly responded to the new Mixtral-8x22B model by uploading it to Hugging Face, demonstrating the community's rapid contribution culture. Meanwhile, questions about hardware capability for the mixtral-qlora-fsdp model on a dual 24GB GPU setup and the search for a web self-hostable frontend compatible with various AI APIs remained unanswered.

Samsung Sets the Stage: Samsung announced the Samsung Next 2024 Generative AI Hackathon for May 11th in New York, which will explore tracks in Health & Wellness and Mediatech, detailed at Samsung Next AI Hackathon.


Modular (Mojo 🔥) Discord

Cpp Oldies But Goodies in Mojo Land: While Mojo developers are on the lookout for Python-style f-strings, they're currently making do with C-style formatting by importing _printf as printf, but with a heads-up that this feature might not stick around forever.

Mojo API Guide Just a Click Away: A member shared a Notion page translating API documentation into beginner-friendly summaries, giving new Mojo users a leg up.

Mojo's Concurrency Conundrums: Mojo's async/await and coroutines implementation is ongoing, differing from Python's; details are clarified in the Mojo docs, but async for and async with are missing as per the roadmap.

Vexing Variadic Generics: A burst of community bewilderment was sparked by the mention of "Heterogeneous variadic generics," a term that encapsulates the complexity of advanced type systems in programming languages.
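
For the curious, Python grew an analogous feature in 3.11 via PEP 646; a minimal illustration of a heterogeneous variadic generic (Mojo's design differs, so this is only an analogy):

```python
from typing import Generic, TypeVarTuple, Unpack  # Python 3.11+

Ts = TypeVarTuple("Ts")

class Pipeline(Generic[Unpack[Ts]]):
    """Typed over an arbitrary-length, mixed tuple of stage types."""
    def __init__(self, *stages: Unpack[Ts]) -> None:
        self.stages = stages

p: Pipeline[int, str, float] = Pipeline(1, "tokenize", 0.5)
```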

Mojo UI Quest for a Native Look: Active development on the Mojo-UI project ignites discussion on integration with Objective-C and accessing the AppKit framework. Ambitious integration aims may require a special binding layer, as followed on GitHub.


DiscoResearch Discord


LLM Perf Enthusiasts AI Discord


Datasette - LLM (@SimonW) Discord


Skunkworks AI Discord


Mozilla AI Discord


Alignment Lab AI Discord


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

Stability.ai (Stable Diffusion) ▷ #general-chat (985 messages🔥🔥🔥):

Links mentioned:


LM Studio ▷ #💬-general (228 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (223 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🧠-feedback (4 messages):


LM Studio ▷ #🎛-hardware-discussion (85 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🧪-beta-releases-chat (68 messages🔥🔥):

Links mentioned:


LM Studio ▷ #autogen (5 messages):


LM Studio ▷ #amd-rocm-tech-preview (23 messages🔥):

Links mentioned:


LM Studio ▷ #crew-ai (3 messages):


LM Studio ▷ #model-announcements (1 messages):

Link mentioned: lmstudio-community (LM Studio Community): no description found


Unsloth AI (Daniel Han) ▷ #general (411 messages🔥🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #announcements (1 messages):

Link mentioned: Unsloth - 4x longer context windows & 1.7x larger batch sizes: Unsloth now supports finetuning of LLMs with very long context windows, up to 228K (Hugging Face + Flash Attention 2 does 58K so 4x longer) on H100 and 56K (HF + FA2 does 14K) on RTX 4090. We managed...


Unsloth AI (Daniel Han) ▷ #random (9 messages🔥):


Unsloth AI (Daniel Han) ▷ #help (144 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (12 messages🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #suggestions (43 messages🔥):

Links mentioned:


Perplexity AI ▷ #general (551 messages🔥🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (14 messages🔥):


Perplexity AI ▷ #pplx-api (15 messages🔥):


Nous Research AI ▷ #off-topic (1 messages):

pradeep1148: https://www.youtube.com/watch?v=Gb--4supXoo


Nous Research AI ▷ #interesting-links (14 messages🔥):

Links mentioned:


Nous Research AI ▷ #general (308 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (50 messages🔥):

Link mentioned: Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data: The proliferation of generative models, combined with pretraining on web-scale data, raises a timely question: what happens when these models are trained on their own generated outputs? Recent investi...


Nous Research AI ▷ #bittensor-finetune-subnet (1 messages):

4biddden: Is there a runpod template available for the bittensor fine-tune?


Nous Research AI ▷ #world-sim (93 messages🔥🔥):

Links mentioned:


Eleuther ▷ #announcements (1 messages):

Links mentioned:


Eleuther ▷ #general (250 messages🔥🔥):

Links mentioned:


Eleuther ▷ #research (203 messages🔥🔥):

Links mentioned:


Eleuther ▷ #scaling-laws (4 messages):

Link mentioned: Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws: Scaling laws describe the relationship between the size of language models and their capabilities. Unlike prior studies that evaluate a model's capability via loss or benchmarks, we estimate the n...


Eleuther ▷ #interpretability-general (1 messages):

norabelrose: https://arxiv.org/abs/2404.05971


Eleuther ▷ #lm-thunderdome (8 messages🔥):

Links mentioned:


OpenAI ▷ #ai-discussions (68 messages🔥🔥):

Link mentioned: OpenAI Status: no description found


OpenAI ▷ #gpt-4-discussions (33 messages🔥):

- **Domain Verification Troubles**: A user encountered an error when trying to publish a GPT, asking for suggestions on how to verify a domain even after setting up the TXT records.
- **GPT to SaaS Transformation Inquiry**: One member sought advice on services available to convert a GPT into a single-purpose SaaS application, aiming to create a proof of concept for future endeavors.
- **Technical Difficulties with GPT**: Several members reported issues ranging from an inability to load GPTs and mentions not functioning to suspended API access due to billing problems despite sufficient funds.
- **Chatbot Outage Reports**: Users faced outages with GPT, signaling errors such as "GPT inaccessible or not found" and having trouble retrieving existing conversations.
- **Service Status Updates and Confirmation**: A link to [OpenAI's service status page](https://status.openai.com/) was shared, confirming the ongoing investigation into elevated errors and intermittent outages affecting ChatGPT services.

Link mentioned: OpenAI Status: no description found


OpenAI ▷ #prompt-engineering (179 messages🔥🔥):


OpenAI ▷ #api-discussions (179 messages🔥🔥):


Latent Space ▷ #ai-general-chat (141 messages🔥🔥):

Links mentioned:


Latent Space ▷ #ai-announcements (3 messages):

Links mentioned:


Latent Space ▷ #llm-paper-club-west (268 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #announcements (4 messages):

Links mentioned:


HuggingFace ▷ #general (105 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #today-im-learning (2 messages):

Links mentioned:


HuggingFace ▷ #cool-finds (7 messages):

Links mentioned:


HuggingFace ▷ #i-made-this (12 messages🔥):

Links mentioned:


HuggingFace ▷ #reading-group (8 messages🔥):


HuggingFace ▷ #computer-vision (15 messages🔥):

Link mentioned: GitHub - Firdavs-coder/Aladdin-Persson-AI-Watermark-Destroy: Aladdin-Persson-AI-Watermark-Destroy Public: Aladdin-Persson-AI-Watermark-Destroy Public. Contribute to Firdavs-coder/Aladdin-Persson-AI-Watermark-Destroy development by creating an account on GitHub.


HuggingFace ▷ #NLP (7 messages):


HuggingFace ▷ #diffusion-discussions (18 messages🔥):

Links mentioned:


OpenRouter (Alex Atallah) ▷ #announcements (4 messages):

Links mentioned:


OpenRouter (Alex Atallah) ▷ #app-showcase (1 messages):

stonedjesusape: Fuck


OpenRouter (Alex Atallah) ▷ #general (166 messages🔥🔥):

Links mentioned:


CUDA MODE ▷ #general (5 messages):

Links mentioned:


CUDA MODE ▷ #cuda (1 messages):


CUDA MODE ▷ #torch (1 messages):

Link mentioned: Answer Key: What Shapes Do Matrix Multiplications Like?: Companion to https://www.thonking.ai/p/what-shapes-do-matrix-multiplications


CUDA MODE ▷ #beginner (1 messages):


CUDA MODE ▷ #ring-attention (7 messages):

Links mentioned:


CUDA MODE ▷ #off-topic (4 messages):


CUDA MODE ▷ #triton-puzzles (5 messages):

Link mentioned: minor on puzzle 11 by ZhaoyueCheng · Pull Request #10 · srush/Triton-Puzzles: fix formula on puzzle 11 to sum over dimension L add B_MID on puzzle 11 for the parameter on block size to loop over the MID dimension


CUDA MODE ▷ #hqq (96 messages🔥🔥):

Links mentioned:


CUDA MODE ▷ #triton-viz (1 messages):

kerenzhou: I like the corresponding code on the figure


CUDA MODE ▷ #llmdotc (42 messages🔥):

Links mentioned:


LangChain AI ▷ #general (108 messages🔥🔥):

Links mentioned:


LangChain AI ▷ #langchain-templates (1 messages):

lhc1921: https://python.langchain.com/docs/integrations/llms/azure_openai/


LangChain AI ▷ #share-your-work (3 messages):

Links mentioned:


LangChain AI ▷ #tutorials (3 messages):

Link mentioned: Future of E-commerce?! Virtual clothing try-on agent: I built an agent system which will autonomously iterate & generate img of AI model wearing certain cloth and produce millions+ social posts. Free access to run...


LlamaIndex ▷ #blog (4 messages):


LlamaIndex ▷ #general (104 messages🔥🔥):

Links mentioned:


LlamaIndex ▷ #ai-discussion (2 messages):


LAION ▷ #general (87 messages🔥🔥):

Links mentioned:


LAION ▷ #research (9 messages🔥):

Links mentioned:


OpenInterpreter ▷ #general (51 messages🔥):


OpenInterpreter ▷ #O1 (38 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (45 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-questions (14 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #random (7 messages):


Interconnects (Nathan Lambert) ▷ #memes (2 messages):


Interconnects (Nathan Lambert) ▷ #rlhf (10 messages🔥):

Link mentioned: Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences: This paper studies post-training large language models (LLMs) using preference feedback from a powerful oracle to help a model iteratively improve over itself. The typical approach for post-training L...


Interconnects (Nathan Lambert) ▷ #reads (3 messages):

Links mentioned:


tinygrad (George Hotz) ▷ #general (52 messages🔥):

Links mentioned:


tinygrad (George Hotz) ▷ #learn-tinygrad (13 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general (40 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (4 messages):

Link mentioned: Tweet from Charles Foster (@CFGeek): YES! If you initialize a LoRA layer based on the SVD of the original weight matrix (with its top singular values & vectors), you get significantly better fine-tuning results. This is a straight-up fr...


OpenAccess AI Collective (axolotl) ▷ #general-help (9 messages🔥):


OpenAccess AI Collective (axolotl) ▷ #datasets (2 messages):

Link mentioned: internlm/Agent-FLAN · Datasets at Hugging Face: no description found


Modular (Mojo 🔥) ▷ #general (8 messages🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #💬︱twitter (2 messages):


Modular (Mojo 🔥) ▷ #🔥mojo (32 messages🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #community-projects (4 messages):

Links mentioned:


Modular (Mojo 🔥) ▷ #community-blogs-vids (1 messages):


Modular (Mojo 🔥) ▷ #performance-and-benchmarks (2 messages):


Modular (Mojo 🔥) ▷ #📰︱newsletter (1 messages):

Zapier: Modverse Weekly - Issue 29 https://www.modular.com/newsletters/modverse-weekly-29


Modular (Mojo 🔥) ▷ #nightly (2 messages):


DiscoResearch ▷ #mixtral_implementation (5 messages):

Links mentioned:


DiscoResearch ▷ #general (18 messages🔥):

Links mentioned:


DiscoResearch ▷ #discolm_german (24 messages🔥):

Links mentioned:


LLM Perf Enthusiasts AI ▷ #gpt4 (16 messages🔥):

Link mentioned: Google Colaboratory: no description found


Datasette - LLM (@SimonW) ▷ #llm (15 messages🔥):


Skunkworks AI ▷ #general (3 messages):

Link mentioned: Tweet from Jan P. Harries (@jphme): @MistralAI first AGIEval results look great 👇 - thanks for releasing this beast, guys! 👏 https://x.com/jphme/status/1778028110954295486 ↘️ Quoting Jan P. Harries (@jphme) First AGIEval results fo...


Skunkworks AI ▷ #off-topic (1 messages):

pradeep1148: https://www.youtube.com/watch?v=Gb--4supXoo


Mozilla AI ▷ #llamafile (4 messages):

Link mentioned: ollama/llm/server.go at c5c451ca3bde83e75a2a98ed9fd4e63a56bb02a9 · ollama/ollama: Get up and running with Llama 2, Mistral, Gemma, and other large language models. - ollama/ollama


Alignment Lab AI ▷ #general-chat (2 messages):

Link mentioned: SynthTrails: no description found