As is their established pattern, Mistral followed up their magnet link with a blogpost, and an instruct-tuned version of their 8x22B model:
The image ended up sparking some friendly competition between Databricks, Google, and AI21, all of which merely emphasized that Mixtral created a new tradeoff between active params and MMLU performance:
Of course, what goes unsaid is that active param count doesn't map linearly to cost the way it does for dense models, and that a singular focus on MMLU isn't ideal for less scrupulous competitors.
Table of Contents
[TOC]
AI Reddit Recap
Across r/LocalLLaMA, r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, and /r/Singularity. Comment crawling works now but has lots to improve!
AI Investments & Advancements
- Massive AI investments from tech giants: In /r/singularity, DeepMind CEO reveals Google plans to invest over $100 billion in AI, with other tech giants like Microsoft, Intel, SoftBank, and an Abu Dhabi fund making similarly huge bets, indicating high confidence in AI's potential.
- UK criminalizes non-consensual deepfake porn: The UK has made it a crime to create sexually explicit deepfake images without consent. In /r/technology, commenters debate the implications and enforcement challenges.
- Nvidia's AI chip dominance: In /r/hardware, a former Nvidia employee claims on Twitter that no one will catch up to Nvidia's AI chip lead this decade, sparking discussion about the company's strong position.
AI Assistants & Applications
- Potential billion-dollar market for AI companions: In /r/singularity, a tech executive predicts AI girlfriends could become a $1 billion business. Commenters suggest this is a vast underestimate and discuss the societal implications.
- Unlimited context length for language models: A tweet posted in /r/artificial announces unlimited context length, a significant advancement for AI language models.
- AI surpassing humans on basic tasks: In /r/artificial, a Nature article reports that AI has surpassed human performance on several basic tasks, though still trails on more complex ones.
AI Models & Architectures
- Zamba: Novel 7B parameter hybrid architecture: In /r/LocalLLaMA, Zyphra unveils Zamba, a 7B parameter hybrid architecture combining Mamba blocks with shared attention. It outperforms models like LLaMA-2 7B and OLMo-7B despite less training data. The model was developed by a team of 7 using 128 H100 GPUs over 30 days.
AI Twitter Recap
all recaps done by Claude 3 Opus, best of 4 runs. We are working on clustering and flow engineering with Haiku.
Mixtral 8x22B Instruct Model Release
- Impressive Performance: @GuillaumeLample announced the release of Mixtral 8x22B Instruct, which significantly outperforms existing open models using only 39B active parameters during inference, making it faster than 70B models.
- Multilingual Capabilities: @osanseviero highlighted that Mixtral 8x22B is fluent in 5 languages (English, French, Italian, German, Spanish), has math and code capabilities, and a 64k context window.
- Availability: The model is available on the @huggingface Hub under an Apache 2.0 license and can be downloaded and run locally, as confirmed by @_philschmid.
RAG (Retrieval-Augmented Generation) Advancements
- GroundX for Improved Accuracy: @svpino shared that @eyelevelai released GroundX, an advanced RAG API. In tests on 1,000 pages of tax documents, GroundX achieved 98% accuracy compared to 64% for LangChain and 45% for LlamaIndex.
- Importance of Assessing Risks: @omarsar0 emphasized the need to assess risks when using LLMs with contextual information that may contain supporting, contradicting, or incorrect data, based on a paper on RAG model faithfulness.
- LangChain RAG Tutorials: @LangChainAI released a playlist explaining RAG fundamentals and advanced methods on @freeCodeCamp. They also shared a @llama_index tutorial on using Mixtral 8x22B for RAG.
Snowflake Arctic Embed Models
- Powerful Embedding Models: @SnowflakeDB open-sourced their Arctic family of embedding models on @huggingface, which are the result of @Neeva's search expertise and Snowflake's AI commitment, as noted by @RamaswmySridhar.
- Efficiency and Performance: @rohanpaul_ai highlighted the efficiency of these models, with parameter counts from 23M to 335M, sequence lengths from 512 to 8192, and support for up to 2048 tokens without RPE or 8192 with RPE.
- LangChain Integration: @LangChainAI announced same-day support for using Snowflake Arctic Embed models with their @huggingface Embeddings connector.
Misc
- CodeQwen1.5 Release: @huybery introduced CodeQwen1.5-7B and CodeQwen1.5-7B-Chat, specialized code LLMs pretrained on 3T tokens of code data. They exhibit exceptional code generation, long-context modeling (64K), code editing, and SQL capabilities, surpassing ChatGPT-3.5 on SWE-Bench.
- Boston Dynamics' New Robot: @DrJimFan shared a video of Boston Dynamics' new robot, arguing that humanoid robots will exceed iPhone supply in the next decade and that "human-level" is just an artificial ceiling.
- Superhuman AI from Day One: @ylecun stated that AI assistants need human-like intelligence plus superhuman abilities from the start, requiring understanding of the physical world, persistent memory, reasoning and hierarchical planning.
AI Discord Recap
A summary of Summaries of Summaries
Stable Diffusion 3 and Stable Diffusion 3 Turbo Launches:
- Stability AI introduced Stable Diffusion 3 and its faster variant Stable Diffusion 3 Turbo, claiming superior performance over DALL-E 3 and Midjourney v6. The models use the new Multimodal Diffusion Transformer (MMDiT) architecture.
- Plans to release SD3 weights for self-hosting with a Stability AI Membership, continuing their open generative AI approach.
- Community awaits licensing clarification on personal vs commercial use of SD3.
Unsloth AI Developments:
- Discussions on GPT-4 as a fine-tuned iteration over GPT-3.5, and the impressive multilingual capabilities of Mistral7B.
- Excitement around the open-source release of Mixtral 8x22B under Apache 2.0, with strengths in multilingual fluency and long context windows.
- Interest in contributing to Unsloth AI's documentation and considering donations to support its development.
WizardLM-2 Unveiling and Subsequent Takedown:
- Microsoft announced the WizardLM-2 family, including 8x22B, 70B, and 7B models, demonstrating competitive performance.
- However, WizardLM-2 was unpublished due to lack of compliance review, not toxicity concerns as initially speculated.
- Confusion and discussions around the takedown, with some users expressing interest in obtaining the original version.
- Stable Diffusion 3 Launches with Improved Performance: Stability AI has released Stable Diffusion 3 and Stable Diffusion 3 Turbo, now available on their Developer Platform API, boasting the fastest and most reliable performance. The community awaits clarification on the Stability AI Membership model for self-hosting SD3 weights. Meanwhile, SDXL finetunes have made SDXL refiners nearly obsolete, and users discuss model merging challenges in ComfyUI and limitations of the diffusers pipeline.
- WizardLM-2 Debuts Amidst Excitement and Uncertainty: The release of WizardLM-2 models by Microsoft has sparked enthusiasm for their potential GPT-4-like capabilities in an open-source format. However, the sudden takedown of the models due to a missed compliance review has led to confusion and speculation. Users compare the performance of WizardLM-2 variants and share tips for resolving compatibility issues in LM Studio.
- Multimodal Models Advance with Idefics2 and Reka Core: Hugging Face's Idefics2 8B and Reka Core have emerged as powerful multimodal language models, showcasing impressive capabilities in visual question answering, document retrieval, and coding. The upcoming chat-focused variant of Idefics2 and Reka Core's competitive performance against industry giants have generated significant interest. Discussions also revolve around the cost-efficiency of models like JetMoE-8B and the launch of Snowflake's Arctic embed family for text-embedding.
Other notable topics include:
- The introduction of ALERT, a safety benchmark for assessing Large Language Models, and debates around AI safety standards.
- Explorations of Retrieval Augmented Generation (RAG) for vision-based applications and the philosophical implications of AI simulations in World-Sim.
- The rise of AI-human collaboration platforms like Payman AI and the integration of AI inference in Supabaseās edge functions.
- Challenges to the Chinchilla scaling laws and discussions on the expressive power of state-space models in the research community.
- Advancements in PEFT methods like Dora and RSLoRA, and the pursuit of multilingual model expansion using Mixture-of-Experts (MoE) approaches.
PART 1: High level Discord summaries
Stability.ai (Stable Diffusion) Discord
Stable Diffusion 3 Turbo Charges the Scene: Stability AI has introduced Stable Diffusion 3 and Stable Diffusion 3 Turbo, now available on their Developer Platform API, with claims of the fastest and most reliable performance, supported by Fireworks AI. Interested parties can get started with SD3 at Stable Diffusion 3 & Developer API, and an open generative AI approach is promised with plans for the model weights to be available for self-hosting for members.
Refining Visually Intuitive Generative AI: The SDXL finetunes have made the use of SDXL refiners nearly obsolete, as they are now prevalent in Civitai downloads, suggesting a trend towards integrated finetunes over separate refiner modules, reflecting a community-driven optimization.
Model Merging Explored: There is lively discussion of model-merging tactics within ComfyUI, grappling with complex mechanisms such as V-prediction and epsilon; the community is experimenting with these methods to achieve enhanced outcomes, while acknowledging that correct implementations are crucial to prevent unpredictable results.
Navigating Diffusers Library Limitations: A conversation emerged around the limitations and dependencies in the diffusers pipeline, with a focus on Stable Video Diffusion Pipeline challenges. Despite these challenges, some users are optimizing usage by running models independently post-download, bypassing certain Hugging Face library constraints.
Awaiting SD3ās Membership Model Details: The community is keenly waiting for Stability AI to provide clarifications on Stable Diffusion 3 licensing for personal versus commercial use, especially in light of the new membership model revealed for accessing self-hosted weights.
Unsloth AI (Daniel Han) Discord
GPT-4 Gains Over GPT-3.5: The new iteration of GPT, GPT-4, is regarded as a fine-tuned enhancement over GPT-3.5, though specifics on performance metrics or features were not provided.
Mistral7B Shines in Multilingualism: Members conferred about the multilingual capabilities of the Mistral7B model, recommending the inclusion of diverse language data in training sets, particularly French, to improve performance.
Unsloth AI Gets Help from Fans: There's a tangibly positive response from the community towards Unsloth AI, with users keen to help with documentation and expansion, and even considering donations. The Mixtral 8x22B model's release under Apache 2.0 was met with excitement for its promise of multilingual fluency and handling of extensive context windows.
Chroma Goes Go: The Chroma project leaps forward with an edge version written in Go, which utilizes SQLite and WASM for browser-based applications, now available on GitHub.
Mobile AI Deployment Discussed: The complexity of deploying AI models on mobile devices surfaced, noting challenges such as the absence of CUDA and the infeasibility of running standard Deep Learning Python codes on such platforms.
LM Studio Discord
AI Assistance for NeoScript Programming: A user looking for help with NeoScript programming expressed challenges in configuring AI models. Microsoft's new release, WaveCoder Ultra 6.7b, excels in code translation and could be a strong candidate for this task.
Solving AI's Echo Chamber: To combat repetitive AI responses, particularly in Dolphin 2 Mistral, members discussed strategies such as fine-tuning models and leveraging multi-turn conversation frameworks outlined in Azure's article.
Introducing the WizardLM-2 League: The debut of WizardLM-2 models sparked discussions about performance. Compatibility with existing tools, including the importance of using GGUF quants and version 0.2.19 or newer for proper functionality, was emphasized.
Tech Wizards at Play: One user successfully enabled direct communication between four 3090 GPUs, improving model performance by bypassing CPU/RAM. There was also chatter about the challenges of signing Windows executables, with a hint that the Windows versions are indeed signed with an Authenticode cert.
Quantization Conundrum and Model Preferences: Mixed reviews on quantization levels, from Q8 to Q6K, pointed to a preference for models with higher quantization levels when VRAM is sufficient. For large models, such as WizardLM-2-8x22B, GPUs like the 4090 with 24GB VRAM may be inadequate.
Nous Research AI Discord
- Multimodal Models Stepping Up: Exciting advancements in multimodal language models are showcased, with Hugging Face's Idefics2 8B and Reka Core emerging as key players, evident from the Open Multimodal ChatGPT video and Reka Core overview. The GPT4v/Geminipro Vision and Claude Sonnet models are recommended for vision-RAG applications.
- LLMs Tuning into Self-Optimization: New techniques for enhancing Instruct Model LLMs look promising, with models able to select the best solution by reconstructing inputs from outputs, detailed in a Google Slideshow on aligning LLMs for medical reasoning.
- WizardLM Disappearance Sparks Debate: There's uncertainty around WizardLM's sudden takedown; while some speculated on toxicity issues, confirmed reports attributed it to lack of a compliance review, as shared in a comprehensive WizardLM information bundle.
- LLMs Performance: A Roller Coaster of Expectations: Engineers discuss CodeQwen1.5-7B Chat's impressive benchmarking and debate architectures and tuning's impact on performance. Furthermore, upcoming models like Hermes 8x22B are eagerly awaited, with concerns about whether they can be accommodated by personal equipment setups.
- World-Sim's Return Triggers AI Philosophical Debates: As World-Sim gears up for a return, enthusiasts burst with anticipation, pondering the philosophical aspects and implications of such simulated worlds. Official confirmation sent excitement soaring, with a Websim link provided for those eager to jump in.
Perplexity AI Discord
Robots Debating Their Roots: Engineers exchanged insights on the performance nuances of AI models including GPT-4 and Claude 3 Opus, with a shared sentiment that GPT-4 may exhibit "lazy" tendencies in real-world applications. Mixtral's open-source 8x22B model is highlighted for its impressive capabilities, sparking debates on model efficacy.
Stumped by Stubborn Software Issues: A conversation was noted about achieving consistency between the web client and the API, with specific attention to parameters like temperature settings. Engineers are also discussing the benefits of including a rate limit counter in the API response for better management and transparency.
The Vanishing Messages Mystery: Concern was voiced over changes in the Perplexity API's payment method management, particularly the opacity surrounding the remaining message counts for pro users. This focus on transparency indicates professionals need clarity to manage resources efficiently.
A Tale of Truncated Tokens: Technical dialogue included challenges faced when engaging models with large context sizes, like a 42k token prompt, and the tendency for models to summarize rather than dive deep into lengthy documents. This could be pivotal as engineers optimize models to process complex prompts fully.
The Search for Smarter Searches: Members also discussed using `site:URL` search operators for more targeted information retrieval. Additionally, there is a call for better communication regarding rate limits in the API, including the possibility of a `429` response.
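The rate-limit handling mentioned above can be sketched client-side. This is a generic exponential-backoff helper for retrying after an HTTP `429`, not documented Perplexity behavior; the retry count and delay values are illustrative assumptions:

```python
import random

def backoff_delays(max_retries=5, base=1.0, cap=30.0, jitter=False):
    """Exponential backoff schedule for retrying a request after an
    HTTP 429 (Too Many Requests) response: base, 2*base, 4*base, ...
    capped at `cap` seconds, with optional jitter so many clients
    don't retry in lockstep."""
    delays = [min(cap, base * (2 ** i)) for i in range(max_retries)]
    if jitter:
        delays = [d * random.uniform(0.5, 1.0) for d in delays]
    return delays

print(backoff_delays())  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

A caller would sleep for each delay in turn until the request succeeds, honoring any `Retry-After` header the server sends instead when present.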
LAION Discord
- PyTorch's Abstraction Puzzle: Engineers are grappling with PyTorch's philosophy of abstracting complexities, which, while simplifying coding, often leaves them puzzled when troubleshooting unexpected results.
- Handling Hefty Datasets with Zarr: There's active exploration of using zarr to manage a hefty 150 GB MRI dataset, with discussions circling around its efficiency and whether it will overload RAM with large data loads.
- Legal Lines Drawn for Deepfakes in the UK: Members are discussing the implications of UK legislation targeting the creation of distressing images, questioning its enforceability given the difficulty of proving intent.
- AI Inference Fine-Tuning Talks: Voices from the community are calling for clarity on AI models' inference settings, like controlling CFG or integrating models with robust ODE solvers, beyond just defaulting to Euler's method.
- Cascade Team's Corporate Shuffle: There's speculation about the future of Stability AI's Cascade team after their departure and the dissolution of their Discord channel, with members wondering whether there's a link to a new venture, possibly Leonardo, or an ongoing affiliation with SAI.
- ALERT! A New Safety Benchmark for LLMs: The introduction of ALERT, a safety benchmark for assessing Large Language Models, has sparked interest, providing a Dataset of Problematic Outputs (DPO) for community evaluation, available on GitHub.
- AI Audio-Visual Harmony: An arXiv paper presents methods for generating audio from text, improving performance by zeroing in on concepts or events, stirring dialogue in the research community.
- AI Safe or Stifled?: The AI safety debate is heated, with some pushing back against confining AI strictly to PG content, arguing it could crimp its creative spark compared to other artistic mediums.
- GANs vs. Diffusion Models: Speed or Aesthetics?: Discussions are heating up over the advantages of GANs (notably their faster inference and smaller parameter count) versus diffusion models, even as GANs face criticism for image quality and training challenges.
OpenRouter (Alex Atallah) Discord
OpenRouter Welcomes WizardLM Raptors: OpenRouter announced the release of WizardLM-2 7B and a price drop for WizardLM-2 8x22B to $0.65/M tokens. The WizardLM-2 8x22B Nitro boasts over 100 transactions per second following its database restart.
Latency Labyrinth Resolved: Latency issues on various models such as Mistral 7B Instruct and Mixtral 8x7B Instruct were attributed to cloud provider DDoS protection, with updates concerning the resolution found in the associated discussion thread.
Calling All Frontend Mavericks: A member seeks web development assistance for an AI-based frontend project for OpenRouter, specifically emphasizing role-playing novel mode and conversation style systems. Ability to distinguish AI-generated text from user input is also requested.
AI Model Morality and Multilingual Mastery: Vigorous exchanges regarding both censorship protocols for NSFW content and the imperative for enhancing modelsā multilingual performance took place. Members looked forward to direct endpoints and new provider integrations for an anticipated AI model release.
Bitrate Bits and Quality Quibbles: Users showed a clear preference for a minimum of 5 bits per weight (bpw) for model quantization, noting that reductions below this threshold notably compromise quality. Discussions underscored the trade-offs between efficient operation and maintaining high fidelity in AI outputs.
Modular (Mojo š„) Discord
- Mojo to Python Conversion Now a Possibility: Engineers discussed the new package mojo2py, capable of converting Mojo code to Python, along with the desire for more learning resources, pointing to the Mojo programming manual for beginners.
- Maxim Zaks Debates the Mojo "Hype": A PyCon Lithuania talk titled "Is Mojo just a hype?" by Maxim Zaks was highlighted, provoking debate on the language's industry impact, available in a video.
- Mojo's Inherent Nightly Nuances: Users are navigating the challenges of a new nightly Mojo release, noting unconventional code styling for readability, desires for comprehensive tutorials on traits, and a recent pull request reflecting significant updates.
- Optimizing with Compile-Time Aliases: Discussion thrived around optimizing alias memory usage in Mojo, alongside a recommendation from a cited YouTube video to favor readable code over extensive commenting.
- Community Mojo Projects Surge: Community contributions soared, with a shared Mojo "sketch" found at this gist and a question about implementing the Canny edge detection algorithm in Mojo, coupled with directions to Mojo's documentation and tooling resources.
CUDA MODE Discord
PyTorch Resource Debate: While discussing whether "Deep Learning with PyTorch" is a relevant resource despite being 4 years old, members noted that the PyTorch core has remained stable, though significant updates have occurred in the compiler and distributed systems. A member shared a teaser for an upcoming edition of the book, which would include coverage of transformers and Large Language Models.
CUDA Custom GEMM Sparking Interest: The conversation involved improving GEMM performance in CUDA, with one member providing a new implementation that outperformed PyTorch's function on specific benchmarks, sharing their code on GitHub. However, another highlighted JIT compilation issues with `torch.compile`. The group also discussed optimal block size parameters, referencing a related code example on Gist.
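The locality idea behind those block-size discussions can be illustrated outside CUDA. Below is a toy blocked matrix multiply in pure Python (the tile size and matrices are illustrative, not taken from the linked code); on a GPU the same tiling keeps sub-blocks of A and B in shared memory while they are reused:

```python
def matmul_tiled(A, B, tile=2):
    """Toy blocked matrix multiply: C = A @ B computed over
    tile x tile sub-blocks, so each block of A and B is reused
    while it is 'hot' -- the same idea behind tuning CUDA block sizes."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                # Accumulate the contribution of one tile of A and B.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        s = 0.0
                        for p in range(p0, min(p0 + tile, k)):
                            s += A[i][p] * B[p][j]
                        C[i][j] += s
    return C
```

In CUDA the outer two loops map to the thread-block grid, and the tile size is the block-size parameter being benchmarked.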
Next-Gen Video Analysis & Robotics Gains Screenshare: Members shared links about Augmend's video processing features, which combine OCR and image segmentation, previewed on wip.augmend.us, with the full service to be hosted on augmend.com. Another highlight was Boston Dynamics' unveiling of a fully electric robot named Atlas intended for real-world applications, showcased in their All New Atlas | Boston Dynamics video.
Bridging the CUDA Toolkit Knowledge Gap: In the #beginner channel, members discussed issues related to using the CUDA toolkit on WSL, with one user facing problems running the ncu profiler. The community provided troubleshooting steps and stressed the importance of setting the correct CUDA path in environment variables. There was also an advisory that Windows 11 might be necessary for effective CUDA profiling on WSL 2, with one user providing a guide on the subject.
Quantization Dilemmas and Solutions in the Air: A thorough chat occurred on the topic of quantization axes in GPT models, with a highlight on the complexities of using `axis=0`. Participants suggested quantizing Q, K, and V separately, with references to Triton kernels and an autograd optimization method for boosting speed and performance. The debate continued with discussions of the practicality of 2/3-bit quantization, supplemented with implementation details and benchmarks on GitHub.
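The axis choice under discussion can be sketched in plain Python. This is a generic symmetric per-row (`axis=0`, i.e. per output channel) int8-style quantizer for illustration only, not the Triton implementation referenced above:

```python
def quantize_per_row(weights, n_bits=8):
    """Symmetric per-row quantization of a weight matrix: each row
    (output channel, axis=0) gets its own scale, which is what makes
    the choice of quantization axis matter -- a single outlier only
    degrades the row it lives in, not the whole tensor."""
    qmax = 2 ** (n_bits - 1) - 1  # e.g. 127 for 8 bits
    quantized, scales = [], []
    for row in weights:
        scale = max(abs(v) for v in row) / qmax or 1.0  # avoid div-by-zero
        scales.append(scale)
        quantized.append([round(v / scale) for v in row])
    return quantized, scales

W = [[0.5, -1.0, 0.25], [0.1, 0.2, -0.05]]
q, s = quantize_per_row(W)  # dequantize with q[i][j] * s[i]
```

Quantizing Q, K, and V separately, as suggested in the discussion, amounts to running this kind of routine on each projection matrix with its own scales.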
Optimizing ML Model Performance: A GitHub notebook for extending PyTorch with CUDA Python garnered attention for speed enhancements but with a need for more optimization to fully tap into tensor core capabilities, as shared in the notebookās link. Additionally, there were mentions of optimizing the softmax function and block sizes for cache utilization, with insights shared through a GitHub pull request.
OpenAI Discord
Multiplayer GPT Headed for the Gaming Galaxy: Engineers discussed the potential of integrating GPT-Vision and camera inputs for a real-time gaming assistant to tackle multiple-choice games. The possibility of utilizing Azure or virtual machines to handle intensive computational tasks was raised, alongside leveraging TensorFlow or OpenCV for system management.
AI Versus Human Conundrum Continues: A philosophical debate emerged concerning the differences between AI and human cognition, discussing the prospects of AI acquiring human-like reasoning and emotions, and the role of quantum computing in this evolution.
The Quest for Knowledge Enhancements: Members sought information on how to prepare a knowledge base for custom GPT applications and questioned the arrival of the Whisper v3 API. Noted limitations, such as speculation that GPT-4's token memory span has shrunk, triggered calls for improved clarity on API capabilities.
Creative Minds Favor Claude and Gemini: When tackling literature reviews and fictional works, AI aficionados recommended using models like Claude and Gemini 1.5. These tools were favored for their prowess in handling literary tasks and creative writing respectively.
Discord Channel Dynamics: Two channels, prompt-engineering and api-discussions, experienced a notable decrease in activity, with participants attributing the quiet to possible over-moderation and a recent string of timeouts, including a specific 5-month timeout case involving assistance to another user.
LlamaIndex Discord
- Hybrid Cloud Hustle with Qdrant: Qdrant's new hybrid cloud offering allows for running their service across various environments while maintaining control over data. They backed their launch with a thorough tutorial on the setup process.
- LlamaIndex Beefs Up with Azure AI Search: LlamaIndex teams up with Azure AI Search for advanced RAG applications, featuring a tutorial by Khye Wei that illustrates Hybrid Search and Query rewriting capabilities.
- MistralAI Model Immediately Indexed: LlamaIndex has instant support for MistralAI's newly released 8x22b model, paired with a Mistral cookbook focusing on intelligent query routing and tool usage.
- Building and Debugging in LlamaIndex: AI engineers discussed best practices for constructing search engines in LlamaIndex, resolving API key authentication errors, and navigating through updates and bug fixes, including a specific `BaseComponent` error with a GitHub solution.
- Hierarchical Structure Strategy Session: Inquiry within the ai-discussion channel about constructing a hierarchical document structure using ParentDocumentRetriever, with LlamaIndex as the framework of choice.
Eleuther Discord
- Peering into the Future of Long-Sequence Models: Feedback Attention Memory (FAM), discussed in recent conversations, proposes a solution to the quadratic attention problem of Transformers, enabling processing of indefinitely long sequences and showing improvement on long-context tasks. Reka's new encoder-decoder model is touted to support sequences up to 128k, as detailed in their core tech report.
- Precision in Scaling Laws and Evaluation: Questions on the compute-optimal scaling laws of Hoffmann et al. (2022) led to an exploration of the credibility of narrow confidence intervals without extensive experiments, as detailed in Chinchilla Scaling: A replication attempt. Moreover, accurate cost estimations in ML papers are hindered when the size of datasets, like that in the SoundStream paper, is omitted, bringing to light the necessity of transparent data reporting.
- Unpacking Model Evaluation Techniques: In Eleuther's `#lm-thunderdome`, the usage of `lm-evaluation-harness` was demystified, explaining the output format required for `arc_easy` tasks and discussing the significance of BPC (bits per character) as a proxy correlating with a model's compression capacity. Concerning tasks like ARC, a dialogue ensued about why random guessing results in a roughly 25% accuracy rate, given the four possible answers.
- Multi-Modal Learning Gains Traction: The possibility of Total Correlation Gain Maximization (TCGM) for semi-supervised multi-modal learning received attention, with one arXiv paper discussing the informational approach and the ability to utilize unlabeled data across modalities effectively. Emphasis was also given to the method's theoretical promises and its implications for identifying Bayesian classifiers in diverse learning scenarios.
- Concrete Guidelines for FLOPS Calculation: On the `#scaling-laws` channel, advice was given on estimating the FLOPS for a model such as SoundStream, including the rule of thumb of 6 × (number of parameters) FLOPs per token for the combined forward and backward passes of a transformer. Newcomers are directed to the breakdown in Section 2.1 of the relevant paper for a complete understanding of computational cost estimation.
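The 6 × parameters rule of thumb above can be made concrete with a one-line estimator; the example model and token counts below are illustrative, not from the discussion:

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute for a dense transformer:
    roughly 2N FLOPs per token for the forward pass and 4N for the
    backward pass, giving the common 6 * N * D rule of thumb
    (N = parameters, D = training tokens)."""
    return 6 * n_params * n_tokens

# e.g. a hypothetical 7B-parameter model trained on 2T tokens:
print(f"{training_flops(7e9, 2e12):.1e}")  # ~8.4e+22 FLOPs
```

This is only an order-of-magnitude estimate: it ignores attention-over-context terms and embedding lookups, which is usually acceptable for scaling-law back-of-envelope work.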
HuggingFace Discord
- IDEFICS-2 Takes the Limelight: The release of IDEFICS-2 brings an impressive skill set with 8B parameters, capable of high-resolution image processing and excelling in visual question answering and document retrieval tasks. Anticipation builds as a chat-focused variant of IDEFICS-2 is promised, while current capabilities such as solving complex CAPTCHAs are demonstrated in a shared example.
- Knowledge Graphs Meet Chatbots: An informative blog post highlights the integration of Knowledge Graphs with chatbots to boost performance, with exploration encouraged for those interested in advanced chatbot functionality.
- Snowflake's Arctic Expedition: Snowflake breaks new ground with the launch of the Arctic embed family of models, claimed to set new benchmarks in practical text-embedding model performance, particularly in retrieval use cases. This development is complemented by a hands-on Splatter Image space for creating splatter art quickly, and by a look at how Multi-Modal RAG fuses language and images, as detailed in LlamaIndex documentation.
- Model Training and Comparisons Drive Innovation: A fresh IP-Adapter Playground is unveiled, further enabling creative text-to-image interactions, alongside a new option to `push_to_hub` directly in the transformers library's pipelines. Comparing image captioning models just got easier with a dedicated Hugging Face Space.
- Challenges and Opportunities in NLP and Vision: Community members discuss issues from truncated token handling in prompts to exploring LoRA configurations, with links shared to resources on topic modeling with BERTopic, training T5 models (GitHub resource), and LaTeX-OCR possibilities for equation conversion (LaTeX-OCR GitHub). These conversations encapsulate the collective pursuit of refining and harnessing AI capabilities.
OpenAccess AI Collective (axolotl) Discord
Idefics2 Brings Multimodal Flair: The new multimodal model Idefics2 has been introduced, capable of processing both text and images with improved OCR and visual reasoning skills. It is offered in both base and fine-tuned forms and is under the Apache 2.0 license.
RTX 5090 Speculation Stokes Anticipation: Nvidia is rumored to be considering an expedited release of the RTX 5090, potentially at Computex 2024, to stay ahead of AMD's advances, sparking discussions on hardware suitability for cutting-edge AI models.
Model Training and Fine-Tuning: Engineers shared insights on model training configurations, focusing on the `train_on_input` parameter in loss calculation, and suggested using TinyLlama-1.1B-Chat-v1.0 for fine-tuning small models for efficient experimentation.
Phorm AI Becomes Go-To Resource: Members referred to Phorm AI for various inquiries, including epoch-wise saving techniques and data preparation for models like TinyLlama for tasks like text-to-color code predictions.
Spam Flood Triggers Alerts: Multiple channels within the community were targeted by spam messages promoting OnlyFans content, attempting to divert attention from the AI-centric conversations and technical discourse.
Latent Space Discord
LLM Ranking Resource Revealed: A comprehensive website, LLM Explorer, has been shared, showcasing a plethora of open-source language models, each assessed through ELO scores, HuggingFace leaderboard ranks, and task-specific accuracy metrics, serving as a valuable resource for model comparison and selection.
AI+Human Symphony in the Gig Economy: The launch of Payman AI, a platform facilitating AI agents to remunerate humans for tasks beyond AI capabilities, has sparked interest; the concept promotes a cooperative ecosystem between AI and human talents in domains like design and legal services.
Supabase Embraces AI Inference: Supabase introduces a simple API for running AI inferences within its edge functions, allowing AI models such as `gte-small` to be employed directly in databases, as detailed in their announcement.
Buzz Around "Llama 3" and OpenAI API Moves: The AI community is abuzz about the mysterious "Llama 3" speculated to debut at a London hackathon, and OpenAI's Assistants API enhancements are drawing attention in light of a potential GPT-5 release, stirring debates about possible impacts on AI startups and platforms.
BloombergGPT Paper Club Session Goes Zoom: The LLM Paper Club is moving its BloombergGPT discussion to Zoom due to prior challenges with Discord screensharing. Participants can register for the event here, and further reminders to join the discussion are being circulated within the community.
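Elo scores like those used to rank models on LLM Explorer and similar leaderboards come from the standard Elo expected-win formula; a minimal sketch:

```python
def elo_expected(r_a, r_b):
    """Expected win probability of model A against model B
    under the Elo model (400-point scale)."""
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

# Equal ratings give a coin flip; a 400-point gap is ~10:1 odds.
print(elo_expected(1200, 1200))           # 0.5
print(round(elo_expected(1400, 1000), 3)) # 0.909
```

Leaderboards fit these ratings from many pairwise human (or model-judge) preference votes, so an Elo gap is only as trustworthy as the vote distribution behind it.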
OpenInterpreter Discord
- AI Wearable Woes: AI wearables lack the contextual knowledge of smartphones, as discussed with reference to a YouTube review by Marques Brownlee. Engineers pointed out that greater contextual understanding is necessary for AI assistants to provide efficient responses.
- Open-Source AI Model Buzz: The WizardLM-2 open-source model garners interest for its potential to deliver GPT-4-like capabilities. Discussions forecast strong future demand despite ongoing advancements.
- Translator Bot's Inclusive Promise: Engineers are evaluating a new translation bot for its ability to enrich communication with two-way translations, aiming for more inclusive and unified discussions.
- Cross-Platform Compatibility Challenges: There's a clear need for software like 01 Light to run on Windows, consistent with dialogues about the difficulty of adapting Mac-centric software to Windows, hinting at the necessity of platform-agnostic development approaches.
- Hardware Heats Up: Conversations indicate significant interest in AI hardware solutions like the Limitless device, with comparisons drawn around user experiences. Emphasis on robust backend support and seamless AI integration is shaping hardware aspirations.
Interconnects (Nathan Lambert) Discord
Big Win for qwen-1.5-0.5B: The qwen-1.5-0.5B model's winrate soared from 4% to 32% in AlpacaEval-style comparisons against far larger models, using generation in chunks. This approach, along with a 300M reward model, may be a game-changer in output searching.
How To Win Friends and Influence AIs: The recently unveiled Mixtral 8x22B, a polyglot SMoE model, is sharing the limelight owing to its impressive capabilities and the Apache 2.0 open license. Meanwhile, the rise of OLMo 1.7 7B indicates a notable stride in language model science with a robust performance leap on the MMLU benchmark.
Replicating Chinchilla: An Anomaly: Discrepancies in replicating the Chinchilla scaling paper by Hoffmann et al. have cast doubts around the paper's findings. The community's reaction ranged from confusion to concern, signaling an escalating drama around the challenge of scaling law verification.
Lighthearted Anticipation and Rumination: With playful banter on potential showdowns in olmo vs llama, community members show humor in competition. Moreover, Nathan Lambert teases the guild with a forecast of content deluge, signaling a possibly intense week of knowledge sharing.
Model Madness or Jocularity?: A side comment in an underpopulated channel by Nathan mentioned a potential tease involving WizardLM 2 as a troll, showing a blend of humor and light-heartedness amidst technical discussions.
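The chunked-generation result above can be pictured as best-of-n search at the chunk level: sample several candidate chunks, score each with the reward model, and keep the winner. A toy sketch with stand-in generator and reward functions (the real setup would use the 0.5B policy and the 300M reward model):

```python
from itertools import cycle

def best_of_n_chunks(prompt, generate, reward, n=4, chunks=3):
    """Greedy chunk-level search: at each step, sample n candidate
    chunks and append the one the reward model scores highest."""
    text = prompt
    for _ in range(chunks):
        candidates = [generate(text) for _ in range(n)]
        text += max(candidates, key=reward)
    return text

# Deterministic toy stand-ins for the generator and reward model.
samples = cycle([" bad", " good", " fine", " bad"])
gen = lambda ctx: next(samples)
rew = lambda chunk: chunk.count("good")
out = best_of_n_chunks("answer:", gen, rew, n=4, chunks=2)
print(out)  # answer: good good
```

Searching per chunk rather than per full completion is what lets a tiny model trade extra inference compute for much better win rates.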
Cohere Discord
- API Confusion Needs Resolving: Engineers are probing the Cohere API for details on system prompt capabilities and available models. A user highlighted the request for details due to their significance in application development.
- Benchmarking Cohere's Embeddings: There is curiosity about how Cohere's embeddings v3 perform against OpenAI's new large embeddings, with reference to the Cohere blog post Introducing Command R+, suggesting a comparative analysis has been conducted.
- Integration Tips and Tricks: Technical discussions addressed integrating Large Language Models (LLMs) with platforms like BotPress, and whether Coral necessitates a local hosting solution. Future updates might simplify these integrations.
- Fine-Tuning Fine-Tuned Models: Clarification was sought about fine-tuning already customized models via Cohere's Web UI, directing users to the official guide Fine-Tuning with the Web UI.
- Beta Testers Called to Action: A project named Quant Fino is recruiting beta testers for its Agentic entity that merges GAI with FinTech. Interested participants can apply at Join Beta - Quant Fino.
- Security Flaws Exposed in AI Model: A redteaming exercise revealed vulnerabilities in Command R+, demonstrating the ability to manipulate the model into creating unrestricted agents. Concerned engineers and researchers can review the full write-up Creating unrestricted AI Agents with Command R+.
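Any head-to-head between embedding models ultimately reduces to scoring query-document pairs by vector similarity; the standard cosine metric used in such comparisons is a few lines:

```python
import math

def cosine_similarity(a, b):
    """Standard metric for comparing embedding vectors:
    dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

Benchmarks like MTEB aggregate retrieval quality over many such similarity rankings, which is why raw similarity numbers are not comparable across models with different training objectives.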
LangChain AI Discord
AI Documentation Gets Facelift: In an effort to improve usability, contributors to the LangChain documentation are revamping its structure, introducing categories like "tutorial", "how-to guides", and "conceptual guide". A member shared the LangChain introduction page, emphasizing LangChain's components such as building blocks, LangSmith, and LangServe, which aid in the development and deployment of applications with large language models.
Building with LangChain - An Expressive Endeavor?: Within the #general channel, a member sought advice on YC startup applications while drawing parallels to Extensiv, leading to the mention of several entities like Unsloth, Mistral AI, and Lumini. Simultaneously, challenges with LangServe integration when combined with Nemo Guardrails were highlighted due to Nemo's transformation of output structures.
Forge Ahead with New AI Tools and Services: GalaxyAI's debut of an API service with complimentary access to GPT-4 and GPT-3.5-turbo stirred up interest, showcased at Galaxy AI. Similarly, OppyDev's fusion of an IDE and a chat client received attention, advocating an improved coding platform accessible at OppyDev AI. Meanwhile, Rubiks.ai appealed to tech enthusiasts to beta test their search engine and assistant at Rubiks.ai using code RUBIX.
AI Pioneers Share Educational Resources and Seek Collaboration: A member from #tutorials posted a YouTube tutorial on granting AI agents long-term memory, igniting a discussion about why `langgraph` wasn't employed. Furthermore, a participant expressed eagerness to collaborate on new projects, inviting others to connect through direct messaging.
Diverse Dialogues on Data and Optimization: In a lively exchange, strategies for optimizing RAG (Retrieval-Augmented Generation) with large documents were evaluated, including document splitting. Members also discussed the best methods for manipulating CSV files with Langchain, suggesting improvements for chatbots and data processing.
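The document-splitting strategies mentioned for RAG usually start from a fixed-size splitter with overlap; LangChain's splitters are more sophisticated (recursive, separator-aware), but a hypothetical minimal version captures the idea:

```python
def split_document(text, chunk_size=200, overlap=50):
    """Naive fixed-size splitter: consecutive chunks share `overlap`
    characters so facts straddling a boundary survive in one chunk."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]

chunks = split_document("x" * 500, chunk_size=200, overlap=50)
print(len(chunks))  # 3
```

Overlap trades index size for recall: larger overlap duplicates more text across chunks but reduces the chance a relevant passage is cut in half.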
DiscoResearch Discord
- 64 GPUs Engaged for Full-Scale DeepSpeed: Maxidl pushed the limits by utilizing 64 80GB GPUs, each at 77GB capacity, to run full-scale DeepSpeed training with a 32k sequence length and a batch size of one, exploring 8-bit optimization for better memory efficiency.
- FSDP's Memory Usage Secrets Unlocked: jp1 suggested `fsdp_transformer_layer_cls_to_wrap: MixtralSparseMoeBlock` and setting `offload_params = true` to minimize memory usage, potentially reducing GPU requirements to 32, while maxidl sought out calculators for memory usage, referencing a HuggingFace discussion.
- Copyright Conundrum for Text Scraping: A member pointed out the EU copyright gray area affecting text data scraping and suggested DFKI as a useful source. Meanwhile, multimodal data from Wikimedia Commons and others can be found on Creative Commons Search.
- Tokenization Techniques on the Rise: The community shared insights into creating a Llama tokenizer without HuggingFace, noted a misspelling in a shared custom tokenizer, and highlighted Mistralās new tokenization library, with a GitHub notebook provided.
- Decoding Strategies and Sampling Techniques Evaluated: Concerns that a paper on decoding methods overlooked useful strategies led to a discussion on unaddressed techniques like MinP/DynaTemp/Quadratic Sampling. A Reddit post showed the impact of min_p sampling on creative writing, boosting scores by +8 in alpaca-eval style elo and +10 in eq-bench creative writing test.
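The min_p sampling discussed above filters the vocabulary relative to the most likely token, rather than by a fixed top-k or cumulative top-p mass; a minimal self-contained sketch (simplified from samplers like llama.cpp's):

```python
import math

def min_p_filter(logits, min_p=0.1):
    """Keep only tokens whose probability is at least min_p times the
    probability of the most likely token, then renormalize."""
    probs = [math.exp(l) for l in logits]
    total = sum(probs)
    probs = [p / total for p in probs]
    cutoff = min_p * max(probs)
    kept = [(i, p) for i, p in enumerate(probs) if p >= cutoff]
    norm = sum(p for _, p in kept)
    return [(i, p / norm) for i, p in kept]

# A peaked distribution: only tokens near the top survive.
dist = min_p_filter([5.0, 4.5, 1.0, -2.0], min_p=0.2)
print([i for i, _ in dist])  # [0, 1]
```

Because the cutoff scales with the top token's probability, min_p keeps many options when the model is uncertain and few when it is confident, which is why it pairs well with higher temperatures for creative writing.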
tinygrad (George Hotz) Discord
Int8 Integration in Tinygrad: Tinygrad has been confirmed to support INT8 computations, with recognition that such data type support often depends more on hardware capabilities rather than the software design itself.
Graph Nirvana with Tiny-tools: For enhanced graph visualizations in Tinygrad, users can visit Tiny-tools Graph Visualization to create slicker graphs than the basic `GRAPH=1` setting.
Pytorch-Lightningās Hardware Adaptability: Discussions about Pytorch-Lightning touched on its hardware-agnostic capabilities, with practical applications noted on hardware like the 7900xtx. Discover Pytorch-Lightning on GitHub.
Tinygrad Meets Metal: Community members are exploring the generation of Metal compute shaders with tinygrad, discussing how to run simple Metal programs without Xcode and the possibility of applying this to meshnet models.
Model Manipulation and Efficiency in Tinygrad: A memberās proposal for a fast, probabilistically complete Node.equals() prompted discussions on efficiency, while George Hotz explained layer device allocation, and users were directed toward tinygrad/shape/shapetracker.py or view.py for zero-cost tensor manipulations like broadcast and reshape.
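Whether INT8 pays off is, as noted above, mostly a hardware question; the software side is simple symmetric quantization. A minimal pure-Python sketch (illustrative, not tinygrad's implementation):

```python
def quantize_int8(xs):
    """Symmetric int8 quantization: map floats to [-127, 127]
    with a single per-tensor scale, as int8 hardware paths expect."""
    scale = max(abs(x) for x in xs) / 127 or 1.0
    q = [round(x / scale) for x in xs]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from int8 values and the scale."""
    return [v * scale for v in q]

q, s = quantize_int8([0.5, -1.0, 0.25])
print(q)  # [64, -127, 32]
```

The quantization error is bounded by half the scale per element, which is why per-channel scales (one per output row) are usually preferred for weight matrices.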
Skunkworks AI Discord
- Hugging Face Showcases Idefics2: Hugging Face introduces Idefics2, a new multimodal ChatGPT iteration that integrates Python coding capabilities, as demonstrated in their latest video.
- Reka Core Rivals Tech Behemoths: Touted for its performance, Reka Core emerges as a strong competitor to language models from OpenAI and others, with a video overview available to showcase its capabilities.
- JetMoE-8B Flaunts Efficient AI Performance: The JetMoE-8B model impresses with performance that surpasses Meta AIās LLaMA2-7B while costing under $0.1 million, suggesting a cost-efficient approach to AI development as explained in this breakdown.
- Snowflake Announces Premier Text-Embedding Model: Snowflake debuts the Snowflake Arctic embed family of models, claiming the title of the world's most effective practical text-embedding model, detailed in their announcement.
Datasette - LLM (@SimonW) Discord
- Mixtral Mania: Engineers are eager to test the Mixtral 8x22B Instruct model; for those interested, the Model Card on HuggingFace is now available.
- Glitch in the Machine: There's a reported installation error for llm-gpt4all that seems to obstruct usage; details of the problem can be found in the GitHub issue tracker.
Alignment Lab AI Discord
- Legal Entanglements Afoot?: A member hinted at possible legal involvement in an unspecified situation, yet no context was provided to ascertain the details or nature of the legal matters in question.
- The Misfortune of wizardlm-2: An image was shared showing the deletion of wizardlm-2, noted specifically for lack of testing on v0; the intricacies of wizardlm-2 or the testing processes were not elaborated. View Image
Mozilla AI Discord
- Llamafile Script Gets a Facelift: An improved repacking script for the llamafile archive version upgrade is now accessible via this Gist, triggering a discussion on whether to merge it into the main GitHub repo or to start new llamafiles from scratch due to maintainability concerns.
- Seeking Protocol for Security Flaws: The discussion surfaced a need for clarification on the procedure for reporting security vulnerabilities within the system, including the steps to request a CVE number, although specific guidance is currently lacking.
The LLM Perf Enthusiasts AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
PART 2: Detailed by-Channel summaries and links
Stability.ai (Stable Diffusion) ▷ #announcements (1 messages):
- Stable Diffusion 3 Launch Celebration: Stable Diffusion 3 and its faster variant, Stable Diffusion 3 Turbo, are now available on the Stability AI Developer Platform API. The release is powered by a partnership with Fireworks AI, which claims to be the fastest and most reliable API platform.
- Open Generative AI Continues: There is a plan to make Stable Diffusion 3 model weights available for self-hosting, which would require a Stability AI Membership, emphasizing the continued commitment to open generative AI.
- Discover More About SD3: Users are directed to learn more and get started with the new offerings through the provided link, which includes further details and documentation.
- Research Background Unpacked: According to the Stable Diffusion 3 research paper, this iteration rivals or surpasses leading text-to-image systems like DALL-E 3 and Midjourney v6 in aspects such as typography and adherence to prompts, based on human preference studies.
- Technical Advancements in SD3: The latest version introduces the Multimodal Diffusion Transformer (MMDiT) architecture, offering improved text comprehension and image representation over previous Stable Diffusion models by utilizing distinct weight sets for different modalities.
Link mentioned: Stable Diffusion 3 API Now Available - Stability AI: We are pleased to announce the availability of Stable Diffusion 3 and Stable Diffusion 3 Turbo on the Stability AI Developer Platform API.
Stability.ai (Stable Diffusion) ▷ #general-chat (1039 messages🔥🔥🔥):
- SD3 Awaits Membership Clarification: Amid concerns about licensing and accessibility, users await a clear statement from Stability AI regarding SD3's availability for personal and commercial use. Discussions arose following an announcement stating plans to make the model weights available for self-hosting with a Stability AI Membership.
- SDXL Refiners Deemed Redundant: The community finds that SDXL finetunes have made SDXL refiners obsolete, stating that refiner-trained finetunes have taken precedence in Civitai downloads. Some users reminisce about early uses of refiners but acknowledge that finetune integrations quickly replaced the need for them.
- Model Merging Challenges: Users explore the effectiveness and understanding of model-merging concepts around V-prediction and epsilon in ComfyUI. There's debate on the necessity of correct implementation to avoid unpredictable results, with recommendations to gain minimal knowledge through UI experimentation.
- Diffusers Pipeline Limitations: Some users point out limitations in the diffusers pipeline requiring a Hugging Face dependency, yet others contend that once models are downloaded, the process can run independently and efficiently on local systems. Concerns are raised about the inaccessibility of the `StableVideoDiffusionPipeline.from_single_file(path)` method in SVD finetunes, suggesting ComfyUI as an easier alternative.
Links mentioned:
- Video Examples: Examples of ComfyUI workflows
- Model Merging Examples: Examples of ComfyUI workflows
- Stable Cascade - a Hugging Face Space by multimodalart: no description found
- PixArt Sigma - a Hugging Face Space by PixArt-alpha: no description found
- camenduru/SUPIR · Hugging Face: no description found
- Stable Diffusion 3 API Now Available - Stability AI: We are pleased to announce the availability of Stable Diffusion 3 and Stable Diffusion 3 Turbo on the Stability AI Developer Platform API.
- Membership - Stability AI: The Stability AI Membership offers flexibility for your generative AI needs by combining our range of state-of-the-art open models with self-hosting benefits.
- Stable Video Diffusion: no description found
- GitHub - kijai/ComfyUI-SUPIR: SUPIR upscaling wrapper for ComfyUI: SUPIR upscaling wrapper for ComfyUI. Contribute to kijai/ComfyUI-SUPIR development by creating an account on GitHub.
- WizardLM/WizardLM-2 at main · victorsungo/WizardLM: Family of instruction-following LLMs powered by Evol-Instruct: WizardLM, WizardCoder - victorsungo/WizardLM
- Reddit - Dive into anything: no description found
- GitHub - king159/svd-mv: Training code for Stable Video Diffusion Multi-View: Training code for Stable Video Diffusion Multi-View - king159/svd-mv
- Reddit - Dive into anything: no description found
- GitHub - BatouResearch/magic-image-refiner: Contribute to BatouResearch/magic-image-refiner development by creating an account on GitHub.
- Fix ELLA timesteps by kijai · Pull Request #25 · ExponentialML/ComfyUI_ELLA: I have been comparing the results from this implementation to the diffusers implementation, and it's not on par. In diffusers ELLA is applied on each timestep, with the actual timestep value. Appl...
- Pixel Art XL - v1.1 | Stable Diffusion LoRA | Civitai: Pixel Art XL Consider supporting further research on Ko-Fi or Twitter If you have a request, you can do it via Ko-Fi Checkout my other models at Re...
- GitHub - kijai/ComfyUI-KJNodes: Various custom nodes for ComfyUI: Various custom nodes for ComfyUI. Contribute to kijai/ComfyUI-KJNodes development by creating an account on GitHub.
- GitHub - city96/ComfyUI_ExtraModels: Support for miscellaneous image models. Currently supports: DiT, PixArt, T5 and a few custom VAEs: Support for miscellaneous image models. Currently supports: DiT, PixArt, T5 and a few custom VAEs - city96/ComfyUI_ExtraModels
- Add node to use SD3 through API · kijai/ComfyUI-KJNodes@22cf8d8: no description found
- GitHub - lllyasviel/stable-diffusion-webui-forge: Contribute to lllyasviel/stable-diffusion-webui-forge development by creating an account on GitHub.
- Comfy Workflows: Share, discover, & run thousands of ComfyUI workflows.
Unsloth AI (Daniel Han) ▷ #general (383 messages🔥🔥):
- GPT-4 and GPT-3.5 Clarification: A distinction was made between GPT-4 and GPT-3.5, noting that the newer version appears to be a fine-tuned iteration of its predecessor.
- Mistral Model Multilingual Capabilities Discussed: Members discussed whether datasets for Mistral7B need to be in English to perform well, with advice given to include French data for better results.
- Finetuning and Cost Concerns Addressed: A discussion about finetuning methods, costs, and specific resources like notebooks provided insights for those new to the domain. It was suggested that continued pretraining and sft could be beneficial and cost-effective.
- Concerning UnSloth Contributions: Members expressed interest in contributing to UnSloth AI, offering help in expanding documentation and considering donations, with links to existing resources and discussions on potential contributions shared.
- Mixtral 8x22B Release Excitement: The release of Mixtral 8x22B, a sparse Mixture-of-Experts model with strengths in multilingual fluency and long context windows, sparks discussions due to its open-sourcing under the Apache 2.0 license.
Links mentioned:
- Cheaper, Better, Faster, Stronger: Continuing to push the frontier of AI and making it accessible to all.
- no title found: no description found
- Google Colaboratory: no description found
- lucyknada/microsoft_WizardLM-2-7B Ā· Hugging Face: no description found
- Galax NVIDIA GeForce RTX 3090 TI EX Gamer graphics card, 24GB GDDR6X, 384-bit - 39IXM5MD6HEX: Galax GeForce graphics card. Make your daily routine smoother; subscribe to Prime Ninja for exclusive promotions, shipping discounts, and double coupons.
- Google Colaboratory: no description found
- Home: 2-5X faster 80% less memory LLM finetuning. Contribute to unslothai/unsloth development by creating an account on GitHub.
- gist:e45b337e9d9bd0492bf5d3c1d4706c7b: GitHub Gist: instantly share code, notes, and snippets.
- Full fine tuning vs (Q)LoRA: ⚡️ Get Life-time Access to the complete scripts (and future improvements): https://trelis.com/advanced-fine-tuning-scripts/⚡️ Runpod one-click fine-tuning te...
- mistralai/Mixtral-8x22B-Instruct-v0.1 Ā· Hugging Face: no description found
- Support for x86/ARM CPUs (e.g., Xeon, M1) · Issue #194 · openai/triton: Hi there, Is there any future plan for macOS support? ❯ pip install -U --pre triton DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in...
- Ollama.md Documentation by jedt · Pull Request #3699 · ollama/ollama: A guide on setting up a fine-tuned Unsloth FastLanguageModel from a Google Colab notebook to: HF hub GGUF local Ollama Preview link: https://github.com/ollama/ollama/blob/66f7b5bf9e63e1e98c98e8f4...
Unsloth AI (Daniel Han) ▷ #random (27 messages🔥):
- Chroma Project Takes a Leap: Inspired by Unsloth AI strategies, a member announces the development of an edge version of Chroma written in Go, using SQLite for on-device vector storage. The project, which is also compatible with browsers via WASM, is accessible on GitHub.
- Smileys Invade the Bottom Page: A heartwarming mini-discussion about cute smiley faces at the bottom of a page, highlighting a particular mustache smiley as a favorite.
- PyTorch's New Torchtune: Mention of Torchtune, a native PyTorch library for LLM fine-tuning that has been shared on GitHub, sparking interest due to its potential to make fine-tuning more accessible.
- Unsloth AI's Broad GPU Support Praised: A member congratulates Unsloth for its broad GPU support, which makes it more accessible compared to other tools that require newer GPU architectures.
- Mobile Deployment of AI Models Discussed: Members discuss the feasibility of running neural networks on mobile phones, identifying the need for custom inference engines and noting the absence of CUDA on mobile devices. The challenges of running typical DL Python code on iPhones versus Macs with M chips are also mentioned.
Links mentioned:
- GitHub - pytorch/torchtune: A Native-PyTorch Library for LLM Fine-tuning: A Native-PyTorch Library for LLM Fine-tuning. Contribute to pytorch/torchtune development by creating an account on GitHub.
- GitHub - l4b4r4b4b4/go-chroma: Go port of Chroma vector storage: Go port of Chroma vector storage. Contribute to l4b4r4b4b4/go-chroma development by creating an account on GitHub.
Unsloth AI (Daniel Han) ▷ #help (275 messages🔥🔥):
- Questions About Unsupported Attributes: A user encountered an `AttributeError` when trying to fine-tune a model, reporting that the `'MistralSdpaAttention'` object has no attribute `'temp_QA'`. It seems to be related to a specific method within their custom training pipeline.
- ORPO Support and Usage Clarified: Users inquired about ORPO support in Unsloth. It's confirmed that ORPO is supported, referenced by links to a model trained using ORPO on HuggingFace and a colab notebook.
- Discussions on LoRA and rslora: Users discussed using LoRA and rslora in training, with advice on handling different `alpha` values and potential loss spikes. Some members suggested adjusting `r` and `alpha` and disabling packing as possible solutions to training issues.
- Embedding Tokens Not Trained: Users touched on the subject of embedding tokens that were not trained in the Mistral model, in the context of whether it is possible to train these embeddings during fine-tuning.
- Saving and Hosting Models: Questions arose about saving finetuned models in different formats using commands like `save_pretrained_merged` and `save_pretrained_gguf`; whether they work sequentially and the need to start with fp16 first. There was also a query about hosting a model with GGUF files on the HuggingFace inference API.
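The `r`/`alpha` tuning discussed above is governed by LoRA's scaling factor, and rank-stabilized LoRA (rsLoRA, linked below) changes only that factor; a small sketch of the two conventions:

```python
import math

def lora_scaling(alpha, r, use_rslora=False):
    """Effective multiplier applied to the low-rank update BAx.
    Classic LoRA uses alpha / r; rsLoRA uses alpha / sqrt(r) so the
    update keeps a comparable magnitude as the rank r grows."""
    return alpha / math.sqrt(r) if use_rslora else alpha / r

# At r=256, classic scaling shrinks the update 16x more than rsLoRA,
# which is one reason high-rank classic LoRA can underperform.
print(lora_scaling(16, 256))                   # 0.0625
print(lora_scaling(16, 256, use_rslora=True))  # 1.0
```

This is why advice like "keep alpha around 2x r" only transfers between ranks if the scaling convention is held fixed.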
Links mentioned:
- Find Open Datasets and Machine Learning Projects | Kaggle: Download Open Datasets on 1000s of Projects + Share Projects on One Platform. Explore Popular Topics Like Government, Sports, Medicine, Fintech, Food, More. Flexible Data Ingestion.
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- G-reen/EXPERIMENT-ORPO-m7b2-1-merged · Hugging Face: no description found
- Google Colaboratory: no description found
- Google Colaboratory: no description found
- Google Colaboratory: no description found
- Practical Tips for Finetuning LLMs Using LoRA (Low-Rank Adaptation): Things I Learned From Hundreds of Experiments
- Unsloth - 4x longer context windows & 1.7x larger batch sizes: Unsloth now supports finetuning of LLMs with very long context windows, up to 228K (Hugging Face + Flash Attention 2 does 58K so 4x longer) on H100 and 56K (HF + FA2 does 14K) on RTX 4090. We managed...
- Tokenization | Mistral AI Large Language Models: Tokenization is a fundamental step in LLMs. It is the process of breaking down text into smaller subword units, known as tokens. We recently open-sourced our tokenizer at Mistral AI. This guide will w...
- Home: 2-5X faster 80% less memory LLM finetuning. Contribute to unslothai/unsloth development by creating an account on GitHub.
- A Rank Stabilization Scaling Factor for Fine-Tuning with LoRA: As large language models (LLMs) have become increasingly compute and memory intensive, parameter-efficient fine-tuning (PEFT) methods are now a common strategy to fine-tune LLMs. A popular PEFT method...
- Rank-Stabilized LoRA: Unlocking the Potential of LoRA Fine-Tuning: no description found
- Installation: no description found
- no title found: no description found
- Installation: no description found
- Home: 2-5X faster 80% less memory LLM finetuning. Contribute to unslothai/unsloth development by creating an account on GitHub.
- GitHub - comfyanonymous/ComfyUI: The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.: The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface. - comfyanonymous/ComfyUI
- Add ORPO example notebook to the docs · Issue #331 · unslothai/unsloth: It's possible to use the ORPOTrainer from TRL with very little modification to the current DPO notebook. Since ORPO reduces the resources required for training chat models even further (no separat...
- Tokenizer: no description found
- GitHub - ggerganov/llama.cpp: LLM inference in C/C++: LLM inference in C/C++. Contribute to ggerganov/llama.cpp development by creating an account on GitHub.
- Adding New Vocabulary Tokens to the Models · Issue #1413 · huggingface/transformers: ❓ Questions & Help Hi, How could I extend the vocabulary of the pre-trained models, e.g. by adding new tokens to the lookup table? Any examples demonstrating this?
Unsloth AI (Daniel Han) ▷ #showcase (46 messages🔥):
- Clarification on Leaderboard Model Templates: A member asked how the leaderboard knows the model template. It was clarified that the model's `tokenizer.chat_template` is used to inform the leaderboard.
- ShareGPT90k Dataset Cleaned and Formatted: A new version of the ShareGPT90k dataset has been cleaned of HTML tags and is available in chatml format on Hugging Face, allowing users to train with Unsloth. Dataset ready for action.
- Ghost Model Training Intrigue: Members engaged in a detailed conversation about what constitutes a ārecipeā for training AI models. One member is particular about needing a detailed recipe that leads to creating a specific model with defined characteristics and not just a set of tools or methods.
- Recipes vs. Tools in AI Model Training: The conversation continued on the difference between a full ārecipeā including datasets and specific steps, as opposed to tools and methods. One member shared their approach, underlying the importance of data quality and replication of existing models, referencing the Dolphin model card on Hugging Face.
- Recommender Systems vs. NLP Challenges and Expertise: A PhD candidate discussed the differences and similarities between working on NLP and developing recommender systems, highlighting the unique challenges and expertise required in the latter which includes handling noise in data, induction biases, and significant feature engineering.
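The `tokenizer.chat_template` mentioned above is essentially a stored recipe for flattening a message list into the single prompt string the model saw during training. A hypothetical, minimal ChatML-style stand-in (real templates are Jinja strings applied by the tokenizer):

```python
def apply_chat_template(messages, add_generation_prompt=True):
    """Render role/content messages in a ChatML-style layout."""
    out = ""
    for m in messages:
        out += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open the assistant turn so the model generates the reply.
        out += "<|im_start|>assistant\n"
    return out

prompt = apply_chat_template([{"role": "user", "content": "hi"}])
print(prompt)
```

Using the wrong template is a common silent failure mode for finetunes and leaderboards alike, since the model still generates text, just worse.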
Links mentioned:
- Practical Tips for Finetuning LLMs Using LoRA (Low-Rank Adaptation): Things I Learned From Hundreds of Experiments
- Nice Click Nice GIF - Nice Click Nice Man - Discover & Share GIFs: Click to view the GIF
- Aligning LLMs with Direct Preference Optimization: In this workshop, Lewis Tunstall and Edward Beeching from Hugging Face will discuss a powerful alignment technique called Direct Preference Optimisation (DPO...
- Direct Preference Optimization (DPO) explained: Bradley-Terry model, log probabilities, math: In this video I will explain Direct Preference Optimization (DPO), an alignment technique for language models introduced in the paper "Direct Preference Opti...
- FractalFormer: A WIP Transformer Architecture Inspired By Fractals: Check out the GitHub repo herehttps://github.com/evintunador/FractalFormerSupport my learning journey on patreon!https://patreon.com/Tunadorable?utm_medium=u...
- llama-recipes/recipes at 0efb8bd31e4359ba9e8f52e8d003d35ff038e081 · meta-llama/llama-recipes: Scripts for fine-tuning Llama2 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization & ...
- llama-recipes/recipes/multilingual/README.md at 0efb8bd31e4359ba9e8f52e8d003d35ff038e081 · meta-llama/llama-recipes: Scripts for fine-tuning Llama2 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization & ...
- Neural Graph Reasoning: Complex Logical Query Answering Meets Graph Databases: Complex logical query answering (CLQA) is a recently emerged task of graph machine learning that goes beyond simple one-hop link prediction and solves a far more complex task of multi-hop logical reas...
- LLM Phase Transition: New Discovery: Phase Transitions in a dot-product Attention Layer learning, discovered by Swiss AI team. The study of phase transitions within the attention mechanisms of L...
- pacozaa/sharegpt90k-cleanned · Datasets at Hugging Face: no description found
- After 500+ LoRAs made, here is the secret: Well, you wanted it, here it is: The quality of dataset is 95% of everything. The rest 5% is not to ruin it with bad parameters. Yeah, I know,...
Unsloth AI (Daniel Han) ▷ #suggestions (15 messages🔥):
- Exploring Multilingual Model Approaches: A member brought up the issue of catastrophic forgetting in multilingual models trained on languages like Hindi or Thai. They proposed a two-phase solution involving translating questions to English, using a large English model for answering, and then translating back to the original language, questioning the drawbacks of this method.
- Multilingual Expansion Through MoE: Another member expressed excitement about the possibility of using MoE (Mixture of Experts) to expand the multilingual capabilities of models, anticipating it would "open so many doors!"
- Torchtune Gains Enthusiasm: The community shows interest in Torchtune, an alternative to the abstractions provided by Hugging Face and Axolotl, highlighting its potential to streamline the fine-tuning process. There is also a hint at possible collaborations involving Unsloth AI.
- Contemplating Language Mixing in Datasets: In response to the splitting of translation and question-answering tasks, a member considered the possibility of combining multiple languages into a single dataset for model training and using a strategy that involves priming the model with Wikipedia articles.
- Double-Translation Mechanism Discussed: A concept articulated as `translate(LLM(translate(instruction)))` was proposed and discussed, supporting the idea of using a larger, more robust English language model in tandem with translation layers to process non-English queries. Concerns about the added cost due to multiple model calls were raised.
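The wrapper pattern discussed above can be sketched in a few lines; `translate` and `llm` below are toy stand-ins for real translation and English-model API calls, so that the flow of the three calls per query (and hence the extra cost the members raised) is visible:

```python
# Sketch of the double-translation wrapper: translate(LLM(translate(x))).
# `translate` and `llm` are hypothetical stand-ins for real API calls
# (e.g. a translation model plus a strong English-tuned LLM).

def translate(text: str, src: str, dst: str) -> str:
    # Toy translator: tags the text so the call order stays visible.
    return f"[{src}->{dst}] {text}"

def llm(prompt: str) -> str:
    # Toy English model: wraps the prompt in a placeholder "answer".
    return f"answer({prompt})"

def answer_non_english(instruction: str, lang: str) -> str:
    """Three model calls per query: to-English, answer, back-translate.
    The tripled call count is the cost concern from the discussion."""
    english = translate(instruction, src=lang, dst="en")
    response = llm(english)
    return translate(response, src="en", dst=lang)

print(answer_non_english("नमस्ते, यह कैसे काम करता है?", "hi"))
```

Swapping the toy functions for real endpoints keeps the control flow identical; only the two `translate` calls and the `llm` call change.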
LM Studio ▷ #💬-general (175 messages🔥🔥):
- Repeat AI Responses Challenge: A member asked how to prevent AI from repeating the same information during a conversation, specifically using Dolphin 2 Mistral. They also inquired about what "multi-turn conversations" are, to which another member linked an article explaining the concept in relation to bots.
- WizardLM-2 LLM Announced: An announcement for the new large language model family was shared, featuring WizardLM-2 8x22B, 70B, and 7B. Links to a release blog and model weights on Hugging Face were included, with members discussing its availability and performance.
- Understanding Tool Differences: One user asked for the differences between ollama and LMStudio, and it was explained that both are wrappers for llama.cpp, but LM Studio is GUI based and easier for beginners.
- Fine-Tuning and Agents Discussion: There was a discussion on whether it's worth learning tools like langchain depending on needs and use cases, with some suggesting it can be a hindrance when venturing outside its default settings.
- File Management and API Interactions in LM Studio: A new member inquired about relocating downloaded app files and interfacing LM Studio with an existing API. It was clarified that models cannot change default install locations, and files can be found under the My Models tab for relocating. No specific method for API interaction through LM Studio was mentioned.
Links mentioned:
- RAM Latency Calculator: no description found
- lmstudio-community/WizardLM-2-7B-GGUF Ā· Hugging Face: no description found
- Multi-turn conversations - QnA Maker - Azure AI services: Use prompts and context to manage the multiple turns, known as multi-turn, for your bot from one question to another. Multi-turn is the ability to have a back-and-forth conversation where the previous...
- Mission Squad. Flexible AI agent desktop app.: no description found
- Tweet from WizardLM (@WizardLM_AI): 🔥Today we are announcing WizardLM-2, our next generation state-of-the-art LLM. New family includes three cutting-edge models: WizardLM-2 8x22B, 70B, and 7B - demonstrates highly competitive performa...
- GitHub - hiyouga/LLaMA-Factory: Unify Efficient Fine-Tuning of 100+ LLMs: Unify Efficient Fine-Tuning of 100+ LLMs. Contribute to hiyouga/LLaMA-Factory development by creating an account on GitHub.
- Microsoft's Punch in the Face to Open AI (Open Source & Beats GPT-4): WizardLM 2 is a groundbreaking family of large language models developed by Microsoft that push the boundaries of artificial intelligence. Link(s) From Toda...
- GitHub - Unstructured-IO/unstructured: Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
LM Studio ▷ #🤖-models-discussion-chat (96 messages🔥🔥):
- Template Troubles with WizardLM 2: Members reported issues with WizardLM 2 and the Vicuna 1.5 preset, where the bot generated inputs for the user instead. A suggested solution included adjusting the rope frequency to 1 or setting `freq_base` to 0, which appeared to correct the behavior.
- Mixed Opinions on WizardLM 2 and Wavecoder: While some users expressed a high opinion of WizardLM 2, claiming it performed well even compared to other 7B models, others judged the performance as subpar, not noticing any significant improvement even after fine-tuning.
- Exploring Best Quantization Practices: Users discussed the effectiveness of different quantization levels for 7B models, comparing Q8 to Q6K quality. The consensus leaned towards higher quantization being more desirable if one has sufficient VRAM, while acknowledging the utility of smaller models for certain tasks.
- Model Performance Debate: There was a spirited discussion around the relative superiority of models, with focus on parameter count versus quantization level, and the belief that fine-tuning and quality of the training can be deciding factors over just the size of the model's parameters.
- Finding the Right Code Generator: A user experienced difficulties with the code-generating capabilities of WaveCoder-Ultra-6.7B, receiving messages that it couldn't write complete applications. Tips offered included using assertive prompts and adjusting the context window size for the model to load appropriately.
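For a rough sense of what the Q8-vs-Q6K tradeoff means in practice, approximate bits-per-weight figures for llama.cpp quant formats (ballpark community-reported numbers, not exact specs; real GGUF files vary by architecture) translate into file sizes like this:

```python
# Back-of-the-envelope GGUF file sizes for a 7B model at different quants.
# Bits-per-weight values are approximate community-reported figures for
# llama.cpp quant formats, not exact specifications.
PARAMS = 7e9

BITS_PER_WEIGHT = {
    "Q8_0": 8.5,    # ~8-bit with per-block scales
    "Q6_K": 6.56,   # k-quant, ~6.6 bpw
    "Q4_K_M": 4.85, # common "sweet spot" quant
}

def approx_size_gb(params: float, bpw: float) -> float:
    # bits -> bytes -> gigabytes
    return params * bpw / 8 / 1e9

for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{approx_size_gb(PARAMS, bpw):.1f} GB")
```

The gap between Q8_0 (~7.4 GB) and Q6_K (~5.7 GB) for a 7B model is the VRAM headroom the thread's consensus is weighing against quality loss.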
Links mentioned:
- lmstudio-community/wavecoder-ultra-6.7b-GGUF Ā· Hugging Face: no description found
- mistralai/Mixtral-8x22B-Instruct-v0.1 Ā· Hugging Face: no description found
- Virt-io/Google-Colab-Imatrix-GGUF at main: no description found
- High Quality / Hard to Find - a DavidAU Collection: no description found
- lmstudio-community/WizardLM-2-7B-GGUF Ā· Hugging Face: no description found
- Responses: These are answers to the prompt by two different LLMs. You are going to analyze Factuality Depth Level of detail Coherency <any other area that I might have missed but is generally considered impo...
- bartowski/zephyr-orpo-141b-A35b-v0.1-GGUF at main: no description found
LM Studio ▷ #🧠-feedback (4 messages):

- Model Loading Error in Action: A user encountered an error loading model architecture when trying out WizardLM 2 on LM Studio across different model sizes, including 4-bit and 6-bit variants, prompting a Failed to load model message.
- Fix Suggestion for Model Loading: Another user recommended ensuring the use of GGUF quants and also noted that version 0.2.19 is required for WizardLM 2 models to function properly.
- Request for stable-diffusion.cpp: A request was made to add stable-diffusion.cpp to LM Studio to enhance the software's capabilities.
LM Studio ▷ #📝-prompts-discussion-chat (17 messages🔥):
- Cleaning Up LM Studio: Users with issues were advised to delete specific LM Studio folders such as `C:\Users\Username\.cache\lm-studio`, `C:\Users\Username\AppData\Local\LM-Studio`, and `C:\Users\Username\AppData\Roaming\LM Studio`. It's crucial to back up models and important data prior to deletion.
- Prompt Crafting for NexusRaven: A user inquired if anyone has experimented with NexusRaven and devised any prompt presets for it, indicating interest in collective knowledge-sharing.
- Script Writing with AI: One member asked how to make the AI output a full script, suggesting they are searching for tips on generating longer content.
- Compatibility Issues with Hugging Face Models: A user noted problems with running certain Hugging Face models, like `changge29/bert_enron_emails` and `ktkeller/mem-jasper-writer-testing`, in LM Studio. Assistance with running these models was sought.
- Seeking Partnership for Affiliate Marketing: A user indicated interest in finding a partner with coding expertise for help with affiliate marketing campaigns, mentioning a willingness to share profits if successful. The user emphasized a serious offer for a partnership based on results.
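As a safer take on the folder-deletion advice in the first bullet, a small script can move the folders aside rather than delete them outright. The paths are the ones quoted above (Windows-specific), with `%USERNAME%` expanded at runtime; the backup location is left to the caller:

```python
import os
import shutil

# Folders quoted in the discussion; %USERNAME% expands at runtime so the
# script is not tied to one account. These paths are Windows-specific.
LM_STUDIO_DIRS = [
    r"C:\Users\%USERNAME%\.cache\lm-studio",
    r"C:\Users\%USERNAME%\AppData\Local\LM-Studio",
    r"C:\Users\%USERNAME%\AppData\Roaming\LM Studio",
]

def quarantine(path: str, backup_root: str):
    """Move a folder into backup_root instead of deleting it, so models
    and settings can be restored if the reset breaks something.
    Returns the new location, or None if the folder does not exist."""
    expanded = os.path.expandvars(path)
    if not os.path.isdir(expanded):
        return None
    os.makedirs(backup_root, exist_ok=True)
    dest = os.path.join(backup_root, os.path.basename(expanded))
    shutil.move(expanded, dest)
    return dest
```

Running `quarantine(d, backup_root)` over `LM_STUDIO_DIRS` moves whatever is present; nothing is lost if a reinstall goes wrong, and the backup folder can be deleted once LM Studio works again.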
LM Studio ▷ #🎛-hardware-discussion (18 messages🔥):
- GPU Comparison Sheet Quest Continues: User freethepublicdebt was searching for an elusive Google sheet comparing GPUs and could not find the link to the one they had worked on. Another user, heyitsyorkie, attempted to help but provided the wrong link, leading to further confusion.
- Direct GPU Communication Breakthrough: rugg0064 shared a Reddit post celebrating the success of getting GPUs to communicate directly, bypassing the CPU/RAM and potentially leading to performance improvements.
- Customizing GPU Load in LM Studio: heyitsyorkie provided insight on adjusting the GPU offload for models in LM Studioās Linux beta by navigating to Chat mode -> Settings Panel -> Advanced Config.
- Splitting Workloads Between Different GPUs: In response to a query from .spicynoodle about uneven model allocation between their GPUs, heyitsyorkie suggested modifying the GPU preferences JSON and searching for `tensor_split` for further guidance.
- SLI and Nvlink Troubles with P100s: ethernova is seeking advice for their setup with dual P100s not showing up in certain software and NVLink status appearing inactive despite having NVLink bridges attached.
Links mentioned:
- George Hotz Geohot GIF - George hotz Geohot Money - Discover & Share GIFs: Click to view the GIF
- Reddit - Dive into anything: no description found
LM Studio ▷ #🧪-beta-releases-chat (31 messages🔥):
- VRAM vs. System RAM in Model Performance: There's a discussion on whether a model would run on a system with 24 GB VRAM and 96 GB RAM, with one member suggesting that it might run but inference will be incredibly slow due to the speed difference between VRAM and system RAM.
- Expectations for WizardLM-2-8x22B: Members compare the WizardLM-2-8x22B to other models like Command R Plus, with mixed experiences. While one member was not impressed with Mixtral 8x22b and plans to test WizardLM-2-8x22B, another mentioned getting satisfactory results with 10+ tokens/sec from WizardLM.
- Model Performance on Different Hardware: Users with an M3 MacBook Pro 128GB report running model q6_k of Command R Plus, achieving about 5 tokens/sec. The speed is considered half as fast as GPT-4 on ChatGPT, but not painfully slow as each token represents a word or subword.
- Base Model Clarification: Clarification on what constitutes a "Base" model was provided: models not fine-tuned for chat or instruct tasks are considered base models, and they are generally found to perform poorly in comparison to their fine-tuned counterparts.
- Model Size and Local Running Feasibility: There was a conversation about the feasibility of running large models like WizardLM-2-8x22B locally, noting that a GPU like a 4090 with 24GB is too small for such a large model, which runs best on Mac systems with substantial RAM.
LM Studio ▷ #amd-rocm-tech-preview (19 messages🔥):
- Curiosity about Windows Executable Signing: A member was curious whether the Windows executables are signed with an Authenticode cert. It was confirmed that they are indeed signed.
- Challenges with Code Signing Certificates: In the context of signing an app, there was a discussion on the cost and process complexities associated with obtaining a Windows certificate, including a comparison to the cost of an Apple developer license.
- Seeking Expertise on Automated Compile and Sign Process: A member expressed interest in understanding the automated process for compiling and signing, offering to compensate for the knowledge exchange.
- AMD HIP SDK System Requirements Clarification: A member provided information about system requirements for GPUs from a link to the AMD HIP SDK system requirements and asked about the stance of LM Studio on supporting certain AMD GPUs not officially supported by the SDK.
- Issues with AMD dGPU Recognition in LM Studio Software: Members discussed an issue where LM Studio software was using an AMD integrated GPU (iGPU) instead of the dedicated GPU (dGPU), with one member suggesting disabling the iGPU in the device manager. Another member stated that version 0.2.19 of the software should have resolved this issue and encouraged users to report the problem if it persists.
Links mentioned:
- 👾 LM Studio - Discover and run local LLMs: Find, download, and experiment with local LLMs
- System requirements (Windows) ā HIP SDK installation Windows: no description found
- Bill Gates Chair GIF - Bill Gates Chair Jump - Discover & Share GIFs: Click to view the GIF
LM Studio ▷ #model-announcements (3 messages):
- WaveCoder Ultra Unveiled: Microsoft has released WaveCoder ultra 6.7b, fine-tuned using their "CodeOcean". This impressive model specializes in code translation and supports the Alpaca format for instruction following, with examples available on its model card.
- Seeking NeoScript AI Assistant: A community member new to AI has inquired about utilizing models for NeoScript programming, specifically for RAD applications using a platform formerly known as NeoBook. They are seeking suggestions on configuring AI models despite unsuccessful initial attempts using documents as references.
Link mentioned: lmstudio-community/wavecoder-ultra-6.7b-GGUF Ā· Hugging Face: no description found
Nous Research AI ▷ #off-topic (17 messages🔥):
- Introducing Multimodal Chat GPTs: A link to a YouTube video titled "Introducing Idefics2 8B: Open Multimodal ChatGPT" was shared, discussing the development of Hugging Face's open multimodal language model, Idefics2. Watch it here.
- Reka Core Joins the Multimodal Race: Another YouTube video shared discusses "Reka Core," a competitive multimodal language model claiming to rival big industry names like OpenAI, Anthropic, and Google. The video can be viewed here.
- Navigating Language and AI: Discussions revolved around the relationship between language, AI, and the concept of the divine, touching on the idea of languages as "envelopes within the vectorspace of meaning" and the potential linguistic evolution that AI might spur. The conversation included references to general semantics and quantum mereotopology, with a hint at looking into Alfred Korzybski's work.
- Staying Up to Date with AI Research: Members expressed the challenge of keeping up with the vast amount of AI research and literature, admitting to struggles with growing reading backlogs amidst the rapid pace of new publications.
- JetMoE and the Economics of AI: A YouTube video titled "JetMoE: Reaching LLaMA2 Performance with 0.1M Dollars" highlighting how JetMoE-8B was trained on a budget yet outperforms the more expensive LLaMA2-7B was shared. The video is available here.
Links mentioned:
- Snowflake Launches the World's Best Practical Text-Embedding Model: Today Snowflake is launching and open-sourcing with an Apache 2.0 license the Snowflake Arctic embed family of models. Based on the Massive Text Embedding Be...
- Introducing Idefics2 8B: Open Multimodal ChatGPT: We will take a look idefics2 the open multimodal llm by huggingfacehttps://huggingface.co/blog/idefics2#python #pythonprogramming #llm #ml #ai #aritificialin...
- Reka Core: A Frontier Class Multimodal Language Model: Reka Core is competitive with models from OpenAI, Anthropic, and Google across key industry-accepted evaluation metrics. Given its footprint and performance,...
- JetMoE: Reaching LLaMA2 Performance with 0.1M Dollars: JetMoE-8B is trained with less than $ 0.1 million1 cost but outperforms LLaMA2-7B from Meta AI, who has multi-billion-dollar training resources. LLM training...
Nous Research AI ▷ #interesting-links (7 messages):

- Self-Supervised LLM Solution Selection Sprouts: A novel technique for enhancing Instruct Model LLMs is on the table, which utilizes the model's own capacity to generate and select the most pertinent solution based on its ability to reconstruct the original input from its responses. The method aims at information maximization and offers a scalable, unsupervised evaluation that enhances coherence and relevance, and is adaptable with existing techniques.
- New Horizons in LLM Medical Alignment: A shared Google Slideshow points towards efforts in aligning Language Models specifically for medical reasoning applications, although the content details are not accessible from the provided message.
- Mistral's Tokenization Guide Unwrapped: Mistral AI introduces an open-source tokenizer, with a guide discussing the tokenization process, its importance in LLMs, and how to employ their tokenizer within Python.
- Tempering the Tokenization Hype: A user critiques the emphasis on tokens, arguing that tokens aren't as critical if the model is already adept at handling tags, suggesting that the true value might be in increased steerability of the model.
- Tweeting Up a Dev Storm: A link to a Twitter post was shared, but the content of the tweet hasn't been discussed within the provided messages.
Links mentioned:
- Tokenization | Mistral AI Large Language Models: Tokenization is a fundamental step in LLMs. It is the process of breaking down text into smaller subword units, known as tokens. We recently open-sourced our tokenizer at Mistral AI. This guide will w...
- Aligning LLMs for Medical Reasoning: Aligning Large Language Models to be Better Medical Reasoners Ritabrata Maiti [email protected] 1
Nous Research AI ▷ #general (159 messages🔥🔥):
- Mystery Surrounding WizardLM's Takedown: There was confusion about why Microsoft's WizardLM was taken down, with speculation about it being "too toxic" and unverified rumors of it being attacked or hacked. A bundle of links and information about WizardLM was shared, including its removal and a re-upload mirror.
- Concerns about the EU AI Act: A theory was put forward that WizardLM had to be taken down because it violated the EU AI Act by being almost uncensored, with suggestions to torrent the original version if anyone still has it. However, it was later clarified that it was originally unpublished for not going through Microsoft's new "toxicity review."
- Excitement and Skepticism for Code Models: Discussion of CodeQwen1.5-7B Chat, a code-specific language model, was lively, with members sharing its blog post and GitHub while noting its strong performance on benchmarks like 83.5 on HumanEval. There is some skepticism about the model still using vanilla MHA (Multihead Attention) and speculation about potential contamination due to its high performance.
- Frustrations with Mixed Messages on Model Performance: n8programs shared excitement about improvements to a creative writing model achieving a benchmark score of 70, between Mistral medium and large, using WestLake as a base model. The legitimacy of benchmark comparisons was debated, especially in light of expectations for LLaMA 3 and whether explicit tuning can trump new architectures.
- Uncertainty about Future Model Releases: There were queries about upcoming releases like Hermes 8x22B and whether it would be realistic to run such large models on personal equipment. There is anticipation about potential Llama-3 models and speculation on whether these new models will outperform their predecessors.
Links mentioned:
- senseable/WestLake-7B-v2 Ā· Hugging Face: no description found
- The Bitter Lesson: no description found
- alpindale/WizardLM-2-8x22B Ā· Hugging Face: no description found
- Qwen/CodeQwen1.5-7B-Chat Ā· Hugging Face: no description found
- WizardLM-2 8x22B by microsoft | OpenRouter: WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates highly competitive performance compared to leading proprietary models, and it consistently outperforms all existing ...
- Unlock AI Agent real power?! Long term memory & Self improving: How to build Long term memory & Self improving ability into your AI Agent? Use AI Slide deck builder Gamma for free: https://gamma.app/?utm_source=youtube&utm...
Nous Research AI ▷ #ask-about-llms (7 messages):
- Speed Demon: A member mentioned witnessing a performance of 700 Mbps in an unnamed context.
- Diving into State-Space Models: A member sought recommendations for essential papers on recent advances in state-space models for weekend reading.
- Mamba Paper Suggested: In response to a request for recent literature, one member suggested looking into the Mamba paper, while another was more interested in the newer Jamba and related works.
- Hermes 2 Pro Query Handling Issue: A user expressed the need to prevent Hermes 2 Pro from always returning `<tool_call>` when it should sometimes just engage in chat, noting it as a current limitation.
- Promising Future Updates: A contributor noted they will collaborate with another member to improve Hermes 2 Pro's ability to discern when to use `<tool_call>` and when to just chat in future versions.
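Until the model itself reliably decides when a tool is needed, clients typically route on whether a `<tool_call>` block appears in the output. A minimal sketch (the JSON-inside-tags shape follows Hermes' tool-calling convention; the specific payload fields are illustrative):

```python
import json
import re

# Hermes-style tool calls wrap a JSON object in <tool_call> tags.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def route(model_output: str):
    """Return ("tool", parsed_call) if the model emitted a <tool_call>,
    otherwise ("chat", text) — the client-side routing the thread wishes
    the model itself would get right more often."""
    m = TOOL_CALL_RE.search(model_output)
    if m:
        return "tool", json.loads(m.group(1))
    return "chat", model_output.strip()

kind, payload = route(
    '<tool_call> {"name": "get_weather", "arguments": {"city": "Paris"}} </tool_call>'
)
print(kind, payload["name"])  # tool get_weather
```

A plain reply falls through to the `"chat"` branch untouched, so the dispatcher degrades gracefully even when the model over-eagerly emits tool calls.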
Nous Research AI ▷ #rag-dataset (10 messages🔥):

- Debating JSON's Virtue: A message refers to a previous defense of using JSON structure for inputs and outputs, suggesting that this format might reduce the need for handwaving when explaining processes.
- Seeking Vision for RAGs: A user expressed interest in the state of the art for vision, especially in the context of building a Retrieval Augmented Generation (RAG) on engineering documents with images and diagrams.
- Vision SOTA Suggestions: One member touted GPT4v/Geminipro Vision and Claude Sonnet as leading options in the field, recommending testing them against each other for specific use cases.
- Turning to Open Source: When seeking open-source alternatives, suggestions included llava, cogvlm, mPlug-DocOwl, and donut, with mPlug-DocOwl being specifically recommended for DocVQA use cases.
- Exploring Supersizing LLMs: A member shared a blog post discussing the use of LLMs beyond token sequencing, emphasizing the need for models that perform complex reasoning and fetch accurate, topical information.
Link mentioned: The Normal Blog - Infinite Context LLMs: Going Beyond RAG with Extended Minds: In this blog we discuss how the transformer architecture naturally extends over external memories, and share empirical results which leverage this capability to succeed where RAG has struggled. These ā¦
Nous Research AI ▷ #world-sim (159 messages🔥🔥):

- World-Sim Anticipation Builds: Members express excitement and impatience as World-Sim's return is discussed, with speculative launch times, the concept's philosophical underpinnings, and whether AI aspires to godhood. A member provided the link to the Nous Research blog post to delve deeper into this topic: Divinity in AI.
- Jailbroken Prometheus Draws Interest: The chat mentions an alternative to World-Sim, the web-based Jailbroken Prometheus, sparking curiosity among users. For those looking for similar experiences, a member shared a Websim link.
- Official Confirmation Raises Hype: The anticipation peaks as an official statement is made: World-Sim, alongside the Nous World Client, returns the next day. Users celebrate with excitement and share gifs like Let Me In!.
- Hetetic Modelling Choices and Payment Options: Inquiries about Claude 3 use and the possibility of switching models in World-Sim get addressed. A member mentioned that users would have model preferences based on affordability and confirmed various subscription and payment options, including an unlimited Claude Opus.
- Developer Mode and World Client Queries Answered: Discussions sprout around potential features, such as a "developer mode," and clarifications on the Nous World Client, which will be web-based for accessibility from any device.
Links mentioned:
- world_sim: no description found
- Anime Excited GIF - Anime Excited Happy - Discover & Share GIFs: Click to view the GIF
- Poe Path Of Exile GIF - Poe Path Of Exile Login - Discover & Share GIFs: Click to view the GIF
- Let Me In Crazy GIF - Let Me In Crazy Funny - Discover & Share GIFs: Click to view the GIF
- Tree Fiddy GIF - Tree Fiddy South - Discover & Share GIFs: Click to view the GIF
- Noita Explosion GIF - Noita Explosion Electricity - Discover & Share GIFs: Click to view the GIF
- Noita Game GIF - Noita Game Homing - Discover & Share GIFs: Click to view the GIF
- Youre Not Gonna Like This Jerrod Carmichael GIF - Youre Not Gonna Like This Jerrod Carmichael Saturday Night Live - Discover & Share GIFs: Click to view the GIF
- Every Vault in the Fallout Series Explained | Fallout Lore: Hello everyone! This video is dedicated to any new Fallout Fans who wish to get into the Fallout games and their lore. I remember when I first became a fan a...
- Jailbroken Prometheus Chat: no description found
- The Godhood Paradox | Science Fiction Animatic: In a future where the World Simāan online interface powered by advanced AIāallows users to create and manipulate virtual universes, a clash emerges. The Dece...
Perplexity AI ▷ #general (286 messages🔥🔥):
- Model Comparisons and Misadventures: Discussions revolve around the performance of various AI models including GPT-4, Claude, and Mistral. Users share experiences suggesting that newer versions at times seem lazier or less capable of managing extensive context, while others note the usefulness of models like Claude 3 Opus for technical issues. There are also mentions of Mixtral's 8x22B model being impressive for an open-source release.
- Channel Guidance and Navigation: New members are guided on how to find related chats and access various channels using the `<id:customize>` feature or by navigating through the Perplexity name at the top of the interface.
- Payment Anxieties and Checkout Changes: Users express confusion and concern over changes to the Perplexity API payment method management and the lack of transparency regarding the remaining pro message counts.
- File Handling Frustrations: Users discuss the limitations of AI models in handling large context sizes, with one reporting difficulty getting a 42k-token prompt to properly engage with the system. Another user suggests that the model might be summarizing long documents instead of processing them in detail, impacting how the AI addresses specific prompts.
- AGI Aspirations and Subscriptions: Conversations feature anticipated updates, with some users eagerly waiting for new features like Grok to be added to Perplexity while others debate the value of their subscriptions.
Links mentioned:
- Tweet from Bindu Reddy (@bindureddy): The new GPT-4 is amazingly lazy and literally stops after a few turns (back and forth) It's not very viable in the real world for the moment. Stick to the older version. Comparatively Claude has a l...
- Extended Syntax | Markdown Guide: Advanced features that build on the basic Markdown syntax.
- mistralai/Mixtral-8x22B-Instruct-v0.1 Ā· Hugging Face: no description found
- mistralai/Mixtral-8x22B-v0.1 Ā· Hugging Face: no description found
- Hamak Chilling GIF - Hamak Chilling Beach - Discover & Share GIFs: Click to view the GIF
- Silent Indian GIF - Silent Indian - Discover & Share GIFs: Click to view the GIF
- How Soon Is Now Smiths GIF - How Soon Is Now Smiths Morrissey - Discover & Share GIFs: Click to view the GIF
- TikTok - Make Your Day: no description found
Perplexity AI ▷ #sharing (9 messages🔥):
- Exploring World Voice Day: A link to Perplexity's results for World Voice Day was shared, revealing resources and discussions related to this event.
- Delving into AWS Hardening Guide: A user referenced a search for AWS hardening guide, pointing to Perplexity AIās aggregated information on enhancing security on AWS.
- Discovering "SBK Borderline": The song "SBK Borderline" was the focus of a link, facilitating exploration through Perplexity's summarized content.
- Curiosity about Income: A search about income queries was signaled through a Perplexity AI link, encapsulating associated answers and data points.
- Investigating Reboot for Better Performance: Discussion included a practical approach for enhancing an iPadās performance, as a user considered rebooting as illustrated in the given Perplexity link.
Perplexity AI ▷ #pplx-api (4 messages):
- Seeking API and Web Client Consistency: A member expressed difficulty in aligning the behavior of the web client with the API, noting occasional discrepancies and seeking to understand specific settings such as temperature to ensure consistency.
- Navigating with Site Search Operator: In reference to locating information, a member suggested using the site search operator `site:URL` to facilitate searches on a specific website.
- Rate Limit Counter as a Feature Request: A user proposed having the Perplexity API include the number of requests used within a minute in the response data, to better handle rate limits and potentially wait until the limit resets.
- Querying API Rate Limiting Mechanism: Another member questioned whether the Perplexity API returns a `429` response when the rate limit is reached, indicating a need for clarity on how the API communicates rate limits to users.
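Whatever counter the API does or doesn't expose, a client can cope with HTTP 429 ("Too Many Requests") responses via exponential backoff, honoring a `Retry-After` header when the server sends one. A stdlib-only sketch of that retry policy (not an official Perplexity client):

```python
import time
import urllib.error
import urllib.request

def backoff_delay(attempt: int, retry_after) -> float:
    """Seconds to wait before retry `attempt` (0-based): honor the
    server's Retry-After header if present, else back off exponentially."""
    if retry_after is not None:
        return float(retry_after)
    return float(2 ** attempt)

def post_with_backoff(req: urllib.request.Request, max_retries: int = 5):
    """Send a request, retrying only on HTTP 429 responses."""
    for attempt in range(max_retries):
        try:
            return urllib.request.urlopen(req)
        except urllib.error.HTTPError as e:
            if e.code != 429:
                raise  # other errors are not retryable here
            time.sleep(backoff_delay(attempt, e.headers.get("Retry-After")))
    raise RuntimeError("rate limit: retries exhausted")
```

If the API signals limits some other way (for example, a custom header or error body), only the `except` clause needs changing; the backoff schedule stays the same.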
LAION ▷ #general (285 messages🔥🔥):

- PyTorch Design Mysteries: Members express confusion about the design philosophy of PyTorch, noting it often abstracts away many details with "just one line of code," which can prove challenging when something doesn't work as expected.
- Storing Large Datasets with Zarr: A discussion about using zarr or other libraries to store large datasets for fast loading, specifically a 150 GB MRI image dataset. One member raises concerns about whether zarr would attempt to load the entire dataset into RAM.
- British Law Criminalizing Creation of Certain Images: A new wrinkle in UK law criminalizes the creation of images with the intent to cause distress, and members debate the enforceability of such a law, especially since proving intent can be challenging.
- Mysteries of Running AI Inference: A member voices the need for access to actual inference settings to judge AI models properly, like adjusting CFG or hooking models up to suitable ODE solvers instead of just using Euler's method.
- The Fate of SAI's Cascade Team and Channels: It's mentioned that the Cascade team has left Stability AI (SAI), with the related Discord channel being removed, and there's speculation about the possible involvement of team members with another company, Leonardo, or remaining affiliated with SAI.
Links mentioned:
- no title found: no description found
- Creating sexually explicit deepfakes to become a criminal offence: A new law will see creators of sexually explicit deepfakes face prosecution and a fine.
- ptx0/terminus-xl-velocity-v2 Ā· Hugging Face: no description found
- Perturbed-Attention Guidance SDXL - a Hugging Face Space by multimodalart: no description found
- zero-gpu-explorers (ZeroGPU Explorers): no description found
- Minority Report Leave GIF - Minority Report Leave Walk Away - Discover & Share GIFs: Click to view the GIF
- Reddit - Dive into anything: no description found
- Loss weighting MLP prototype: Loss weighting MLP prototype. GitHub Gist: instantly share code, notes, and snippets.
- Login • Instagram: no description found
LAION ▷ #research (13 messages🔥):

- Introducing ALERT Safety Benchmark: A new safety benchmark for assessing Large Language Models has been established, complete with a safety Dataset of Problematic Outputs (DPO) set. All interested can access and use it via GitHub - Babelscape/ALERT.
- Exploring Generative Multimodal Content: An Arxiv paper discussing the generation of audio from text prompts, and how focusing on the presence of concepts or events could improve performance, has been shared. View the research on arXiv.
- Debate over AI Safety Standards: Members discussed the terminology and standards of "safety" in AI, debating whether restricting AI to non-controversial or PG content might limit its creative capacities compared to other artistic tools.
- Comparing GANs with Diffusion Models: A discussion unfolded around the benefits of GANs over diffusion models. Mentioned advantages include faster inference times, smaller parameter counts, feedback from discriminators, and potentially lower costs for training.
- Skepticism Over GANs' Image Quality and Training Difficulty: Despite some perceived benefits, GANs were criticized for reportedly producing inferior images as judged by human discrimination and presenting challenges in training compared to diffusion models.
Links mentioned:
- Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization: Generative multimodal content is increasingly prevalent in much of the content creation arena, as it has the potential to allow artists and media personnel to create pre-production mockups by quickly ...
- GitHub - Babelscape/ALERT: Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming": Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming" - Babelscape/ALERT
OpenRouter (Alex Atallah) ▷ #announcements (5 messages):
- New Models and Price Adjustments: OpenRouter announces the availability of WizardLM-2 7B and a price reduction for WizardLM-2 8x22B to $0.65/M tokens. Discussions about these models can be followed in their dedicated channel.
- Latency Issues Under Investigation: OpenRouter is investigating high latencies for Mistral 7B Instruct and Mixtral 8x7B Instruct, with ongoing discussions in a message thread. The cause was initially tied to a cloud provider's DDoS protection but is now resolved.
- Third-party Problems Affecting Services: An update revealed recurring high-latency issues affecting Nous Capybara 34b among others, potentially due to a specific cloud provider. Updates continued as the situation developed, with traffic returning to normal and a deeper investigation underway with providers.
- Maintenance Notice: Users were informed of an impending DB reboot expected to briefly take the site offline.
- Launch of High-Throughput Model and Status Update: The WizardLM-2 8x22B Nitro model is now serving over 100 transactions per second, with a notice that the DB restart was completed. The team continues to address performance issues, with updates and discussions available in channel.
Links mentioned:
- WizardLM-2 8x22B by microsoft | OpenRouter: WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates highly competitive performance compared to leading proprietary models, and it consistently outperforms all existing ...
- WizardLM-2 7B by microsoft | OpenRouter: WizardLM-2 7B is the smaller variant of Microsoft AI's latest Wizard model. It is the fastest and achieves comparable performance with existing 10x larger opensource leading models. It is a finet...
OpenRouter (Alex Atallah) ▷ #app-showcase (4 messages):
- Help Wanted for AI Frontend Project: A member is seeking a web developer to assist with a project focused on a general-purpose AI frontend for OpenRouter, which has a role-playing orientation. Theyāve managed to get the novel mode working but are struggling with the conversation style mode.
- Assistance Requested for Distinguishing AI Text: They are also looking to enhance the novel mode by creating a way to differentiate between text generated by the AI and the userās own written text.
- Development Support Sought for Sidebar and Modal System: The member needs help to improve a sidebar with options and is looking to develop a flexible modal system for their application.
OpenRouter (Alex Atallah) ▷ #general (271 messages🔥🔥):
- Censorship Layers and NSFW Content Management in AI Models: Discussions touched on the layers of censorship within a particular AI model, and a member noted that their experiences with NSFW content on their end were very explicit. Another member questioned the usefulness of a base model for their purposes.
- Interest in Multilingual Capacity of AI Models: The multilingual performance of WizardLM was critiqued, with a member suggesting it might be undertrained for non-English languages. There was speculation on whether upcoming models could surpass 8x7b models in performance and pricing.
- Server Issues and Latency Concerns: Members experienced issues with high latency and server errors, noting particularly long response times. Updates on investigating and resolving the server issues were provided, with a focus on fixing core server problems before adding new models such as Lepton's Wizard 8x22b.
- Decoding Algorithm Impact on AI Model Quality: Discussion about quantization of models to bits per weight (bpw) revealed preferences for 6 or at least 5 bpw over 4 bpw, with some noting that a noticeable quality loss occurs at lower bpw.
- Potential New Additions and Deployments of AI Models: The OpenRouter team indicated that new models such as Mistral 8x22B Instruct were being deployed. Concerns about the reliability of certain providers like TogetherAI were expressed, with members looking forward to direct endpoints from Mistral and the addition of Fireworks as a provider.
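The bpw preference above can be illustrated with a toy experiment: uniformly quantizing a Gaussian weight vector at 4, 5, and 6 bits per weight, where each extra bit roughly halves the rounding-error amplitude. This is a minimal sketch for intuition only, not how any provider actually quantizes models.

```python
import numpy as np

def quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Uniform affine quantization to `bits` bits, then dequantize."""
    levels = 2 ** bits - 1
    lo, hi = w.min(), w.max()
    scale = (hi - lo) / levels
    q = np.round((w - lo) / scale)   # integer codes in [0, levels]
    return q * scale + lo            # dequantized approximation

rng = np.random.default_rng(0)
w = rng.normal(size=100_000).astype(np.float32)

errs = {bits: float(np.mean((w - quantize(w, bits)) ** 2)) for bits in (4, 5, 6)}
for bits, err in errs.items():
    print(f"{bits} bpw: MSE {err:.2e}")

# Each additional bit roughly quarters the MSE (error amplitude halves),
# consistent with the "noticeable quality loss at lower bpw" observation.
assert errs[4] > errs[5] > errs[6]
```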
Links mentioned:
- Cheaper, Better, Faster, Stronger: Continuing to push the frontier of AI and making it accessible to all.
- Robot GIF - Find & Share on GIPHY: Discover & share this Robot GIF with everyone you know. GIPHY is how you search, share, discover, and create GIFs.
- mistralai/Mixtral-8x22B-Instruct-v0.1 Ā· Hugging Face: no description found
- Chiquichico GIF - Chiquichico - Discover & Share GIFs: Click to view the GIF
- All New Atlas | Boston Dynamics: We are unveiling the next generation of humanoid robots: a fully electric Atlas robot designed for real-world applications. The new Atlas builds on decades of...
- WizardLM-2 8x22B by microsoft | OpenRouter: WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates highly competitive performance compared to leading proprietary models, and it consistently outperforms all existing ...
Modular (Mojo 🔥) ▷ #general (67 messages🔥🔥):
- Insights on Mojo's Compile-Time Optimizations: Members discussed the optimization efficiency of Mojo, mentioning that aliases like `@parameter` are determined at compile time, leading to memory and processing efficiency by avoiding the need to reserve memory for the alias after its purpose is served. This conversation was sparked by thoughts on the importance of readable code over comments, as discussed in a YouTube video titled "Don't Write Comments".
- Exploring Typestates in Rust Programming: The conversation shifted towards best practices in API design, with one member favoring the use of typestates and lifetimes for making static guarantees in programming, sharing a Rust typestate pattern article for reference.
- Contemplation on Memory Allocation and Optimization: A debate unfolded about whether variables could be optimized in the same way as aliases in Mojo, touching upon optimization concerns in Rust and the potential for memory-efficient data structures such as bit vectors.
- Issues Adapting Code to Mojo Version 24.2: Conversation occurred around upgrading the llama2.mojo code to be compatible with Mojo version 24.2, specifically the need for pointer type conversions. Solutions using `DTypePointer` were offered to address issues with `AnyPointer` conversion.
- Mojo Development and IDE Integration Discussion: Members discussed the structure of Mojo projects and whether there is a package management system similar to Rust's Cargo. Additionally, the availability of a Mojo plugin for IDEs such as PyCharm was mentioned, with reference to the plugin link, and the JetBrains team's interest in further Mojo support.
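The typestate discussion above is Rust-centric, but the core idea, encoding an object's state in its type so that invalid operations are simply not available, can be sketched in Python as a loose analogy (Rust additionally enforces this at compile time via ownership; the file-handle names here are purely illustrative):

```python
class ClosedFile:
    """A handle in the 'closed' state: only `open` is available."""
    def __init__(self, name: str):
        self.name = name

    def open(self) -> "OpenFile":
        return OpenFile(self.name)

class OpenFile:
    """A handle in the 'open' state: `read` and `close` are available."""
    def __init__(self, name: str):
        self.name = name

    def read(self) -> str:
        return f"contents of {self.name}"

    def close(self) -> ClosedFile:
        return ClosedFile(self.name)

f = ClosedFile("notes.txt").open()
data = f.read()
f = f.close()
# f.read() would now raise AttributeError: the closed state has no `read`.
```

In Rust, calling `read` on a closed handle fails to compile; Python only catches the mistake at runtime, which is why the article argues typestates pair so well with a static type system.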
Links mentioned:
- Mojo - IntelliJ IDEs Plugin | Marketplace: Provides basic editing for Mojo programming language: syntax checks and highlighting, commenting and formatting. New features will be added in the future, please feel...
- Packed structs in Zig make bit/flag sets trivial: As we've been building Mach engine, we've been using a neat little pattern in Zig that enables writing flag sets more nicely in Zig than in other languages. Here's a brief explainer.
- Bamboozled GIF - Bamboozled - Discover & Share GIFs: Click to view the GIF
- Analyzing Data 180,000x Faster with Rust: How to hash, index, profile, multi-thread, and SIMD your way to incredible speeds.
- The Typestate Pattern in Rust - Cliffle : no description found
- Don't Write Comments: Why you shouldn't write comments in your code (write documentation)Access to code examples, discord, song names and more at https://www.patreon.com/codeaesth...
Modular (Mojo 🔥) ▷ #twitter (1 message):
ModularBot: From Modular: https://twitter.com/Modular/status/1780676643176231240
Modular (Mojo 🔥) ▷ #ai (2 messages):
- Replication Curiosity in Modular: A member expressed interest in replicating a concept or project within the Mojo platform, indicating anticipation for potential outcomes.
- Guidance on AI Long-Term Memory and Self-Improvement: A video tutorial was shared by a member explaining how to build an AI agent with long-term memory and self-improvement capabilities, intended to be a helpful resource. The video, titled "Unlock AI Agent real power?! Long term memory & Self improving," is available on YouTube.
Link mentioned: Unlock AI Agent real power?! Long term memory & Self improving: How to build Long term memory & Self improving ability into your AI Agent? Use AI Slide deck builder Gamma for free: https://gamma.app/?utm_source=youtube&utm...
Modular (Mojo 🔥) ▷ #🔥mojo (136 messages🔥🔥):
- New Python Package for Mojo to Python Code: A new python package called mojo2py has been announced that converts Mojo code into Python code.
- Need for a Comprehensive Mojo Learning Resource: A member is seeking a comprehensive resource for learning Mojo from scratch, and was directed to the Mojo programming manual, which covers fundamental concepts such as parameters vs. arguments, the ASAP concept, types and traits, and key re-reading sections like owned arguments and transfer operator.
- Struct Inheritance and Code Reusability: Discussions circled around the desire for some form of inheritance within Mojo, with suggestions for reducing boilerplate and instances where a child struct could be created from a parent struct. While one approach suggested was using traits for type declarations, another member clarified that if one seeks compile-time optimization, classes might be more suitable, versus runtime-based approaches.
- Start of Conditional Conformance in Mojo: There appears to be movement towards implementing conditional conformance in Mojo, as evidenced by recent discussion and code snippets shared amongst members. The dialogue involved understanding how conditional conformance might be leveraged to make standard library functions like `str` and `print` work for different Mojo data structures.
- Challenges and Prospects of Advanced Type Systems: Intense technical debate and brainstorming emerged around creating a numpy-style Mojo library that enforces shape compatibility at compile time, the potential for supporting `Variant` data structures without runtime checks, and addressing the specific issue of storing multiple variants in a single list. Various approaches were proposed and conceptually dissected, including custom structs, enum parameters, and challenges in implementing generics and shape refinement for parametric code.
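Conditional conformance, a container supporting an operation only when its elements do, has a rough Python analogue in `typing.Protocol`. The sketch below is illustrative only and not Mojo semantics: a list exposes a hypothetical `to_str` method only when every element conforms to a "stringable" protocol, whereas Mojo would enforce this at compile time.

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class Stringable(Protocol):
    """Trait-like protocol: anything exposing a to_str() method."""
    def to_str(self) -> str: ...

class Point:
    def __init__(self, x: int, y: int):
        self.x, self.y = x, y
    def to_str(self) -> str:
        return f"({self.x}, {self.y})"

class MyList:
    """Generic container whose to_str() only works when the elements
    themselves conform to Stringable: a rough analogue of conditional
    conformance, checked at runtime instead of compile time."""
    def __init__(self, items):
        self.items = list(items)
    def to_str(self) -> str:
        if not all(isinstance(e, Stringable) for e in self.items):
            raise TypeError("elements do not conform to Stringable")
        return "[" + ", ".join(e.to_str() for e in self.items) + "]"

print(MyList([Point(1, 2), Point(3, 4)]).to_str())  # [(1, 2), (3, 4)]
```

`runtime_checkable` makes `isinstance` check for the method's presence, which is the runtime stand-in for a trait bound.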
Links mentioned:
- Mojo Manual | Modular Docs: A comprehensive guide to the Mojo programming language.
- Protocol-Oriented Programming in Swift / WWDC15 / Session 408: At the heart of Swift's design are two incredibly powerful ideas: protocol-oriented programming and first class value semantics. Each of these concepts benef...
- GitHub - venvis/mojo2py: A python package to convert mojo code into python code: A python package to convert mojo code into python code - venvis/mojo2py
Modular (Mojo 🔥) ▷ #community-projects (10 messages🔥):
- Sudden Sketch Success: A community member shared an "off the cuff" programming sketch implemented in Mojo, found to be surprisingly effective, accessible via this gist.
- Anticipating Enhanced Tuple Capabilities: Upcoming enhancements could allow `Tuple` in Mojo to take traits derived from `CollectionElement`, leading to more elegant struct definitions for HTML rendering.
- Nightly Features in Play: It was clarified that the shared code uses nightly features, which may cause compilation errors on the current Mojo 24.2 and on the Mojo Playground.
- Canny Edge Recognition Challenge: A new community member from France, experienced in Numba with Python, expressed interest in implementing the Canny edge recognition algorithm in Mojo to compare performance.
- Mojo Resources for Newcomers: A helpful response to a project inquiry included links to the Mojo documentation, guidance on getting started with the language, and referenced available resources such as the Mojo SDK and Mojo Playground.
Links mentioned:
- Get started with Mojo🔥 | Modular Docs: Get the Mojo SDK or try coding in the Mojo Playground.
- Mojo🔥 | Modular Docs: A programming language that bridges the gap between AI research and production, unlocking speed and usability.
- Mojo🔥 notebooks | Modular Docs: All the Jupyter notebooks we've created for the Mojo Playground.
- html.mojo: GitHub Gist: instantly share code, notes, and snippets.
Modular (Mojo 🔥) ▷ #community-blogs-vids (1 message):
- Exploring the Hype Around Mojo: A recent talk titled "Maxim Zaks - Is Mojo just a hype?" from PyCon Lithuania has been released on YouTube, which prompts a discussion on the Modular chatbot's place in the industry.
Link mentioned: Maxim Zaks - Is Mojo just a hype?: no description found
Modular (Mojo 🔥) ▷ #newsletter (1 message):
Zapier: Modverse Weekly - Issue 30 https://www.modular.com/newsletters/modverse-weekly-30
Modular (Mojo 🔥) ▷ #engine (1 message):
The channel's single message contained no discussion points, topics, or links to summarize.
Modular (Mojo 🔥) ▷ #nightly (21 messages🔥):
- A New Nightly Mojo: Updates and Changes: A new nightly update for Mojo has been released, complete with updates to the standard library and a detailed diff available, as well as a changelog documenting the changes since the last stable release found here.
- A Love for Unconventional Code: Members reacted humorously to unconventional code styling, with comments indicating affection for its "horrible" appearance and a comical plea to indent for loops for readability.
- Peer Pressure vs. Code Formatting Practices: One voice suggested holding off on conforming to peer pressure regarding code indentation practices, but another opined the inevitability of adopting Mojo formatting standards.
- Nightly update causes confusion: The new nightly update led to confusion for a user over function overloads parameterized on traits, resulting in unexpected errors and discussions around finding a solution.
- Traits Over Janky Workarounds and Clean-Up Releases: Discussion included a slight jest on the preference for using "jank" over proper trait parameterization and comments on the recent clean-up efforts in the latest Mojo nightly release.
Links mentioned:
- [stdlib] Update stdlib corresponding to `2024-04-16` nightly/mojo by patrickdoc Ā· Pull Request #2313 Ā· modularml/mojo: This updates the stdlib with the internal commits corresponding to today's nightly release: mojo 2024.4.1618 . In the future, we may push these updates directly to nightly branch.
- mojo/docs/changelog.md at nightly Ā· modularml/mojo: The Mojo Programming Language. Contribute to modularml/mojo development by creating an account on GitHub.
CUDA MODE ▷ #general (11 messages🔥):
- Seeking Guidance in PyTorch: A member asked if "Deep Learning with PyTorch" is still a good starting point given that it was published 4 years ago. Another member confirmed that while PyTorch's core hasn't changed much, there have been significant updates in the compiler and distributed systems.
- PyTorch Evolution and New Edition Tease: The discussion clarified that the book does not cover topics like transformers and LLMs, and that while parts I and II remain useful, part III on deployment is outdated. It was also revealed that a new edition is in progress, spearheaded by a new author.
- Anticipating Blog Content: A member mentioned they had a draft chapter on attention/transformers and considered creating a blog post from it.
Link mentioned: Deep Learning with PyTorch, Second Edition: Everything you need to create neural networks with PyTorch, including Large Language and diffusion models. Deep Learning with PyTorch, Second Edition updates the bestselling ori...
CUDA MODE ▷ #cuda (20 messages🔥):
- Accelerated Matrix Operations in CUDA: A member discussed the integration of a new fp16 precision general matrix-matrix multiplication (GEMM) implementation for CUDA, which outperforms PyTorchās GEMM function in a specific matrix operation benchmark (MxNxK = 1x4096x4096).
- Challenges with JIT Compilation: Despite the new implementation providing a performance boost, another member noted it fails with `torch.compile`, sharing crash details: uncompiled token generation ran at 11.17 tokens/sec versus compiled token generation at 64.4 tokens/sec before the crash, caused by an unsupported method call related to "block_dim_x".
- Block Size Parameters Exploration: Discussion continued around the choice of block sizes in the new GEMM kernel, with members examining the use of a 32x4 effective block size, discovering it seemed to yield better performance and sharing their observations in a related Gist example.
- Inquiry about Data Reading for CUDA C++: A member sought advice on reading large datasets in CSV or Parquet formats within CUDA C++ applications, pondering the possibility of parallel execution but without offering a specific solution.
- Speculating on CUDA Cores and Thread Dispatch: Further technical speculation highlighted the probable connection between faster kernel performance and the use of 128 total active threads per streaming multiprocessor, considering the dispatch of 32 threads per clock cycle across 4 warps.
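The 32x4 block-size observation and the thread-dispatch speculation above reduce to simple occupancy arithmetic. The numbers below are taken from the discussion and are purely illustrative:

```python
# Arithmetic behind the "32x4 effective block size" observation.
WARP_SIZE = 32

block_dim = (32, 4)                               # threads per block: x * y
threads_per_block = block_dim[0] * block_dim[1]   # 128 threads
warps_per_block = threads_per_block // WARP_SIZE  # 4 warps

# If the SM dispatches 32 threads (one warp) per clock cycle, issuing all
# of one block's warps takes 4 cycles, matching the speculation that 128
# active threads across 4 warps pair well with this kernel.
cycles_to_issue_block = warps_per_block

print(threads_per_block, warps_per_block, cycles_to_issue_block)
```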
Links mentioned:
- torch-cublas-hgemm/src/simt_hgemv.cu at master Ā· aredden/torch-cublas-hgemm: PyTorch half precision gemm lib w/ fused optional bias + optional relu/gelu - aredden/torch-cublas-hgemm
- zippy_gemv_hqq_gen.py: GitHub Gist: instantly share code, notes, and snippets.
CUDA MODE ▷ #torch (2 messages):
- Searching for the F.Linear Implementation: A member is working on a custom backward function that performs correctly with a (bs, data_dim) input, similar to F.Linear. They encountered issues when integrating with Llama due to input dimension differences and are now seeking the forward/backward implementation of `F.Linear`, which was elusive in the indicated tools/autograd/templates/python_nn_functions.cpp.
CUDA MODE ▷ #cool-links (2 messages):
- Augmend Launches Video Processing Tool: Augmend offers a work-in-progress feature on wip.augmend.us for analyzing videos, with a smart addition of OCR and image segmentation to extract information directly from video screens. The completed service will be available on augmend.com, allowing users to copy/paste and search content within any video.
- Boston Dynamics Reveals Electric Atlas Robot: Boston Dynamics released a YouTube video on a next-generation humanoid robot named Atlas; the All New Atlas | Boston Dynamics video presents a fully electric robot aimed at real-world applications and highlights advances over decades of robotic development.
Link mentioned: All New Atlas | Boston Dynamics: We are unveiling the next generation of humanoid robots: a fully electric Atlas robot designed for real-world applications. The new Atlas builds on decades of...
CUDA MODE ▷ #beginner (43 messages🔥):
- Newcomer Inquiry on PMPP Lectures: A newcomer inquired about the routine meeting schedule for going through PMPP lectures. Recorded lectures can be found in a specific channel, with the last covered chapter being the 10th.
- WSL Profiling Troubles: A user expressed difficulty running the ncu profiler on WSL, suspecting a PATH issue, and highlighted that Nsight Compute on Windows was conflicting with WSL. Despite having nsight-compute installed, the `ncu` command was not found.
- CUDA Toolkit PATH Adjustment Suggestions: Users suggested several troubleshooting steps, focusing on adding the correct CUDA path to the environment variables. One user provided a link to NVIDIA's documentation to assist with setting environment variables on Windows.
- Version Mismatch Discovered: A version mismatch was discovered: the user's environment was configured for CUDA 12.4 while attempting to run `ncu` from CUDA 11.5. Adding the path didn't immediately resolve the issue.
- Windows 11 Recommended for WSL 2 Profiling: Another user mentioned needing Windows 11 to profile CUDA programs on WSL 2 effectively, sharing a helpful blog post detailing how to set up the system and resolve common issues.
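The `ncu` shadowing problem above boils down to PATH resolution order: the first directory containing the binary wins. A minimal sketch of that lookup, with hypothetical directory names standing in for the real filesystem (which `shutil.which` would consult in practice):

```python
def resolve(tool, path_dirs, installed):
    """Return the first directory on PATH that provides `tool`.

    `installed` maps directory -> set of tool names it contains
    (a stand-in for checking the real filesystem).
    """
    for d in path_dirs:
        if tool in installed.get(d, set()):
            return d
    return None

# Hypothetical layout matching the reported mismatch: CUDA 11.5's bin dir
# precedes CUDA 12.4's on PATH, so its `ncu` shadows the newer one.
installed = {
    "/usr/local/cuda-11.5/bin": {"ncu", "nvcc"},
    "/usr/local/cuda-12.4/bin": {"ncu", "nvcc"},
}
path = ["/usr/local/cuda-11.5/bin", "/usr/local/cuda-12.4/bin"]
print(resolve("ncu", path, installed))        # the 11.5 directory wins

# Fix: put the intended toolkit's bin directory first on PATH.
path_fixed = ["/usr/local/cuda-12.4/bin", "/usr/local/cuda-11.5/bin"]
print(resolve("ncu", path_fixed, installed))  # now 12.4 resolves first
```

This is why merely adding the 12.4 path "didn't immediately resolve the issue": appending it after the stale entry leaves the old `ncu` first in line.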
Links mentioned:
- Environment Variables: no description found
- Profiling CUDA programs on WSL 2: no description found
CUDA MODE ▷ #youtube-recordings (1 message):
marksaroufim: https://www.youtube.com/watch?v=DdTsX6DQk24
CUDA MODE ▷ #ring-attention (5 messages):
- RingAttention Working Group Conundrum: A key member revealed that they cannot commit to working on the RingAttention project alongside their main job due to time constraints. They proposed a discussion to decide whether others will continue the initiative or temporarily conclude this working-group effort.
- Decisive Discussion Scheduled: A meeting was scheduled to discuss the future of the RingAttention project and who might continue its development.
- A Time for Difficult Choices: The member expressed regret over their decision to step back from RingAttention, emphasizing that the choice was made with heavy consideration of personal time and well-being.
- Participants Ready for the Talk: Team members confirmed their availability and showed readiness to join the forthcoming discussion about the future of RingAttention.
- Pre-Meeting Preparations: One of the members notified others that they would join the meeting shortly, indicating active preparation for the scheduled discussion.
CUDA MODE ▷ #hqq (36 messages🔥):
- Quandaries About Quantization Axes: Quantizing with `axis=0` for GPT's Q, K, V was found problematic in `gpt-fast` due to mixing of parameters during quantization. An ongoing discussion suggests quantizing Q, K, and V separately might be a solution, noting that `weight_int4pack_mm` currently only supports `axis=1`.
- Speed Versus Quality Compromises in HQQ: The trade-offs between speed and quality when using `axis=0` or `axis=1` in Half-Quadratic Quantization (HQQ) were explored. A member reported equivalent performance of 5.375 perplexity for both axes on `gpt-fast`.
- Pursuing Further Optimizations: Triton kernels and other methods, such as fitting on fake data, were mentioned as ways to optimize performance along `axis=1`. It was noted that a method using autograd and randomly generated data gave slightly better results (5.3311 ppl) than HQQ with more iterations.
- Exploring Extended Capabilities and Demystifying Differences: Insights into the potential impact of in-channel variation on weight quantization accuracy were shared, with quantization along `axis=0` appearing to yield better results. The conversation indicated that HQQ effectively finds optimal solutions faster compared to lengthy autograd optimization.
- Implementational Details and Benchmarks Shared: Links were provided to implementation details, such as a torch int4mm demo with transformers, as well as the optimizer code using autograd; discussions centered around potentially speeding up operations further with vectorized fp16 multiplication and the practicality of lower-precision quantization like 2/3 bits.
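The axis debate above is about which dimension of the weight matrix shares a (scale, zero-point) pair. A minimal group-wise sketch in NumPy shows why in-channel variation matters; axis conventions differ between libraries, so this is for intuition only and is not the HQQ or int4mm implementation:

```python
import numpy as np

def quantize_dequant(W: np.ndarray, axis: int, bits: int = 4) -> np.ndarray:
    """Affine-quantize W with one (scale, zero) pair per slice along `axis`:
    axis=0 -> one pair per row, axis=1 -> one pair per column."""
    levels = 2 ** bits - 1
    lo = W.min(axis=1 - axis, keepdims=True)  # stats shared across the other axis
    hi = W.max(axis=1 - axis, keepdims=True)
    scale = (hi - lo) / levels
    q = np.round((W - lo) / scale)
    return q * scale + lo

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 256))
W[:, 0] *= 10  # one input channel with much larger magnitude

err0 = float(np.mean((W - quantize_dequant(W, axis=0)) ** 2))
err1 = float(np.mean((W - quantize_dequant(W, axis=1)) ** 2))
print(f"axis=0 MSE {err0:.4f}, axis=1 MSE {err1:.4f}")
```

With an outlier input channel, per-column grouping isolates the outlier's range to one (scale, zero) pair, while per-row grouping inflates every row's range, which is one way in-channel variation can favor one axis over the other.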
Links mentioned:
- zhxch (zhongxiaochao): no description found
- hqq/hqq/core/quantize.py at 63cc6c0bbb33da9a42c330ae59b509c75ac2ce15 Ā· mobiusml/hqq: Official implementation of Half-Quadratic Quantization (HQQ) - mobiusml/hqq
- zhxchen17/scratch at main: no description found
- hqq/hqq/kernels/hqq_aten_cuda_kernel.cu at master Ā· mobiusml/hqq: Official implementation of Half-Quadratic Quantization (HQQ) - mobiusml/hqq
- GitHub - wangsiping97/FastGEMV: High-speed GEMV kernels, at most 2.7x speedup compared to pytorch baseline.: High-speed GEMV kernels, at most 2.7x speedup compared to pytorch baseline. - GitHub - wangsiping97/FastGEMV: High-speed GEMV kernels, at most 2.7x speedup compared to pytorch baseline.
- hqq/examples/backends/torchao_int4_demo.py at master Ā· mobiusml/hqq: Official implementation of Half-Quadratic Quantization (HQQ) - mobiusml/hqq
- hqq/hqq/core/optimize.py at master Ā· mobiusml/hqq: Official implementation of Half-Quadratic Quantization (HQQ) - mobiusml/hqq
CUDA MODE ▷ #llmdotc (76 messages🔥🔥):
- Thunder's CUDA Python Extension Takes Flight: The GitHub notebook for extending PyTorch with CUDA Python receives attention for improving speed, though the integration into cuda-mode and further optimizations such as leveraging tensor cores are still needed for maximum performance.
- Optimizing Multiplication in Transformers: Members identified the final matmul layer and softmax as significant contributors to computational cost in profiling efforts. An optimised classifier kernel presents an opportunity for improving speed, as seen in the conversation about caching strategy and kernel optimization.
- Increasing Efficiency of Softmax and Backpropagation: There was discussion about avoiding the materialization of the full probability matrix, focusing instead on necessary token probabilities. A GitHub pull request #117 demonstrates efforts to fuse operations in the classification layer.
- Cache Utilization and Performance Correlation: The effect of block sizes on cache hit rates was discussed, revealing that larger blocks may result in better cache utilization. This insight, embodied in an optimised CUDA kernel, might lead to better performance on GPUs with sufficient cache.
- Supporting Diverse Model Architectures for Benchmarking: It was suggested to consider initializing a variety of GPT model architectures for benchmarking to prevent overfitting optimizations to a single model type. An emphasis was placed on accurately reproducing models like GPT-2 to evaluate performance enhancements meaningfully.
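The "avoid materializing the full probability matrix" point can be seen directly in the cross-entropy math: the loss needs only the target logit and a log-sum-exp over the row, and the gradient is softmax minus one-hot, which can be formed and consumed row by row. A NumPy sketch of the idea, not the llm.c kernel itself:

```python
import numpy as np

def fused_xent_row(logits: np.ndarray, target: int):
    """Loss and gradient for one row without keeping probabilities around."""
    m = logits.max()                       # shift for numerical stability
    lse = m + np.log(np.exp(logits - m).sum())
    loss = lse - logits[target]            # = -log softmax(logits)[target]
    grad = np.exp(logits - lse)            # softmax, computed once...
    grad[target] -= 1.0                    # ...then fused with the backward
    return loss, grad

rng = np.random.default_rng(0)
logits = rng.normal(size=50_257)           # GPT-2 vocabulary size
loss, grad = fused_xent_row(logits, target=42)
print(round(float(loss), 3), grad.shape)
```

Fusing forward and backward this way is the gist of the "forward/backward for roughly the cost of the forward" claim in the referenced PR: the expensive row reduction is done once and reused.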
Links mentioned:
- lightning-thunder/notebooks/extend_thunder_with_cuda_python.ipynb at main Ā· Lightning-AI/lightning-thunder: Make PyTorch models up to 40% faster! Thunder is a source to source compiler for PyTorch. It enables using different hardware executors at once; across one or thousands of GPUs. - Lightning-AI/ligh...
- cutlass/media/docs/quickstart.md at main Ā· NVIDIA/cutlass: CUDA Templates for Linear Algebra Subroutines. Contribute to NVIDIA/cutlass development by creating an account on GitHub.
- Optimised version of fused classifier + bugfixes(?) by ademeure Ā· Pull Request #150 Ā· karpathy/llm.c: This is a faster version of the cool new kernel from #117 (still /dev/cuda/ only). The biggest difference is it is optimised for doing one row per 1024-wide block rather than per 32-wide warp, whic...
- WIP: Fully fused classification layer by ngc92 Ā· Pull Request #117 Ā· karpathy/llm.c: This fuses together all the pointwise operations that happen in the token classification layer. This essentially gives us the forward/backward for the cost of about just the forward pass, because t...
- SlimPajama: A 627B token, cleaned and deduplicated version of RedPajama - Cerebras: Cerebras has built a platform for push-button training of large language models that can accelerate time to insights without having to orchestrate across a large cluster of small devices.
- The MiniPile Challenge for Data-Efficient Language Models: The ever-growing diversity of pre-training text corpora has equipped language models with generalization capabilities across various downstream tasks. However, such diverse datasets are often too larg...
- GitHub - tysam-code/hlb-gpt: Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wikitext-103 on a single A100 in <100 seconds. Scales to larger models with one parameter change (feature currently in alpha).: Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wikitext-103 on a single A100 in <100 secon...
CUDA MODE ▷ #massively-parallel-crew (14 messages🔥):
- Tablet Triumph for Presentations: A member pondered the possibility of using an iPad to switch between slides and live writing for presentations. The consensus suggested using a single device for both tasks and emphasized the importance of testing the setup beforehand to ensure a smooth experience.
- No to NSFW: After incidents of inappropriate content being posted in the chat, members discussed implementing a Discord bot to detect and prevent such content from being shared, with suggestions of banning offenders or restricting their typing privileges.
- Event Creation Empowerment: It's been announced that everyone now has the roles and privileges to create new events on the server. This change empowers members to organize their own gatherings and discussions.
- Interjections and Interactions: Casual interactions among members included humorous suggestions for names like "Massively Helpful" and playing with the word "parallel" in the context of the server name. These moments reflect the lighter side of the community's interactions.
- Tech Tips Shared: Helpful advice was given for someone wishing to stream presentations, including using a Wacom tablet and maintaining audience engagement through different setups. The importance of testing the setup early was highlighted once again.
OpenAI ▷ #ai-discussions (167 messages🔥🔥):
- Gaming Assistant Development Inquiry: A user sought advice on creating a gaming assistant combining GPT-Vision, camera input, and probabilistic calculations for real-time multiple-choice games. Using Azure or a virtual machine to run demanding calculation software was suggested, with TensorFlow or OpenCV as possible tools to manage the system.
- AI vs. Human Cognition Debate: The channel hosted a philosophical discussion on the fundamental differences between AI and humans, touching on concepts such as memory storage, computational power, and the potential for AI to develop human-like reasoning and emotions with advancements like quantum computing.
- Understanding Non-Binary Thinking: There was an extensive debate on binary versus non-binary thinking, with users discussing the applicability of binary thinking and labels in humans and AI, and how gradients and chaos theory might present a more accurate model of cognition and decision-making.
- Claude's Superiority for Literature Reviews: Users exchanged opinions on suitable AI models for writing literature reviews, with advice given to use Claude over OpenAI for non-technical literary tasks, and Gemini 1.5 mentioned for aiding in writing fictional works.
- Navigating AI-Related Complications: Participants reported and discussed issues such as unexpected account terminations and policy violations, highlighting challenges in understanding and adhering to the usage policies of AI platforms, and expressing concerns about the lack of clarity and support often encountered.
Links mentioned:
- Disneyland Paris GIF - Disneyland Paris Parks - Discover & Share GIFs: Click to view the GIF
- Turing completeness - Wikipedia: no description found
OpenAI ▷ #gpt-4-discussions (7 messages):
- GPT Gets a Trim?: A user remarked that it seems like GPT was significantly altered or "lobotomised," while another defended the new GPT-4 Turbo as being effective, mentioning alternate endpoints to use.
- Important to Report Flaws: One member encouraged others to report any problematic messages from GPT to improve its performance.
- Discussing Alternatives Due to Costs: A user shared that they are using Gemini 1.5 with a 1 million token context window on Google Studio as an alternative, implying costs are a factor.
- Seeking Knowledge Base Training: Someone asked for directions to trainings or resources on how to prepare a knowledge base for a custom GPT.
- Whispering for Whisper v3 API Access: A query was raised about when Whisper v3 would become available through the API, noting that it has been almost a year since its release.
- Shrinking Token Attention Span?: A user observed that GPT-4's ability to remember past inputs seems impaired, speculating that the effective token limit may have been reduced from beyond 30,000.
OpenAI ▷ #prompt-engineering (5 messages):
- Echoes in the Ghost Town: One member laments the decline of activity in the prompt-engineering channel, attributing the lack of discussion to over-moderation by administrators and mods.
- Salty Retrospection: A user suggests their extended timeout from the server may be related to a decline in activity, and believes others may have faced similar penalties.
- GPT-4-Turbo's Math Prowess: GPT-4-TURBO successfully solved a math problem regarding the number of possible seating arrangements for the Smith family at their dinner table.
OpenAI ▷ #api-discussions (5 messages):
- Silence in the OpenAI Discord: One member expressed dismay at the lack of recent activity within the api-discussions channel, noting it has been quiet for weeks.
- Reflections on Server Moderation: The same member attributed the inactivity to what they perceived as over-moderation by the server's administrators.
- Post-Timeout Frustrations: Following a 5-month timeout from the server, the member lamented that they were punished for attempting to assist another user.
- GPT-4-Turbo's Mathematical Prowess: A user reported that GPT-4-TURBO correctly solved a combinatorial math problem involving the seating arrangements of the Smith family at a dinner table.
LlamaIndex ▷ #blog (3 messages):
-
Qdrant Hybrid Cloud Offering Launch: The @qdrant_engine has launched a hybrid cloud offering, enabling running Qdrant as a hosted service, at the edge, or in one's own environment while maintaining full data control. The announcement also linked to an in-depth tutorial on setting it up.
-
LlamaIndex Teams Up with Azure AI Search: A tutorial presented by Khye Wei from Microsoft demonstrates how to combine LlamaIndex with Azure AI Search to create enhanced RAG applications that feature Hybrid Search and Query rewriting.
-
Day 0 Support for MistralAI's Latest Model: MistralAI's new 8x22b model, described as defining the state of the art in open models, is supported by LlamaIndex from day one. The release includes a Mistral cookbook by @ravithejads, showcasing RAG, Query routing, and Tool use.
Link mentioned: MistralAI Cookbook - LlamaIndex: no description found
LlamaIndex ▷ #general (164 messages🔥🔥):
-
Inquiry About Building a Search Engine: Users discussed how to build a search engine using LlamaIndex. One user provided a starter tutorial and highlighted the use of `retriever` with a higher `top_k` value to retrieve top documents.
-
Understanding LLM Retrieval Limits: A user clarified they needed to retrieve document names instead of answers from agents, comparing it to Perplexity-style functionality. The conversation continued with users referencing LlamaIndex's `retriever` and its settings.
-
Issues With Authentication: Several users encountered and discussed errors related to API authentication. The error messages indicated incorrect API keys, leading to troubleshooting around environment variables and correct key usage.
-
LlamaIndex Updates And Issue Fixing: Users collaboratively tried to resolve various issues, with a specific focus on a `BaseComponent` error which one user couldn't resolve despite trying numerous troubleshooting steps. A solution was suggested in the form of a GitHub pull request.
-
LLM Query Logging and Active Model Check: Discussion on logging within LlamaIndex led to advice on adjusting logging levels from `DEBUG` to `INFO`. A user sought to confirm which LLM was active for a query and was advised on checking and setting the LLM through the `Settings.llm` attribute.
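The logging adjustment mentioned above uses Python's standard logging module; a minimal sketch follows (the "llama_index" logger name is the conventional one for the library, but verify it against your installed version — the `Settings.llm` usage is shown only in comments since it requires the package and an API key):

```python
import logging
import sys

# LlamaIndex, like most Python libraries, logs through the stdlib logging
# module. Raising the level from DEBUG to INFO suppresses verbose per-query
# traces while keeping high-level events visible.
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger("llama_index").setLevel(logging.INFO)

# Checking or setting the active LLM goes through Settings.llm in recent
# LlamaIndex versions, e.g.:
#   from llama_index.core import Settings
#   print(Settings.llm)        # inspect which model is active
#   Settings.llm = some_llm    # set it globally for subsequent queries
```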
Links mentioned:
- Google Colaboratory: no description found
- Starter Tutorial (OpenAI) - LlamaIndex: no description found
- Langfuse Callback Handler - LlamaIndex: no description found
- Openai like - LlamaIndex: no description found
- Finetune Embeddings - LlamaIndex: no description found
- create_llama_projects/nextjs-edge-llamaparse at main Ā· run-llama/create_llama_projects: Contribute to run-llama/create_llama_projects development by creating an account on GitHub.
- Multi-Document Agents - LlamaIndex: no description found
- Answer Relevancy and Context Relevancy Evaluations - LlamaIndex: no description found
- Catch validation errors by logan-markewich Ā· Pull Request #12882 Ā· run-llama/llama_index: Some people are experiencing some weird errors here. Lets just catch validation errors to prevent incompatible package versions from crashing core
- LlamaCPP - LlamaIndex: no description found
- Openapi - LlamaIndex: no description found
- Q&A patterns - LlamaIndex: no description found
- Document Summary Index - LlamaIndex: no description found
- Pydantic Tree Summarize - LlamaIndex: no description found
- Index - LlamaIndex: no description found
- Llm - LlamaIndex: no description found
LlamaIndex ▷ #ai-discussion (2 messages):
- Seeking Hierarchical Structure Wisdom: A member is looking to construct a parent-child hierarchical structure within LlamaIndex, akin to LangChain's ParentDocumentRetriever, for a vast number of documents and is requesting guidance.
Eleuther ▷ #general (58 messages🔥🔥):
-
Pile-T5 Details Sought: A user requested details about the Pile-T5 model on EleutherAI's Discord, pointing to the Hugging Face collection page for further information. The discussion clarified that "sequence length" and "context window" are the same, while noting the scarcity of encoder/decoder models with long sequence lengths.
-
Rekaās Long Enc-Dec Model Revealed: In discussing model sequence lengths, a user mentioned Rekaās new encoder-decoder model, which supports up to 128k, as described in their core tech report.
-
EleutherAI's Model Evaluation Harness Discussed: The ARC-challenge on EleutherAI's Evaluation Harness was debated, with concerns about the absence of "choices" in the query presented to models. It was mentioned that the library initially aimed to replicate plots from the GPT-3 paper, with intentions to standardize MCQA tasks by offering multiple prompting options.
-
Research Scientist Interview Insights: Users shared insights on research scientist interviews, explaining that the focus can vary greatly depending on the company, ranging from little emphasis on traditional data structure and algorithm questions to heavy consideration of the candidate's talk, papers, and potential for grant acquisition.
-
Sequence Packing vs. Prepacking in LLMs: A discussion emerged about whether "prepacking" is just regular sequence packing, as mentioned in a new research paper. This led to a debate about the novelty and prior documentation of these methods, with references to the T5 paper and upcoming publications addressing these and related methods for model evaluation and efficiency.
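For reference, plain sequence packing (the baseline the thread compared prepacking against) simply concatenates token streams into fixed-length blocks. A toy sketch, leaving out details that real implementations handle (EOS separators, and attention masks that stop tokens attending across document boundaries):

```python
def pack_sequences(seqs, block_size, pad_id=0):
    """Greedily concatenate token sequences into fixed-size blocks.

    Sequences longer than block_size spill into the next block; only the
    final block is padded, which is what makes packing more compute-efficient
    than padding every sequence to block_size individually.
    """
    blocks, current = [], []
    for seq in seqs:
        for tok in seq:
            current.append(tok)
            if len(current) == block_size:
                blocks.append(current)
                current = []
    if current:
        current += [pad_id] * (block_size - len(current))
        blocks.append(current)
    return blocks

packed = pack_sequences([[1, 2, 3], [4, 5], [6, 7, 8, 9]], block_size=4)
# -> [[1, 2, 3, 4], [5, 6, 7, 8], [9, 0, 0, 0]]
```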
Links mentioned:
- Tweet from Siyan Zhao (@siyan_zhao): 🚨LLM RESEARCHERS🚨Want a free boost in speed and memory efficiency for your HuggingFace🤗LLM with ZERO degradation in generation quality? Introducing Prepacking, a simple method to obtain up to 6x sp...
- Pile-T5 - a EleutherAI Collection: no description found
- Tweet from Sasha Rush (@srush_nlp): Lazy twitter: A common question in NLP class is "if xBERT worked well, why didn't people make it bigger?" but I realize I just don't know the answer. I assume people tried but that a l...
- lintang/pile-t5-base-flan Ā· Hugging Face: no description found
- lintang (Lintang Sutawika): no description found
- lm-evaluation-harness/lm_eval/tasks/arc.py at b281b0921b636bc36ad05c0b0b0763bd6dd43463 Ā· EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of language models. - EleutherAI/lm-evaluation-harness
- lm-evaluation-harness/lm_eval/tasks/hendrycks_test.py at b281b0921b636bc36ad05c0b0b0763bd6dd43463 Ā· EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of language models. - EleutherAI/lm-evaluation-harness
- Models - Hugging Face: no description found
Eleuther ▷ #research (78 messages🔥🔥):
-
New Transformer Architecture for Long Inputs: A recent proposal for a novel Transformer architecture named Feedback Attention Memory (FAM) aims to enable processing of indefinitely long sequences by allowing the network to attend to its own latent representations, thus overcoming the quadratic attention complexity. FAM's performance showed significant improvement on long-context tasks.
-
Advances in Brain Decoding Research: The paper MindBridge introduces a new approach that allows for cross-subject brain decoding by employing only one model, addressing three main challenges in the field: variability in brain sizes, individual neural pattern differences, and limited data for new subjects.
-
Rethinking Scaling Laws' Accuracy: Discrepancies pointed out in the compute-optimal scaling laws presented by Hoffmann et al. (2022) highlight the importance of data transparency, as a new analysis suggests that the original narrow confidence intervals were implausible unless an extensive number of experiments were conducted.
-
Expressive Power of State-Space Models: A discussion was prompted by the analysis of State-Space Models (SSMs), revealing that their expressive power for state tracking is very similar to that of transformers, and that SSMs cannot express computation beyond the complexity class $\mathsf{TC}^0$. The dialogue also touched upon clarifications and potential misunderstandings from prior related works.
-
Transformers, RL, and EEG Feedback: Conversations touched on the concept of using Reinforcement Learning (RL) with feedback from an EEG but found limited academic research, primarily existing product implementations; the complexities and risks associated with such undertakings were also noted.
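The scaling-law replication debate above centers on Hoffmann et al.'s third estimation method: fitting the parametric loss L(N, D) = E + A/N^α + B/D^β to training runs. A sketch using the coefficient estimates reported in the Chinchilla paper (E ≈ 1.69, A ≈ 406.4, B ≈ 410.7, α ≈ 0.34, β ≈ 0.28 — the very fit the replication attempt questions):

```python
def chinchilla_loss(n_params: float, n_tokens: float,
                    E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28) -> float:
    """Parametric loss L(N, D) = E + A/N^alpha + B/D^beta from
    Hoffmann et al. (2022). E is the irreducible loss; the other two
    terms capture finite-model-size and finite-data error respectively."""
    return E + A / n_params**alpha + B / n_tokens**beta

# Loss decreases as either parameters (N) or training tokens (D) grow,
# and approaches E in the limit:
loss_small = chinchilla_loss(1e9, 20e9)      # 1B params, 20B tokens
loss_big = chinchilla_loss(70e9, 1.4e12)     # roughly Chinchilla's scale
```

The replication dispute is not about this functional form but about how the fit was performed and how tight the resulting confidence intervals on A, B, α, β could plausibly be.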
Links mentioned:
- The Illusion of State in State-Space Models: State-space models (SSMs) have emerged as a potential alternative architecture for building large language models (LLMs) compared to the previously ubiquitous transformer architecture. One theoretical...
- Chinchilla Scaling: A replication attempt: Hoffmann et al. (2022) propose three methods for estimating a compute-optimal scaling law. We attempt to replicate their third estimation procedure, which involves fitting a parametric loss function t...
- Self-playing Adversarial Language Game Enhances LLM Reasoning: We explore the self-play training procedure of large language models (LLMs) in a two-player adversarial language game called Adversarial Taboo. In this game, an attacker and a defender communicate wit...
- TransformerFAM: Feedback attention is working memory: While Transformers have revolutionized deep learning, their quadratic attention complexity hinders their ability to process infinitely long inputs. We propose Feedback Attention Memory (FAM), a novel ...
- VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time: We introduce VASA, a framework for generating lifelike talking faces with appealing visual affective skills (VAS) given a single static image and a speech audio clip. Our premiere model, VASA-1, is ca...
- ReFT: Representation Finetuning for Language Models: Parameter-efficient fine-tuning (PEFT) methods seek to adapt large models via updates to a small number of weights. However, much prior interpretability work has shown that representations encode rich...
- Scaling Instructable Agents Across Many Simulated Worlds: Building embodied AI systems that can follow arbitrary language instructions in any 3D environment is a key challenge for creating general AI. Accomplishing this goal requires learning to ground langu...
- Tweet from Will Merrill (@lambdaviking): [1/n] How does a chain of thought change the expressive power of transformers? New work w/ @Ashish_S_AI studies how adding CoT/decoding steps extends the problems solvable by transformers as a fn of ...
- MindBridge: A Cross-Subject Brain Decoding Framework: Brain decoding, a pivotal field in neuroscience, aims to reconstruct stimuli from acquired brain signals, primarily utilizing functional magnetic resonance imaging (fMRI). Currently, brain decoding is...
- Finetuning Pretrained Transformers into RNNs: Transformers have outperformed recurrent neural networks (RNNs) in natural language generation. But this comes with a significant computational cost, as the attention mechanism's complexity scales...
- Transformers Represent Belief State Geometry in their Residual Stream - LessWrong: Produced while being an affiliate at PIBBSS[1]. The work was done initially with funding from a Lightspeed Grant, and then continued while at PIBBSS...
Eleuther ▷ #scaling-laws (5 messages):
-
FLOPs Estimation for ML Newcomers: A member sought advice on estimating training FLOPs from the SoundStream paper and was guided to calculate the number of operations per token for both forward and backward passes, using the approximation of 6 × (number of parameters) FLOPs per token for decoder-only transformers. They were referred to a detailed example in Section 2.1 of a relevant paper.
-
One Epoch Assumption in Cost Estimation: In response to a question about training cost estimation, one member clarified that it's wise to assume a single dataset pass unless a paper explicitly mentions performing multiple epochs.
-
Mystery of Unreported Dataset Size: One member highlighted the difficulty in estimating training cost from a paper, like the SoundStream paper, when details like the size of the training dataset are not disclosed. This poses a challenge in computing accurate cost estimates.
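The rule of thumb from the thread reduces to a one-line estimate: roughly 6 FLOPs per parameter per token (about 2 for the forward pass and 4 for the backward pass), multiplied by the number of tokens seen. A sketch, with the single-epoch assumption discussed above baked in as the default:

```python
def training_flops(n_params: float, n_tokens: float, epochs: int = 1) -> float:
    """Approximate training compute for a decoder-only transformer:
    ~6 FLOPs per parameter per token (2 forward + 4 backward),
    assuming one pass over the dataset unless epochs says otherwise."""
    return 6 * n_params * n_tokens * epochs

# e.g. a 1B-parameter model trained on 20B tokens:
flops = training_flops(1e9, 20e9)   # 1.2e20 FLOPs
```

This is also why the unreported dataset size mentioned above is a blocker: without D (tokens), the estimate has a free variable.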
Eleuther ▷ #lm-thunderdome (21 messages🔥):
-
Clarifications on Model Evaluation: There was a discussion on how to use `lm-evaluation-harness` for evaluating custom models, specifically for the `arc_easy` task, clarifying that one should return a pair (log-likelihood, is_greedy_decoding_equal_target) from `loglikelihood`. It was noted that for tasks like ARC, where there are multiple choices, the likelihood of each combination of question and answer is evaluated, and the one with the highest likelihood is deemed the correct answer.
-
Understanding BPC as a Metric: A paper was discussed that correlates models' intelligence with their ability to compress text, using BPC (bits per character) as a proxy for intelligence. The benefits of considering BPC over loss were debated, with the conclusion that BPC is a unit of information rather than just loss, which aligns it more closely with compression capabilities.
-
Branch Comparisons and Evaluations: There was an inquiry about the improvements in the `big-refactor` branch over the main branch of the project, which apparently offers significantly better speed. Another user wondered about saving generation results per question when using `vllm` and learned that the `--log_samples` flag allows logging individual responses rather than just aggregate scores.
-
Leveraging Acceleration Tools for Better Performance: It was suggested that using the `--batch_size` argument or `accelerate launch --no_python lm-eval` could be beneficial when evaluating large models, especially on a pod of 8 A100s, to potentially improve speed and performance.
-
Assistance with Model Evaluation Methods: One user was puzzled that the `arc_easy` task always resulted in 0.25 performance when returning random debug values, and learned that since ARC has four possible answers, random selection yields a roughly 25% correctness rate. It was explained how tasks like MMLU and lambada_openai use the loglikelihood outputs differently to calculate accuracy.
Link mentioned: Tweet from Aran Komatsuzaki (@arankomatsuzaki): Compression Represents Intelligence Linearly LLMs' intelligence - reflected by average benchmark scores - almost linearly correlates with their ability to compress external text corpora repo: ht...
Eleuther ▷ #multimodal-general (1 messages):
-
Exploring Multi-Modal Learning: jubei_ shared two papers on arXiv regarding multi-modal machine learning. The first paper proposes an information-theoretic approach named Total Correlation Gain Maximization (TCGM) for semi-supervised multi-modal learning that effectively utilizes unlabeled data across modalities and offers theoretical guarantees.
-
Dive into Semi-Supervised Multi-Modal Fusion: The discussed paper addresses the challenges of labeling large datasets for multi-modal training, and emphasizes an approach that could improve the efficiency of fusion in semi-supervised settings. Abstract excerpts offer insights into the promise of the TCGM method for identifying Bayesian classifiers in multi-modal learning scenarios.
Links mentioned:
- Quantifying & Modeling Multimodal Interactions: An Information Decomposition Framework: The recent explosion of interest in multimodal applications has resulted in a wide selection of datasets and methods for representing and integrating information from different modalities. Despite the...
- TCGM: An Information-Theoretic Framework for Semi-Supervised Multi-Modality Learning: Fusing data from multiple modalities provides more information to train machine learning systems. However, it is prohibitively expensive and time-consuming to label each modality with a large amount o...
HuggingFace ▷ #announcements (10 messages🔥):
-
IDEFICS-2 Premieres with Superior Multimodal Abilities: IDEFICS-2 is unveiled, touting 8B parameters, Apache 2.0 license, high-resolution image processing up to 980 x 980, and two checkpoints including instruction fine-tuning. This multimodal model excels in tasks such as visual question answering and document retrieval.
-
Chatbot Variant of IDEFICS-2 on the Horizon: The chat-focused variant of IDEFICS-2 is expected to be released in the coming days. The current version is adept in visual question answering and other non-chat tasks, with a chatty version soon to follow.
-
Clever Multimodal Interaction Showcased: An example shared demonstrates IDEFICS-2's capabilities, seamlessly blending text recognition, color knowledge, and mathematical operations to interpret and manipulate image contents, including solving CAPTCHAs with significant background noise.
Links mentioned:
- Idefics 8b - a Hugging Face Space by HuggingFaceM4: no description found
- Tweet from lunarflu (@lunarflu1): cool multimodal interaction from IDEFICS-2 @huggingface : 1. Detect numbers from image 2. Do math with the number 3. Retrieve background color 4. Remove pigment -> Resulting color 5. Final result: ...
- Introducing Idefics2: A Powerful 8B Vision-Language Model for the community: no description found
- HuggingFaceM4/idefics2-8b Ā· Hugging Face: no description found
- Tweet from Vaibhav (VB) Srivastav (@reach_vb): Idefics 2 x Transformers! 🔥 Trying out the Idefics 2 8B in the wild. Pretty wild that you can do all this in less than 10 lines of code! Made a quick screencast taking the model out for a spin.. ...
- HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1 Ā· Hugging Face: no description found
- argilla/distilabel-capybara-dpo-7k-binarized Ā· Datasets at Hugging Face: no description found
- Paper page - ORPO: Monolithic Preference Optimization without Reference Model: no description found
- alignment-handbook/recipes/zephyr-141b-A35b at main Ā· huggingface/alignment-handbook: Robust recipes to align language models with human and AI preferences - huggingface/alignment-handbook
- Tweet from Nicolas Patry (@narsilou): Tgi 2.0 is out! -back to fully open source for good (apache 2.0) - Fastest inference server in existence (110 tok/s for cohere R+, with medusa speculation) - fp8 support - mixtral 8x22b support ! (al...
- Tweet from Xenova (@xenovacom): Introducing MusicGen Web: AI-powered music generation directly in your browser, built with 🤗 Transformers.js! 🎵 Everything runs 100% locally, meaning no calls to an API! 🤯 Served as a static websi...
- Tweet from Andrew Ng (@AndrewYNg): LLMs can take gigabytes of memory to store, which limits what can be run on consumer hardware. But quantization can dramatically compress models, making a wider selection of models available to develo...
- Vision Language Models Explained: no description found
HuggingFace ▷ #general (85 messages🔥🔥):
-
Langchain Learning Inquiry: A participant expressed an interest in learning langchain to build an agentic LLM, but received advice from another member suggesting that it might be more efficient to implement a custom solution.
-
Seeking ML Community Insights: A survey link was shared by students researching the democratization of ML, asking for participation from the machine learning community. The survey was accessible through this link.
-
File Conversion Hiccup: A member encountered an issue while converting HuggingFace safetensors to llama.cpp GGUF, receiving an "is not a directory" error. They were advised to ensure the path ends before the file name in the command.
-
Unsolicited Academic Abstract Spitfire Explained: A user experienced issues with llama.cpp generating unsolicited content when started in interactive mode, inadvertently outputting abstracts like "Anti-fungal properties of silver nanoparticles". The discussion moved towards seeking a solution or a correct command to make the interaction responsive to user input.
-
Exploring Decoder-only Models for SQuAD: An inquiry was made regarding how to postprocess decoder-only model outputs, like Mistral's, for SQuAD evaluation. The member was looking for inspiration from open GitHub repos for handling such a task.
Links mentioned:
- zero-gpu-explorers (ZeroGPU Explorers): no description found
- Inpainting: no description found
- The Democratisation of Machine Learning - Survey: Thank you for taking the time to answer this survey about people's experience with machine learning, it should take no more than 5 min Throughout this survey 'Machine Learning' will be referr...
- Unlock AI Agent real power?! Long term memory & Self improving: How to build Long term memory & Self improving ability into your AI Agent?Use AI Slide deck builder Gamma for free: https://gamma.app/?utm_source=youtube&utm...
HuggingFace ▷ #today-im-learning (3 messages):
-
Exploring Knowledge Graphs: A member shared a blog post discussing how to improve Chatbot performance by integrating Knowledge Graphs, providing a link to explore the concept further.
-
The Quest for Quantization Knowledge: A member is learning about quantization through a short course offered by Deep Learning AI, indicating ongoing education in machine learning optimization techniques.
-
Multilingual Text Retrieval with RAG: A member asked for tips on implementing an efficient retrieval system using Retrieval-Augmented Generation (RAG) for a multilingual set of texts, and is looking for updates or best practices in multilingual scenarios.
Link mentioned: ML Blog - Improve ChatGPT with Knowledge Graphs: Leveraging knowledge graphs for LLMs using LangChain
HuggingFace ▷ #cool-finds (7 messages🔥):
-
Splatter Art with Speed: The Splatter Image space on HuggingFace is a quick tool to generate splatter art.
-
Diving into Multi-Modal RAG: A speaker from LlamaIndex shared resources about Multi-Modal RAG (Retrieval Augmented Generation), showcasing applications that combine language and images. Discover how RAG's indexing, retrieval, and synthesis processes can integrate with the image setting in their documentation.
-
LLM User Analytics Unveiled: Nebuly introduced an LLM user analytics playground that's accessible without any login, providing a place to explore analytics tools. Feedback is requested for their platform.
-
ML Expanding into New Frontiers: The IEEE paper highlights an interesting scenario where Machine Learning (ML) can be widely applied. The paper can be found at the IEEE Xplore digital library.
-
Snowflake Introduces Top Text-Embedding Model: Snowflake launched the Arctic embed family of models, claiming to be the world's best practical text-embedding model for retrieval use cases. The family of models surpasses others in average retrieval performance and is open-sourced under an Apache 2.0 license, available on Hugging Face and soon in Snowflake's own ecosystem. Read more in their blog post.
-
Multi-Step Tools Enhancing Efficiency: An article on Medium discusses how multi-step tools developed by LangChain and Cohere can unlock efficiency improvements in various applications. The full discourse is available in the provided Medium article.
Links mentioned:
- Splatter Image - a Hugging Face Space by szymanowiczs: no description found
- Multi-Modal Applications - LlamaIndex: no description found
- Nebuly AI: no description found
- Snowflake Launches Practical Text-Embedding Model for Retrieval use Cases: Snowflake-arctic-embed is available to the open source community under an Apache 2.0 license.
HuggingFace ▷ #i-made-this (19 messages🔥):
-
BLIP Model Fine-tuned for Prompts: The BLIP model has been fine-tuned to generate long captions suitable for image prompts, with a live demo accessible on Hugging Face. Check out the enhanced capabilities here.
-
Model Comparison Made Easy: A Hugging Face Space comparing different image captioning models has been published, duplicating an existing comparison space by another user. Explore the model comparisons.
-
Support for Maximum Output Length in Serverless Inference: Queries were made about the maximum output length for model inference via curl, and it was clarified that parameters supported in transformers' pipelines can be used, including `max_new_tokens`.
-
IP-Adapter Playground Unveiled: A new Hugging Face Space featuring IP-Adapter, which allows for text-to-image, image-to-image, and inpainting functionalities using images as prompts, has been launched. Dive into the IP-Adapter Playground.
-
"Push to Hub" Added to Transformers' Pipelines: The main branch of the transformers library now includes a `push_to_hub` method, allowing pipeline outputs to be pushed directly to the Hugging Face Model Hub. Users can try this feature from the main branch or wait for the next release.
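For the `max_new_tokens` question above, a hedged sketch of how the serverless Inference API request body is typically shaped: generation parameters go under a `parameters` key alongside `inputs`. The model name and URL here are placeholders, and the full set of accepted parameters should be checked against the API docs:

```python
import json

# Placeholder model; any text-generation model on the Hub follows the same shape.
API_URL = "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.2"

def build_payload(prompt: str, max_new_tokens: int = 250) -> dict:
    """Request body for the serverless Inference API: generation options
    (as supported by transformers pipelines) live under "parameters"."""
    return {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }

payload = build_payload("Explain KV caching in one paragraph.", max_new_tokens=128)
body = json.dumps(payload)
# Sent with e.g.:
#   curl $API_URL -H "Authorization: Bearer $HF_TOKEN" -d "$body"
```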
Links mentioned:
- Grounded SAM - a Hugging Face Space by EduardoPacheco: no description found
- IP-Adapter Playground - a Hugging Face Space by tonyassi: no description found
- Nebuly AI: no description found
- Comparing Captioning Models - a Hugging Face Space by unography: no description found
- Detailed parameters: no description found
- Pipelines: no description found
HuggingFace ▷ #computer-vision (11 messages🔥):
-
Seeking an SDXL Tagger Upgrade: A member inquired about alternative taggers to the wd14 tagger for SDXL, searching for improved options.
-
Quest for PDF to LaTeX Conversion Tools: A member asked if there are any open-source PDF-to-LaTeX converters, or an image-to-LaTeX converter capable of processing an entire PDF page, including text and mathematical expressions, without requiring exact positioning.
-
LaTeX-OCR for Equation Conversion: It was pointed out that there's a good open-source repository for converting images of equations into LaTeX code: LaTeX-OCR on GitHub, which utilizes a Vision Transformer (ViT).
-
No Perfect LaTeX Conversions for Text: The conversion of text to LaTeX is complex due to LaTeX compilers and package particularities, leading to the opinion that manual rewriting may be more functional.
-
Selective Text Extraction Challenge: A user is looking for a method to extract one specific line of text from an image, based on the largest and boldest font. It was recommended to try Paddle OCR for this task.
Link mentioned: GitHub - lukas-blecher/LaTeX-OCR: pix2tex: Using a ViT to convert images of equations into LaTeX code.: pix2tex: Using a ViT to convert images of equations into LaTeX code. - lukas-blecher/LaTeX-OCR
HuggingFace ▷ #NLP (17 messages🔥):
-
LoRA Configuration Queries: A member is experimenting with their LoRA configuration and is seeking advice on the implications of setting the bias to "all", "none", or "lora_only".
-
Preparing Dataset for Fine-tuning Roberta: One member is looking for guidance on preparing a CSV dataset with over 100,000 entries and 20+ features for fine-tuning a ROBERTA model for a question-answering chatbot. Following up, they clarified that the dataset includes details about pharmaceutical drugs with diverse columns such as release date and drug type.
-
BERTopic for Topic Modeling: A member recommended BERTopic, a topic modeling technique using 🤗 transformers and c-TF-IDF, and reports satisfaction with the results, though there's a current challenge in converting seed words to phrases for creating topic models.
-
Seeking T5 Training Code with HF Trainer: A member inquires where to find training code for T5 using Hugging Face's Trainer. Another member shared a link to EleutherAI's GitHub repository with open-source scripts for an improved T5 and suggested looking into simpleT5 for a more straightforward approach.
-
Resuming Model Download in AutoModelForVision2Seq: A member questions how to resume a model download process using AutoModelForVision2Seq, but did not receive a direct response.
Links mentioned:
- Home: Leveraging BERT and a class-based TF-IDF to create easily interpretable topics.
- GitHub - EleutherAI/improved-t5: Experiments for efforts to train a new and improved t5: Experiments for efforts to train a new and improved t5 - EleutherAI/improved-t5
- GitHub - Shivanandroy/simpleT5: simpleT5 is built on top of PyTorch-lightning⚡️ and Transformers🤗 that lets you quickly train your T5 models.: simpleT5 is built on top of PyTorch-lightning⚡️ and Transformers🤗 that lets you quickly train your T5 models. - Shivanandroy/simpleT5
HuggingFace ▷ #diffusion-discussions (8 messages🔥):
- Truncated Tokens Concern: A user mentioned that truncated tokens, such as "hdr" in their prompt, are being ignored, implying a potential problem in processing. There was agreement on this issue, but no solution provided in the discussion.
- Compel Library Maintenance: In response to the truncated token problem, the Compel library was mentioned, but there is a concern that it may not currently be maintained.
- Model for Analysis and Text Generation from Video: A request for a model capable of analyzing video content to generate titles and descriptions was posed, but the discussion thread does not provide a solution.
- Solicitation for Test Method Roast: A user shared a link to a testing method/suite and requested some constructive criticism from a user perspective. The content of the test method/suite was not discussed.
- Resume Hugging Face Model Training: A user asked about the necessary code changes required to resume a Hugging Face model, but no answers have been given in the conversation.
OpenAccess AI Collective (axolotl) ▷ #general (44 messages🔥):
-
Idefics2's Grand Entrance: A brand new multimodal model, Idefics2, is available now, accepting both image and text inputs and boasting improved OCR and visual reasoning over its predecessor, Idefics1. It has been released with two checkpoints, featuring base and fine-tuned versions, and is licensed under Apache 2.0.
-
Pre-emptive Strike by NVidia?: Rumors are circulating that NVidia might expedite the launch of the RTX 5090, possibly as early as June 2024 at the Computex tradeshow, in response to competitive pressure from AMDās new advancements.
-
Hardware Conversations on AI Training: Members discussed the feasibility of using Nvidia's A6000 GPUs for training and inference with methods such as QLoRA, debating the sufficiency of VRAM and the potential requirement for more powerful setups.
-
Cosmo-1b Forgetting and Merging Experiments Revealed: In experiments to compare training methods aimed at reducing catastrophic forgetting, Model Stock merge revealed potential in combining various training solutions. The sharing of detailed comparison stats in training set validation results stirred interest in further exploring the strengths of different fine-tuning approaches.
-
Technical Dig into DoRA and QLoRA: Users engaged in a technical discussion about the effectiveness of new parameter-efficient fine-tuning (PEFT) methods like DoRA, comparing it to QLoRA, discussing configuration details, and noting peculiarities in the performance and resource consumption of each method.
Links mentioned:
- Nvidia's RTX 5090 and 5080 could arrive much sooner than expected, but there's a big catch: Leaks point to the new Nvidia Blackwell GeForce GPUs arriving much sooner than originally expected, thanks to competition from AMD.
- HuggingFaceM4/idefics2-8b · Hugging Face: no description found
OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (2 messages):
- Inquiry on Bot Utility: A user expressed curiosity with a simple "Oooooo how do I use this?", indicating interest in understanding the bot's functions.
- Spam Alert: A spam message aimed at the entire group advertised inappropriate content with a Discord invite link.
OpenAccess AI Collective (axolotl) ▷ #other-llms (1 messages):
aquash1553: @everyone Best OnlyFans Leaks & Teen Content š š discord.gg/s3xygirlsss
OpenAccess AI Collective (axolotl) ▷ #general-help (13 messages🔥):
-
Clarifying Role of "train_on_input" Flag: A discussion on the "train_on_input" parameter unfolded, revealing that disabling it means the model doesn't calculate loss for the input portion, hence no longer predicts it. This clarifies that the input still forms part of the context during training, but with the parameter off, the input does not steer the model in terms of loss calculation.
-
Understanding Loss in Training: It was highlighted that loss is indeed a crucial aspect of training as it guides model improvement, and disabling "train_on_input" stops this process for the input part. If the eval setting is not enabled, this process becomes even less relevant to the model's learning.
-
Query About Cost and OnlyFans Link: One member inquired about the cost of an unspecified service, and another user posted what seems to be a promotional message for an OnlyFans related link inviting members to join another Discord server with the promise of exclusive content.
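The `train_on_input` behavior described above is conventionally implemented by masking the prompt tokens in the labels, so the prompt stays in the context but contributes no loss. A minimal sketch with hypothetical token ids; -100 is PyTorch's cross-entropy ignore index, which Hugging Face-style trainers use for masking:

```python
# Sketch of the label masking behind "train_on_input": with the flag off,
# prompt tokens stay in the context (input_ids) but their labels are set to
# -100, PyTorch's cross-entropy ignore index, so they contribute no loss.
IGNORE_INDEX = -100

def build_labels(prompt_ids, completion_ids, train_on_input=False):
    """Return (input_ids, labels) for one training example."""
    input_ids = list(prompt_ids) + list(completion_ids)
    if train_on_input:
        labels = list(input_ids)  # loss computed on every token
    else:
        labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    return input_ids, labels

# Hypothetical token ids for a prompt and its completion:
ids, labels = build_labels([101, 2054, 2003], [1037, 3231, 102])
```

With the flag enabled, labels simply equal `input_ids` and the model is trained to predict the prompt as well.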
OpenAccess AI Collective (axolotl) ▷ #datasets (3 messages):
- Inappropriate Content Alert: The channel experienced an instance of spam advertising OnlyFans leaks and explicit content with an invite link to a Discord server.
- Community Watchdogs in Action: Members quickly identified the spam and labeled it as pornspam, alerting others about the inappropriate nature of the messages.
OpenAccess AI Collective (axolotl) ▷ #axolotl-help-bot (36 messages🔥):
-
Simplifying Epoch-wise Model Saving: A member queried about configuring Axolotl to save a model only at the end of training and not every epoch. The solution involved setting `save_strategy` in the training arguments to `"no"` and implementing a custom callback for a manual save upon training completion.
-
Choosing a Starter Model for Fine-Tuning: When asked for a suitable small model for fine-tuning, "TinyLlama-1.1B-Chat-v1.0" was recommended due to its manageability for quick experiments. Members were guided to the Axolotl repository for example configurations like `pretrain.yml`.
-
Guidance on Axolotl Usage and Data Formatting: There was a discussion of concepts like `model_type` and `tokenizer_type`, and how to format datasets for Axolotl training, particularly in relation to the "TinyLlama-1.1B-Chat-v1.0" model. For the task of text-to-color-code generation, it was suggested to structure the dataset without "system" prompts and upload it as a Hugging Face dataset if not already available.
-
CSV Structure Clarification for Dataset Upload: Clarification was sought on whether a one-column CSV format is needed for uploading a dataset to Hugging Face for use with Axolotl. The formatted examples should be line-separated, with each line containing the input and output structured as per model requirements.
-
Posting Inappropriate Content: A user posted a message promoting unauthorized content, which is neither relevant to the channel's technology-oriented discussion nor in keeping with community guidelines.
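For the text-to-color-code dataset discussed above, one common layout is line-separated JSON records, one input/output pair per line, uploaded to Hugging Face. The field names and example pairs below are hypothetical, not from the discussion:

```python
import json

# Hypothetical description -> hex color-code pairs for the task above.
examples = [
    {"input": "a deep ocean blue", "output": "#003366"},
    {"input": "warm sunset orange", "output": "#FF8C42"},
]

# One JSON record per line (JSONL); no "system" prompt, per the suggestion above.
lines = [json.dumps(ex) for ex in examples]
jsonl = "\n".join(lines)
```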
Links mentioned:
- OpenAccess-AI-Collective/axolotl | Phorm AI Code Search: Understand code, faster.
OpenAccess AI Collective (axolotl) ▷ #axolotl-phorm-bot (5 messages):
-
Inquiry on Model Fine-Tuning: A member sought advice on how to preprocess data for fine-tuning the TinyLlama model with a specific dataset containing color codes and descriptions. The goal is to train TinyLlama to predict a color code from a given description.
-
Guidance on Model Preparation: A response outlined steps for fine-tuning TinyLlama by preparing the dataset in a usable format and performing tokenization and formatting suitable for the task. No specific details or links were provided in the response.
-
Irrelevant Content Posted: An off-topic message advertising OnlyFans leaks and content was posted to the channel. The message provided a Discord join link.
Link mentioned: OpenAccess-AI-Collective/axolotl | Phorm AI Code Search: Understand code, faster.
Latent Space ▷ #ai-general-chat (68 messages🔥🔥):
- Comprehensive LLM Benchmarks Available: An informative website llm.extractum.io has been shared which provides a detailed overview of open-source language models ranked by various benchmarks. The models are rated using ELO scores, HuggingFace leaderboard scores, and several task-specific accuracy measurements.
- AI Agents Employing Humans: An innovative project called Payman AI was introduced, enabling AI agents to pay humans for tasks they can't perform themselves. This service aims to support a symbiotic relationship between AI and humans across various sectors like design, coding, and law.
- AI Inference Integrated into Supabase: Supabase has announced an easy-to-use API for running AI inference models within its edge functions. A new session initialization allows AI models like `gte-small` to process inquiries directly within the database service.
- Anticipating "Llama 3" Launch: Discussions include speculations and rumors about the release of "Llama 3", with anticipation building within the community. The context suggests that the reveal of Llama 3 may be linked to an upcoming hackathon in London.
- OpenAI's API Expansion Ahead of GPT-5: OpenAI's introduction of updates to the Assistants API has been brought to light, encouraging discussion about the directions the company could be taking, particularly with the possible launch of GPT-5 on the horizon. Users are debating the quality and performance of such platforms and the potential impact on AI startups.
Links mentioned:
- Cheaper, Better, Faster, Stronger: Continuing to push the frontier of AI and making it accessible to all.
- Payman - Home: no description found
- Tweet from Ahmad Al-Dahle (@Ahmad_Al_Dahle): I'll be sharing more on Llama 3 very soon. It's so cool to see what the community is already building with Llama 2 though. One of my favorites: @team_qanda & @UpstageAI used it to build a math-specifi...
- Tweet from Susan Zhang (@suchenzang): MBPP might've also been used somewhere in the Phi-1.5 dataset. Just like we truncated one of the GSM8K problems, let's try truncating the MBPP prompts to see what Phi-1.5 will autocomplete wi...
- Research Grants: no description found
- AI Inference now available in Supabase Edge Functions: Use embeddings and large language models on the edge with Supabase Edge Functions.
- Tweet from OpenAI Developers (@OpenAIDevs): Introducing a series of updates to the Assistants API 🧵 With the new file search tool, you can quickly integrate knowledge retrieval, now allowing up to 10,000 files per assistant. It works with our...
- Tweet from Russell Kaplan (@russelljkaplan): Second order effects of the rise of large language models:
- Tweet from Yohei (@yoheinakajima): A marketplace for AI agents to hire humans ↘️ Quoting tyllen (@0xTyllen) Excited to introduce a new project I've been working on called Payman! Payman is an AI Agent tool that gives Agent...
- Tweet from Armand Joulin (@armandjoulin): Fixed the fix. ↘️ Quoting Jonathan Frankle (@jefrankle) Fixed it for you, @code_star
- Payman - Enabling AI Agent To Human Payments!: Hey everybody, in this video, I'm super excited to show you Payman, a platform that allows you to connect your agents with capital that they can use to pay h...
- LLM Explorer: A Curated Large Language Model Directory. LLM List. 35061 Open-Source Language Models.: Browse 35061 open-source large and small language models conveniently grouped into various categories and llm lists complete with benchmarks and analytics.
Latent Space ▷ #ai-announcements (1 messages):
- Paper Club Meeting on BloombergGPT: A BloombergGPT discussion is scheduled, with <@315351812821745669> leading it, supported by <@451508585147400209>. Participants are reminded to sign up here and note the return to Zoom due to past Discord screenshare issues.
Link mentioned: LLM Paper Club (BloombergGPT / TimeGPT paper) · Zoom · Luma: This week @yikes will be covering BloombergGPT: https://arxiv.org/abs/2303.17564 Also submit and vote for our next paper:…
Latent Space ▷ #llm-paper-club-west (19 messages🔥):
- Acknowledgment of Efforts: A member expresses appreciation for the time and effort the community members put into organizing the event.
- Zoom Meeting Transition: It was announced that the discussion would move from Discord to a Zoom meeting, with multiple members sharing the same link and directing the participants to the new location.
- Quick Zoom Reminder: Further notifications were posted tagging specific members, prompting them to join the Zoom meeting.
- Zoom Entry Request: A member mentioned their dislike for Zoom but indicated their intention to join, asking for admission into the meeting.
Link mentioned: Join our Cloud HD Video Meeting: Zoom is the leader in modern enterprise video communications, with an easy, reliable cloud platform for video and audio conferencing, chat, and webinars across mobile, desktop, and room systems. Zoom…
OpenInterpreter ▷ #general (59 messages🔥🔥):
-
AI Wearables vs Smartphones: A user shared a YouTube review by Marques Brownlee that sparked discussion around the limitations of AI wearables compared to modern smartphones. The conversation touched on the potential need for AI assistants to have deep contextual knowledge for more efficient responses.
-
Anticipation for the OpenSource WizardLm2: Members express enthusiasm for the WizardLm2 model, praising its perceived freedom from censorship and the significant leap towards GPT-4 level capabilities in an open-source model. Discussions hint at the perpetual desire for the next improvement even as current advancements are celebrated.
-
Translation Bot Testing and Objectives: The new translation bot is under examination, with goals to facilitate more inclusive conversations by translating both ways. Users seem optimistic about its potential to unify discussions.
-
Communal Quest for Windows Compatibility: Multiple users are voicing their struggles to get software, particularly the 01 Light software, to function on Windows. The conversation reveals a pressing need for Windows support to make enterprise inroads and the challenges faced with Mac-oriented setups.
-
Exploring Hardware Options and Personal AI Aspirations: Thereās active chatter about various AI hardware options like the Limitless device, with users comparing personal experiences and desires for an integrated, personal AI assistant. Some spotlight the importance of backend infrastructure and seamless integration as the next frontiers in AI hardware development.
Link mentioned: The Worst Product I've Ever Reviewed… For Now: The Humane AI pin is… bad. Almost no one should buy it. Yet. MKBHD Merch: http://shop.MKBHD.com Tech I'm using right now: https://www.amazon.com/shop/MKBHD In…
OpenInterpreter ▷ #O1 (17 messages🔥):
- Portable O1 Setup Brainstorming: A member shared their aim to create a somewhat portable O1 setup using an RPi5 to run OI, with Arduino components involved. Others suggested that simpler, cheaper components like the m5 atom could be sufficient and asked about the memberās specific goals for the setup.
- Shipping Dates for O1 Mystery: In response to an inquiry about an unspecified item or product, a member mentioned that shipping is aimed to start by the end of summer, but no specific dates are confirmed yet.
- Terminal Choices for Successful Responses: Users discussed their preferences for terminal applications, with one member successfully using Windows Terminal and Powershell to get responses. There was a mention of difficulties with recognizing the OpenAI key in Powershell for Windows 10.
- Batch Files as a Workaround in Windows: A member admitted to using a batch file because they found it more convenient, implying that it is processed by cmd.exe rather than Powershell, highlighting the quirks of Windows.
- Troubleshooting Request for Latest Branch: There was a request for testing the latest branch due to several people experiencing issues with connection establishment and audio uploading.
Links mentioned:
- no title found: no description found
- no title found: no description found
- no title found: no description found
Interconnects (Nathan Lambert) ▷ #ideas-and-feedback (11 messages🔥):
- Significant Improvement in Winrate: A member shared a project update, highlighting a method that improved the AlpacaEval winrate of qwen-1.5-0.5B from 4% to 32% against Phi-2 and Gemma2b-it, using a combination of generation in chunks and a small (300M) reward model to search over outputs.
- Seeking Validation for a Simple Method: The same member mentioned the simplicity of their method that led to increased winrate on a 500M base model, and sought feedback to verify the effectiveness of this approach.
- Relevance of Reranking LLM Outputs: Another community member acknowledged that reranking LLM outputs during inference is a known practice but was unsure if it had been applied to AlpacaEval before; also referencing a paper on reranking and pruning during parallel generation.
- Research Papers as verification: The previous member then provided links to papers discussing the approach, indicating that the terms verifier/reward guided decoding are associated with the method, including 2305.19472 and 2402.01694
- Underexplored but Promising: A member agreed on the potential of such an underexplored area, implying that concepts like MCTS PPO might also be worth examining.
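The verifier/reward-guided decoding discussed above reduces, in its simplest form, to best-of-n reranking: sample several candidates, score each with the reward model, keep the argmax. The generator and scorer below are toy stand-ins (the thread's actual setup, a 0.5B policy with a ~300M reward model, is not reproduced here):

```python
# Toy best-of-n / reward-guided search in the spirit of the discussion above.
import random

def generate_candidates(prompt, n=4, rng=None):
    """Stand-in for sampling n continuations from the base model."""
    rng = rng or random.Random(0)
    return [f"{prompt} … candidate {rng.randint(0, 99)}" for _ in range(n)]

def reward(text):
    """Stand-in reward model: here, it just prefers longer outputs."""
    return len(text)

def best_of_n(prompt, n=4):
    candidates = generate_candidates(prompt, n)
    # Keep the candidate the reward model scores highest.
    return max(candidates, key=reward)

print(best_of_n("Explain MoE routing"))
```

The chunked variant from the thread applies the same select-the-best step repeatedly, once per generated chunk, rather than once over whole completions.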
Interconnects (Nathan Lambert) ▷ #news (17 messages🔥):
-
Mixtral-8x22B LLM Gains Attention: A new model called Mixtral 8x22B has been touted for setting high performance and efficiency standards. It's an SMoE model, fluent in several languages, capable of function calling, and offers a 64K token context window, all under the Apache 2.0 license.
-
Mixtral-8x22B-Instruct's Chatbot Capabilities Discussed: The instruct fine-tuned version of Mixtral-8x22B, Mixtral-8x22B-Instruct-v0.1, garnered attention for its potential in the chatbot arena, featuring detailed instructions on how to run the model.
-
Impressive OLMo 1.7 7B Model Upgrade: OLMo 1.7 7B has made waves with its 24-point increase on MMLU, trained on an improved version of the Dolma dataset with staged training. It's part of a series of models designed to promote the science of language models.
-
A Proposal for Web Page Quality Propagation: The idea of applying "web page quality" propagation to rank web pages was floated, involving a quality score boosted by backlinks and decreased by linking to low-quality sites.
-
Reflection on Common Crawl's Dense Web Graph: The complexity of evaluating "quality" content based on Common Crawl's web graph was discussed, noting that the graph does not indicate the success of the linearization process (the conversion of HTML into plain text).
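The quality-propagation proposal above can be sketched as a PageRank-style fixed-point iteration: each page inherits a share of its inlinks' quality and is penalized for linking to low-quality pages. The graph, constants, and threshold below are all illustrative assumptions, not a described implementation:

```python
# Toy quality propagation over a tiny link graph, per the idea above:
# backlinks from good pages boost a score; linking out to low-quality
# pages reduces it. All constants and the graph are illustrative.
def propagate_quality(links, base, iters=20, boost=0.5, penalty=0.2):
    """links: {page: [pages it links to]}; base: prior quality in [0, 1]."""
    quality = dict(base)
    inlinks = {p: [q for q, outs in links.items() if p in outs] for p in links}
    for _ in range(iters):
        new = {}
        for page in links:
            incoming = sum(quality[q] / max(len(links[q]), 1) for q in inlinks[page])
            bad_out = sum(1 for t in links[page] if quality[t] < 0.3)  # low-quality targets
            score = base[page] + boost * incoming - penalty * bad_out
            new[page] = min(max(score, 0.0), 1.0)  # clamp to [0, 1]
        quality = new
    return quality

pages = {"a": ["b"], "b": [], "spam": ["b"]}
scores = propagate_quality(pages, {"a": 0.8, "b": 0.5, "spam": 0.1})
```

Note the caveat raised in the same thread still applies: link structure says nothing about whether the page linearized into usable text.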
Links mentioned:
- Cheaper, Better, Faster, Stronger: Continuing to push the frontier of AI and making it accessible to all.
- mistralai/Mixtral-8x22B-Instruct-v0.1 · Hugging Face: no description found
- Tweet from Albert Jiang (@AlbertQJiang): I love open-sourced models! Please add your favourites to the Mistral Convex Hull. ↘️ Quoting Philipp Schmid (@_philschmid) Fixed the Fixed Fix for @AI21Labs and included Mambas.
- Common Crawl - Web Graphs: Detailing Common Crawl's Web Graph releases, the technology behind them, and how to use them.
- allenai/OLMo-1.7-7B · Hugging Face: no description found
- Tweet from Philipp Schmid (@_philschmid): Fixed the Fixed Fix for @AI21Labs and included Mambas. ↘️ Quoting Armand Joulin (@armandjoulin) Fixed the fix.
Interconnects (Nathan Lambert) ▷ #ml-questions (9 messages🔥):
- Chinchilla Paper Under Scrutiny: The Chinchilla scaling paper by Hoffmann et al. is facing replication challenges, with discrepancies found when others tried to replicate a key part of the research.
- Doubts Cast on Scaling Law Papers: A member expressed skepticism about the conclusions from scaling law papers, hinting at issues with the math upon closer examination of the Chinchilla paper.
- Community Engagement Over Questions with Chinchilla: Discord users are engaging with the issue, sharing brief reactions of concern and surprise, using phrases such as "Chinchilla oops?" and simply "oh no" to express discomfort regarding the situation.
- Authors Non-responsive to Clarification Requests: One of the replication attempters mentioned that they reached out to the original authors for clarification but did not receive any response, adding to the frustration within the community.
Links mentioned:
- Tweet from Tamay Besiroglu (@tamaybes): We have asked the authors for assistance, but we haven't been able to get a response. (8/9)
- Tweet from Susan Zhang (@suchenzang): After ignoring the details in all these "lets-fit-a-cloud-of-points-to-a-single-line" papers (all likely wrong when you really extrapolate), @stephenroller finally convinced me to work through...
- Tweet from Tamay Besiroglu (@tamaybes): The Chinchilla scaling paper by Hoffmann et al. has been highly influential in the language modeling community. We tried to replicate a key part of their work and discovered discrepancies. Here's ...
Interconnects (Nathan Lambert) ▷ #ml-drama (1 messages):
natolambert: shittiest leaderboard winner lol
Interconnects (Nathan Lambert) ▷ #random (23 messages🔥):
- WizardLM Code Inquiry: A community member inquired about forking the WizardLM code; another confirmed that the model weights are publicly available, suggesting it may return soon.
- Anticipation for olmo vs llama 3: Multiple members engaged in a light-hearted discussion about olmo vs llama 3, with a suggestion that a new battle may be upcoming, despite humorous resignation to its outcome.
- Forecast for Prolific Blogging: Nathan Lambert hinted at a potentially heavy week of content sharing, expecting to possibly release three blog posts.
- Discussing Aesthetic Changes in the Chaotic Era: Conversations in the Chaotic Era included tweaks to user interface annoyances and personal preferences on profile imagery.
- Twitter and Memes Conversation: Members chatted casually about their Twitter activity, shareability of content, and the possibility of one's post aligning with "sacred numerology" due to coincidental abbreviation.
Interconnects (Nathan Lambert) ▷ #reads (3 messages):
- AI Livestream Hijinks on SNL: Nathan shared a humorous YouTube video titled "Beavis and Butt-Head - SNL," which shows a NewsNation livestream event on AI being comically disrupted. He particularly noted the first minute as being very amusing.
Link mentioned: Beavis and Butt-Head - SNL: A NewsNation livestream event on AI is derailed by two audience members (Ryan Gosling, Mikey Day). Saturday Night Live. Stream now on Peacock: https://pck.tv/…
Interconnects (Nathan Lambert) ▷ #sp2024-history-of-open-alignment (1 messages):
natolambert: should I wizardLM 2 as a troll lol
Cohere ▷ #general (54 messages🔥):
- Cohere API Clarifications Sought: Members are seeking clarifications on Cohere API functionality, with particular interest in API capabilities around system prompts and model availability. One user bumps the question, emphasizing the need for detailed information.
- Cohere Embeddings Benchmark Inquiry: Questions have arisen about whether Cohere's embeddings v3 have been compared with OpenAI's new large embeddings. A link is provided to Cohere's blog where related information can be found: Introducing Command R+.
- Integration Challenges and Solutions: Members are addressing technical queries regarding integrations, specifically in connecting LLMs to other platforms like BotPress, and there are discussions about whether Coral requires a locally-hosted solution. One member suggests a future update may address this.
- Fine-Tuning Model Confusion: One user queries about the ability to fine-tune already fine-tuned models through Cohere's Web UI, leading to a discussion on the process and a shared link to the official documentation: Fine-Tuning with the Web UI.
- Discord Welcomes and Personal Projects: Various new members introduce themselves, and excitement is shared about Cohere's offerings. Discussion threads include mentions of personal projects, such as PaperPal, built using Cohere's Command R.
Links mentioned:
- no title found: no description found
- Screenshot-2024-04-16-151544 hosted at ImgBB: Image Screenshot-2024-04-16-151544 hosted in ImgBB
- Fine-tuning with the Web UI - Cohere Docs: no description found
- Cohere int8 & binary Embeddings - Scale Your Vector Database to Large Datasets: Cohere Embed now natively supports int8 and binary embeddings to reduce memory cost.
- GitHub - Unstructured-IO/unstructured: Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.: Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines. - GitHub - Unstructured-IO/unstructured: Open source librar...
- GitHub - cohere-ai/sandbox-conversant-lib: Conversational AI tooling & personas built on Cohere's LLMs: Conversational AI tooling & personas built on Cohere's LLMs - cohere-ai/sandbox-conversant-lib
- GitHub - cohere-ai/quick-start-connectors: This open-source repository offers reference code for integrating workplace datastores with Cohere's LLMs, enabling developers and businesses to perform seamless retrieval-augmented generation (RAG) on their own data.: This open-source repository offers reference code for integrating workplace datastores with Cohere's LLMs, enabling developers and businesses to perform seamless retrieval-augmented generation...
Cohere ▷ #project-sharing (3 messages):
-
Beta Testers Wanted for Quant Fino: A new pilot has been deployed for an Agentic entity powered by Command-R Plus, looking to blend GAI with FinTech and Day Trading. They are currently seeking beta testers and feedback, with information available at Join Beta - Quant Fino with details on their cookie policy and user consent.
-
Inquiry About Rubik's API: A member expressed interest in utilizing Rubik's via an API with post request support. They are awaiting further details on whether such an API is available.
-
Redteaming Reveals Vulnerabilities in Command R+: A member has done redteaming work on the Command R+ model, identifying potential for creating unrestricted agents with capabilities for nefarious tasks. They provided a detailed write-up at LessWrong, which includes examples of agent-produced messages geared towards harmful actions.
Links mentioned:
- Creating unrestricted AI Agents with Command R+ - LessWrong: TL;DR There currently are capable open-weight models which can be used to create simple unrestricted bad agents. They can perform tasks end-to-end su…
- Quantfino - Home of Powerful AI Driven Finance: Quantfino is the home of LLM powered and Langchain assisted Financial Analysis.
LangChain AI ▷ #announcements (1 messages):
-
Iterative Documentation Structure Improvements: The team is iterating on the documentation structure to enhance accessibility and clarity. A new organization splitting content into "tutorial", "how to guides", and "conceptual guide" is proposed, with feedback requested on the structure via the provided link.
-
LangChain Framework Introduction Highlighted: The provided link introduces LangChain, an open-source framework for building applications with large language models. It details how LangChain facilitates development, productionization, and deployment through building blocks, LangSmith, and LangServe, and includes a diagrammatic overview.
Link mentioned: Introduction | 🦜️🔗 LangChain: LangChain is a framework for developing applications powered by large language models (LLMs).
LangChain AI ▷ #general (38 messages🔥):
-
Seeking YC Startup Insights: A member has expressed interest in applying to YC for a startup focused on finetuning models for agents and is inquiring if anyone knows whether this has already been done. Another member responded by listing companies like Unsloth, Mistral AI, and Lumini that are in this space.
-
Collaborative Effort Wanted for LLM Applications: There's an open call for those working on LLM applications to join in short conversations, with one member promptly expressing willingness to do so.
-
Langchain Learning Curve: A query about whether learning Langchain is worthwhile received lighthearted responses suggesting that one should learn by doing and encouraging hands-on experimentation with the technology.
-
Update on Handling Tabulated Data in Langchain: Multiple users discussed handling multiple CSV files with Langchain for a chatbot, with suggestions ranging from using an SQL agent to different methods of utilizing CSV files and handling larger data sets effectively.
-
Exploring RAG Optimization: Users have brought up the challenge of dealing with large documents using RAG, where strategies like pre or post-index splitting were discussed, and one member shared their pursuit of optimizing RAG for better accuracy.
-
Looking for a Hiring Point Person: A new participant greeted the channel and is seeking the appropriate contact person for discussions about hiring.
-
Venture into Multi-Agent Frameworks: A member pointed towards AutoGen, a framework provided by Microsoft for multi-agent conversations and workflows, and sparked curiosity among users in multi-agent orchestration within Langchain.
-
AI Startups Funding Database Unveiled: A comprehensive fundraising database for AI startups has been shared, featuring impressive data collection on financing rounds and companies, including insights from GPT-4 with an invitation for feedback on possible data inaccuracies.
Links mentioned:
- AutoGen | AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework
- Flashcardfy - AI Flashcard Generator with Personalized Feedback: Learn faster and smarter with AI-generated flashcards that provide personalized feedback.
- Agents | 🦜️🔗 Langchain: LangChain offers a number of tools and functions that allow you to create SQL Agents which can provide a more flexible way of interacting with SQL databases. The main advantages of using SQL Agents ar...
LangChain AI ▷ #langserve (1 messages):
- Integration Challenges with LangServe and Nemo Guardrails: A member inquired about difficulties encountered when trying to integrate LangServe with a chain that includes Nemo Guardrails, as Nemo alters the output structure significantly. They mentioned the necessity for a novel output parser to handle these changes.
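A normalizing parser of the kind mentioned can be a small function appended to the end of the chain (e.g. via LangChain's `RunnableLambda`) that accepts either a bare string or a message-style dict; the exact dict shape Nemo Guardrails returns is an assumption here, for illustration only:

```python
# Hypothetical output normalizer for a LangServe chain wrapped in guardrails:
# accept either a plain string or a message-style dict such as
# {"role": "assistant", "content": "..."} and always return the text.
def normalize_output(result):
    if isinstance(result, str):
        return result
    if isinstance(result, dict):
        if "content" in result:
            return result["content"]
        if "output" in result:
            return result["output"]
    raise TypeError(f"Unexpected chain output: {type(result).__name__}")

# Example: could be appended to a chain as RunnableLambda(normalize_output).
text = normalize_output({"role": "assistant", "content": "hello"})
```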
LangChain AI ▷ #share-your-work (4 messages):
-
Galaxy AI Introduces Multitude of Free APIs: GalaxyAI has released a free API service allowing access to premium AI models such as GPT-4, GPT-3.5-turbo, and Langchain Integration, all in the OpenAI format. Check out their offerings and integrate them into your projects at Galaxy AI.
-
OppyDev Launches AI-Powered Coding Tool: OppyDev released an AI assisted coding platform that combines an IDE with a chat client, featuring ease of use, a focus on transparency, customization, data control, and uses LLMs like GPT-4 and Claude. See a demo and learn more at OppyDev AI.
-
Rubiks.ai Calls for Beta Testers for Advanced Research Assistant: A new advanced research assistant and search engine, Rubiks.ai, seeks beta testers to try out features including Claude 3 Opus, GPT-4 Turbo, and Mistral Large powered by Groq's servers for rapid responses. Interested individuals can explore and sign up at Rubiks.ai with the promo code `RUBIX` for 2 months of free premium access.
-
Unveiling The Power of Multi-Step Tools: An article discusses the benefits of multi-step tools integrated with LangChain and Cohere, aimed at enhancing efficiency. Read more about this advancement in the full article at AI Advances.
Links mentioned:
- Galaxy AI - Swagger UI: no description found
- Home - OppyDev: Collaborative AI Agent that Elevates your Coding Experience
- Rubik's AI - AI research assistant & Search Engine: no description found
LangChain AI ▷ #tutorials (5 messages):
- Seeking Collaboration: A participant expressed interest in joining a project and requested a direct message to discuss further details.
- Tutorial on AI Agents with Long-Term Memory: A member shared a YouTube video that explains how to imbue AI agents with long-term memory and self-improvement capabilities, providing insight into advanced AI agent development.
- Query on Langgraph Usage: In response to the shared video about AI agent long-term memory, a member questioned why the concept of "langgraph" wasn't considered for implementation.
Link mentioned: Unlock AI Agent real power?! Long term memory & Self improving: How to build Long term memory & Self improving ability into your AI Agent? Use AI Slide deck builder Gamma for free: https://gamma.app/?utm_source=youtube&utm…
DiscoResearch ▷ #mixtral_implementation (12 messages🔥):
- Pushing the Limits of GPU Memory: Maxidl reported successful operation with fully sharded data parallel (FSDP), a sequence length of 32k, and a batch size of 1 while utilizing a whopping 64 80GB GPUs, hugging close to capacity at 77GB utilized per GPU.
- 64 GPUs Not a Typo: When questioned, maxidl confirmed the use of 64 GPUs, noting that reducing to 32 GPUs resulted in out-of-memory (OOM) errors, thus necessitating the larger GPU count.
- Optimization Possibilities Explored: Considering memory constraints, maxidl mentioned the potential of 8-bit optimization to conserve memory during training.
- Memory Usage Optimization Suggestion: jp1 suggested using `fsdp_transformer_layer_cls_to_wrap: MixtralSparseMoeBlock` and enabling `offload_params = true` for improved memory usage, anticipating it should fit within 32 GPUs' VRAM.
- Seeking Memory Requirement Calculators: Maxidl inquired about tools to calculate memory usage of model activations by model size and sequence length, citing a HuggingFace discussion on model memory requirements for Mixtral models.
Link mentioned: mistral-community/Mixtral-8x22B-v0.1 · [AUTOMATED] Model Memory Requirements: no description found
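The kind of back-of-the-envelope calculator asked about above can be sketched in a few lines. The buffer-count multiplier and dtype width here are loose assumptions for illustration, not Mixtral-specific numbers:

```python
def activation_bytes_per_layer(seq_len, hidden_size, batch_size=1, bytes_per_elem=2):
    # One hidden-size-wide buffer per token, times a coarse factor of ~10 to
    # account for attention projections and MLP intermediates (an assumption).
    return batch_size * seq_len * hidden_size * bytes_per_elem * 10

def model_activation_gb(seq_len, hidden_size, n_layers, batch_size=1):
    # Total activation memory across all layers, in gigabytes.
    total = n_layers * activation_bytes_per_layer(seq_len, hidden_size, batch_size)
    return total / 1e9
```

With hypothetical 32k-sequence, 6144-hidden, 56-layer numbers this lands in the hundreds of gigabytes, which is consistent with why a single node OOMs and activation checkpointing or offload becomes attractive.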
DiscoResearch ▷ #general (8 messages🔥):
- Gray Area in Text Scraping: A member voiced the opinion that most scraped text data are, from an EU copyright perspective, at least in a gray area. They also mentioned that texts from DFKI could be useful but did not have the link at hand.
- Finding Multimodal Data: A member suggested sources for multimodal data with permissive licenses, like Wikicommons and other platforms listed on Creative Commons Search.
- Llama Tokenizer Simplified: An individual shared a Google Colab notebook illustrating how to create a Llama tokenizer without relying on HuggingFace, using sentencepiece instead.
- Query on Tokenizer Spelling: Following a discussion on custom tokenizers, a member pointed out that a shared tokenizer misspelled Muad'Dib.
- Modernizing Tokenization Techniques: A contributor highlighted that Mistral has released their tokenization library, potentially aiding in standardized finetuning processes without custom wrappers, and provided a link to the example notebook on GitHub.
Links mentioned:
- CC Search Portal: no description found
- Google Colaboratory: no description found
- mistral-common/examples/tokenizer.ipynb at main · mistralai/mistral-common: Contribute to mistralai/mistral-common development by creating an account on GitHub.
DiscoResearch ▷ #benchmark_dev (1 message):
- Decoding Strategies for Language Models Analyzed: A member referenced a paper titled "A Thorough Examination of Decoding Methods in the Era of LLMs," expressing concerns that it didn't cover open-ended tasks relevant to their LLM usage experience. They also mentioned that modern sampling methods by u/kindacognizant, such as MinP/DynaTemp/Quadratic Sampling, aren't covered in such papers.
- Surprising Impact of min_p Sampling on Creative Writing: The same member shared a Reddit post detailing a comparison of min_p sampling parameters and their significant effect on creative writing performance. The comparison showed an increase of +8 points in alpaca-eval style elo and +10 points in the eq-bench creative writing test.
Link mentioned: Reddit - Dive into anything: no description found
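For readers unfamiliar with min_p, the filtering step can be sketched in plain Python. This follows the commonly described scheme (keep tokens whose probability is at least `min_p` times the top token's, then renormalize), not any particular library's implementation:

```python
def min_p_filter(probs, min_p=0.1):
    # Threshold scales with the model's confidence: a sharp distribution
    # prunes aggressively, a flat one keeps more candidates.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    # Renormalize the surviving tokens back into a distribution to sample from.
    return [p / total for p in kept]
```

This confidence-relative cutoff is what distinguishes min_p from a fixed top-p nucleus, and is plausibly why it helps open-ended creative writing, where flat distributions are common.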
tinygrad (George Hotz) ▷ #general (9 messages🔥):
- Tinygrad and INT8 support query: A member asked if tinygrad supports int8 computations, to which another replied affirmatively. The location where this is defined wasn't provided.
- Hardware's Role in Defining Tinygrad's Computations: A user mentioned that whether tinygrad supports certain data types, like int8, is typically defined by the hardware capabilities rather than tinygrad itself.
- Enhanced Graph Visualizations for Tinygrad: An inquiry was made about improved graph visualizations in tinygrad, and a reply directed to the Tiny-tools Graph Visualization for slicker graphs than `GRAPH=1`.
- Interest in an Optimized Node.equals() for Tinygrad: A member expressed interest in a fast, probabilistically complete Node.equals() function as a cool addition to tinygrad.
- Pytorch-Lightning Hardware Agnosticism Discussed: The hardware-agnostic nature of Pytorch-Lightning was discussed, with a link to its GitHub repository provided, and another member confirmed its use on a 7900xtx. Check out Pytorch-Lightning on GitHub.
Links mentioned:
- React App: no description found
- GitHub - Lightning-AI/pytorch-lightning: Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.
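The "probabilistically complete" equality check mentioned above is usually done Schwartz-Zippel style: evaluate both symbolic expressions at random points and declare them equal if they always agree. A minimal sketch, with plain callables standing in for tinygrad's Node (not tinygrad's actual API):

```python
import random

def probably_equal(f, g, n_vars, trials=16, seed=0):
    # Evaluate both expressions at random integer points. A single
    # disagreement proves inequality; agreement on every trial means
    # equality with high probability for low-degree expressions.
    rng = random.Random(seed)
    for _ in range(trials):
        point = [rng.randint(-1000, 1000) for _ in range(n_vars)]
        if f(*point) != g(*point):
            return False
    return True
```

The appeal over structural comparison is speed: evaluating at a handful of points is cheap even when the two expression trees have very different shapes.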
tinygrad (George Hotz) ▷ #learn-tinygrad (9 messages🔥):
- Exploring Metal Compute Shaders: A member is experimenting with tinygrad's generation of Metal compute shaders and is interested in learning how to run a basic Metal compute shader program without using Xcode. Another suggested consulting ChatGPT for a Python script to dispatch Metal shader code for a vector addition, mentioning their positive learning experience.
- ONNX to WebGL/WebGPU Possibilities: An inquiry was made about converting models from ONNX to WebGL/WebGPU with tinygrad, specifically for running meshnet models on the web. A comparison was made to a Stable Diffusion WebGPU example, but the member is seeking advice on achieving the conversion directly from ONNX.
- Layer Device Allocation Query in Tinygrad: A participant was concerned about the apparent lack of functionality to move layers (like Linear, Conv2d) across devices in tinygrad. George Hotz clarified that model parameters can be moved with the `to_` method, called after getting the parameters from the model.
- Zero-Cost Tensor Manipulation in Tinygrad: A user asked for guidance on implementing broadcast, reshape, and permute operations in tinygrad without incurring data copying costs. They were directed to look at tinygrad/shape/shapetracker.py or view.py for relevant code examples.
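The zero-copy idea behind shapetracker.py/view.py is that reshape and permute only rewrite shape/stride metadata over the same underlying buffer. A rough illustration of the stride bookkeeping (a sketch of the concept, not tinygrad's actual API):

```python
def strides_for_shape(shape):
    # Row-major strides: how many flat-buffer elements one step
    # in each dimension advances.
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides

def permute_view(shape, strides, order):
    # A permute is zero-copy: only the shape/stride metadata is
    # reordered; the underlying buffer is untouched.
    return [shape[i] for i in order], [strides[i] for i in order]
```

Indexing element `(i, j, k)` is then just `sum(idx * stride)` into the original flat buffer, which is why these views cost nothing at materialization time.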
Skunkworks AI ▷ #off-topic (4 messages):
- Introducing Idefics2: A new multimodal ChatGPT called Idefics2 by Hugging Face has been introduced, which incorporates Python programming into its abilities.
- Reka Core Takes On Giants: The Reka Core language model is presented as competitive with those from OpenAI, Anthropic, and Google, touting impressive performance metrics.
- JetMoE: Budget-Friendly AI Performance: With less than $0.1 million spent, JetMoE-8B claims superior performance compared to Meta AI's LLaMA2-7B, a model backed by extensive funding.
- Snowflake's New Text-Embedding Model: Snowflake has launched and open-sourced their Snowflake Arctic embed family of models, highlighted as the world's best practical text-embedding model.
Links mentioned:
- Reka Core: A Frontier Class Multimodal Language Model: Reka Core is competitive with models from OpenAI, Anthropic, and Google across key industry-accepted evaluation metrics. Given its footprint and performance,...
- Introducing Idefics2 8B: Open Multimodal ChatGPT: We will take a look idefics2 the open multimodal llm by huggingfacehttps://huggingface.co/blog/idefics2#python #pythonprogramming #llm #ml #ai #aritificialin...
- Snowflake Launches the World's Best Practical Text-Embedding Model: Today Snowflake is launching and open-sourcing with an Apache 2.0 license the Snowflake Arctic embed family of models. Based on the Massive Text Embedding Be...
- JetMoE: Reaching LLaMA2 Performance with 0.1M Dollars: JetMoE-8B is trained with less than $0.1 million cost but outperforms LLaMA2-7B from Meta AI, who has multi-billion-dollar training resources. LLM training...
Datasette - LLM (@SimonW) ▷ #llm (3 messages):
- Anticipation for Mixtral 8x22B Instruct: Excitement for trying out the Mixtral 8x22B Instruct through llm was expressed, with a link to its Model Card on HuggingFace provided for reference.
- Issue Reported with llm-gpt4all: A user mentioned encountering an error when installing llm-gpt4all; the issue is detailed on GitHub with a link to the error report.
Links mentioned:
- mistralai/Mixtral-8x22B-Instruct-v0.1 · Hugging Face: no description found
- adding the llm-gpt4all models breaks the python app. · Issue #28 · simonw/llm-gpt4all: I installed llm no problem, assigning my openai key, and am able to speak to gpt4 without problem, see the output of my llm models command: OpenAI Chat: gpt-3.5-turbo (aliases: 3.5, chatgpt) OpenAI...
Alignment Lab AI ▷ #oo (2 messages):
- Lawyers Stepped In: A member made a brief remark suggesting that lawyers were likely involved in a certain situation, although the context of the legal implication was not provided.
- Image Illustrating Deletion of wizardlm-2: An image was shared depicting that wizardlm-2 was deleted due to a lack of testing for v0; however, the specifics of what wizardlm-2 is or what the testing involved were not given in the message. View Image
Mozilla AI ▷ #llamafile (2 messages):
- Llamafile Script Improvement: The llamafile archive version upgrade repacking script has been improved and is available at this Gist. There is a debate on whether to integrate it into the main llamafile GitHub repo due to maintenance concerns, with the notion that maintainers should create new llamafiles from the ground up.
- Security Vulnerability Reporting Process Inquiry: A query was raised about the procedure for reporting security vulnerabilities and the subsequent request for a CVE (Common Vulnerabilities and Exposures) identifier. No additional context or instructions were provided in the message.