> AI Discords for 1/25/2024. We checked **20** guilds, **297** channels, and **5875** messages for you. Estimated reading time saved (at 200wpm): **555 minutes**.

OpenAI released a new GPT4 Turbo version yesterday (our notes here). We’re using this opportunity to conduct a natural experiment for summarization. This version is generated with the “new” GPT4T from Jan 2024; see the previous email, generated with the Nov 2023 version, for comparison.


Table of Contents

[TOC]

PART 1: High level Discord summaries

TheBloke Discord Summary

  • UnSloth Strides Towards Multi-GPU Support: UnSloth is gearing up to introduce limited multi-GPU support, specifically aiming at Google Colab beginners. This move piqued the interest of AI research newcomers due to its promised simplicity.

  • AI Models Leap onto Nintendo Switch: The community buzzed with excitement as @kalomaze showcased AI models like Tiny Llama and Mistral running on a Nintendo Switch, sparking conversations about deploying lightweight models on unconventional hardware.

  • Model Configurations Dive Deep: Detailed examination of models such as exllama, bagel, and dolphin, especially focusing on rope_theta and sliding_window usage. Configurations from Hugging Face were specifically highlighted, providing valuable insights for model optimization.

  • Chatbots Gain Character and Style: Techniques for crafting chatbot personas, including using datasets generated by ChatGPT and the importance of example quantity in training for style transfer, were actively discussed. The dialogue underscored creating chatbots with distinct personalities like Samantha and exploring cost-effective training alternatives like local LLMs.

  • Innovative Model Merging Techniques Explored: The discussion ventured into the territory of optimal weight maintenance for model performance and the intrigue of merging overfit models with techniques like DARE or SLERP. The conversation suggested a complex but promising frontier for selectively combining models to harness multiple strengths.


OpenAI Discord Summary

  • GPT-4 Speeds: A Tortoise and Hare Story: Users noted GPT-4-1106-preview has significantly longer processing times for document analysis through the API, running into 3-7 minute delays. Discussions pointed to potential fixes including upgrading service plans for dedicated capacity, though GPT-4 Turbo was clarified not to inherently support direct file analysis.

  • GPT Model Troubleshooting Takes a Village: An error flagged as “unusual activity” by GPT-4 led to speculation that VPN use, prompt nature, or account sharing could be culprits, while Dall-E faced criticism for frequent misspellings in image creations, suggesting shorter text inputs might mitigate this issue.

  • Code Blocks Get a Glow-Up with GPT-4: The “Always expand code output” setting was discussed for enhancing code readability in GPT-4, with additional dialogues around importing Python modules into the AI system despite security concerns and the need for a new conversation to reflect CustomGPT edits.

  • Text Transcription Tackles GPT’s Goliath: Challenges with large text transcription, specifically sermon passages up to 50KB, were central, with GPT-3.5 struggling due to input size limits. Suggestions leaned towards utilizing GPT-4 Turbo for its larger context window capabilities and the exploration of NLP tools for more efficient paragraph chunking and correction of grammatical issues.

  • Cost Considerations Clash with Capability Needs: The dialogues underscore a constant balancing act between the desire for advanced functionality and its cost implications, especially significant in the transition from GPT-3.5 to GPT-4 Turbo for tasks requiring intensive text processing and analysis.


Nous Research AI Discord Summary

  • Context Capabilities Extended, Mistral Mines Deeper: Innovations in extending model context capabilities dominated discussions, with methods such as fine-tuning highlighted as viable for enhancing models. In particular, LLaMA-2-7B-Chat impressed with its context window extended to 16,384 tokens, and SelfExtend was recommended as a fine-tune-free option.

  • Pondering Technology’s Social Dichotomy: The conversation occasionally veered off-technical paths, touching on broader impacts of technological advancement, with opinions asserting that it may further polarize society. Humorous diversions included a swimming cat GIF, while concerns were raised about Twitter’s content delivery mechanisms potentially throttling AI contributions.

  • Benches Set for Everyone Coder 33B: The spotlight shone on Everyone Coder 33B Base - GGUF, quantised using Massed Compute into the new GGUF format. Speculation also surrounded performance comparisons involving Hermes Mixtral, albeit without detailed backing data.

  • Embedding Models to Content-genie in a Bottle: OpenAI’s fresh embedding models and API enhancements were shared, flaunting lower GPT-3.5 Turbo pricing and new tools. Simultaneously, the Genie method for high-quality data generation in content-grounded tasks captured attention, suggesting significant advances in LFQA, summarization, and extraction capabilities.

  • Tech Titans and Ingenious Interventions: Spirited discussions included GPT-2 inference experiments on WebGL, challenges in cloning LLaMA with a Mixtral model, and debating model efficiency with an eye on phi2 optimizations. GPU harnessing for ML stood out as a creative pursuit, reimagining gaming hardware for scientific exploits.


OpenAccess AI Collective (axolotl) Discord Summary

  • Fine-Tuning with a Norwegian Twist: @henriklied is fine-tuning Mistral on a dataset of 100k articles for Norwegian title generation, with a recommendation to cap epochs at 4 to prevent overfitting. Meanwhile, shareGPT’s handling of extended conversations shows limitations by design, sparking a preference debate between ChatML and BlockML formats for conversation management.

  • Merging Models, QLoRA on the Rise: Training QLoRA models without relying on bitsandbytes was explored, with alternatives like fp16 and AutoAWQ highlighted for quantization. A link to a GitHub guide for merging trained QLoRA adapters into base models was shared, alongside a quantization advocacy tweet by Tim Dettmers.

  • Dataset Developments Stir Excitement: A new dataset on Hugging Face aimed at training the Snorkel model and a Mistral 7B fine-tuning announcement indicate significant advancements. A reported 34 percent figure for ALPCA, deemed an improvement over old GPT-4 metrics, hints at considerable performance gains.

  • DPO Discussions Dive Deep: Queries around the DPO Training Plots and dataset compounding issues imply a community need for clearer documentation and troubleshooting guides in the realm of DPO Training and dataset integrity.

  • Showcase Shoutout: A YouTube video by pradeep1148 in the community-showcase hints at community engagement and project sharing, although specifics were not outlined.


LM Studio Discord Summary

  • Proxy Solutions and TTS Enable Seamless LM Studio Use: Users experiencing limitations with LM Studio due to regional blocks on HuggingFace have found workarounds through proxy settings and a Text-to-Speech interface, enhancing accessibility and interaction with models.

  • System Updates Fix Model Compatibility Issues: Update your C++ redistributables to resolve model loading errors in LM Studio, as advised by @heyitsyorkie, following difficulties with models like Stable Code, Deepseek, and Codellama.

  • Model Operation Queries Span Versatility and Performance: From running multiple LM Studio instances for parallel modeling to examining the best GPU options for AI work, discussions reveal a keen interest in enhancing model performance and efficiency. An RTX 3090 and M2 Mac Studio are among the recommended hardware for handling large language models.

  • Bug Reports Propel Improvements in MoE Models: Users reported bugs affecting MoE model settings in LM Studio, notably when transitioning between 4X and 2X MoE models, sparking an immediate investigation by the development team to enhance user experience and model functionality.

  • Model Exploration and Integration Challenges Highlighted: Experimentation with large models like Mixtral 8x7B on an RTX 3090, and frustrations with outdated APIs, underscore community efforts to push the boundaries of AI model utility and application in projects, despite integration and fine-tuning hurdles.


Mistral Discord Summary

  • GPU Rentals and Free Resources for AI Work: @mrdragonfox highlights runpod, vast, and lambda as key platforms offering GPU rentals by the hour, and Kaggle as a source of free compute, facilitating AI summarization and development efforts. Kaggle provides 2 T4 GPUs for 30 hours per week, a boon for developers seeking computational power without heavy investment.

  • Evaluating LLMs Beyond Traditional Metrics: @adrienbufort criticizes traditional translation metrics like BLEU and ROUGE as inadequate for evaluating large language models (LLMs), advocating for an ELO-like evaluation system alongside MMLU and Alpaca Eval as superior methodologies, which offer closer alignment to human preferences and intra-LLM comparison, respectively.

  • Innovations in AI Browser Interactions: @sublimatorniq introduces a groundbreaking tool enabling AI to reference DOM nodes, making browser queries more contextually aware of web content. The tool is designed for compatibility with MistralAI, indicating a significant leap in AI-mediated web navigation.

  • Quantization and Optimization Strategies for AI Models: Community discussion weighed the memory requirements and efficiency of Mistral’s 4-bit inference, citing a roughly 26GB memory footprint (plausible if the figure refers to Mixtral 8x7B: ~46.7B parameters at 4 bits is about 23GB of weights alone, with KV cache and runtime overhead making up the rest). @mrdragonfox recommends exllamav2 for its superior memory efficiency over standard 4-bit transformers inference, suggesting it as an effective strategy for optimizing model performance.

  • API Limitations, Bugs, and Hosting Insights Disclosed: The community has encountered issues ranging from early-stopping dilemmas to a bug where setting the “max_tokens” parameter to 1 causes a 500 error, with a specific GitHub issue outlined here. Additionally, inquiries about the Mistral API’s hosting location revealed it runs on Azure in Sweden, a crucial detail for developers weighing data locality and compliance.
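
For context, here is a minimal sketch reproducing the reported max_tokens issue against the public chat-completions endpoint. The model name and prompt are placeholders, not details from the bug report:

```python
import os
import requests

# Hypothetical reproduction: setting max_tokens to 1 reportedly returns
# a 500 error instead of a single-token completion.
resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-tiny",  # placeholder; any chat model should behave the same
        "messages": [{"role": "user", "content": "Say hi"}],
        "max_tokens": 1,
    },
)
print(resp.status_code)  # reportedly 500 rather than 200
```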


Eleuther Discord Summary

  • AutoGluon Steps In for Meta’s SAM: After failing to find a fine-tuning codebase for Meta’s SAM model, @the_alt_man turned to AutoGluon due to its support for Lightning and compatibility with GPUs, though not with TPUs.

  • Innovating Beyond Infiniband for Multi-node Training: Sparked by @elyxlz’s quest for multi-node training sans Infiniband, a discussion evolved around a strategy involving periodic merging during training steps, referencing the DiLoCo paper on distributed optimization.

  • Byte-Level Transformers and Proxy-Tuning Dive Deep: Discussions ranged from the efficiency of byte-level transformers like ByT5 and the unfair sequence-length comparisons in the MambaByte paper, to sharing a new lightweight proxy-tuning method for LLMs as detailed in a recent publication. This encapsulates the growing interest in optimizing and understanding LLMs’ mechanisms and efficiency.

  • Chess AI Evolution Imagined: A blend of speculation and current state-of-the-art review highlighted discussions on chess AI, including Stockfish’s dominance and theoretical reflections on chess AI advancements by 2024 as envisaged in a chess.com blog.

  • GPT-NeoX, PyTorch, and CUDA Wrestle with Testing: @catboy_slim_ raised concerns regarding testing challenges after updating Python, PyTorch, and CUDA versions, creating an issue on GitHub about pytest failures. Meanwhile, QLoRA tuning confusion within NeoX 20B was clarified, underscoring the importance of directing queries to the appropriate channels for library-specific issues.


Perplexity AI Discord Summary

  • Perplexity Pro Unleashes Powerful Features: Users discussed the benefits of Perplexity Pro over the regular version, highlighting features such as unlimited Copilot queries and the ability to attach images and files with models like Claude 2.1 and GPT-4. Details were supported with a link to the Perplexity Pro FAQ.

  • Privacy Safeguards Queried Amidst Data Retention Concerns: Discussion among users like @emisaurus_hex and @firesonwires arose regarding Perplexity’s privacy policy on retention of search history and personal information until account deletion, prompting an expert to clarify that deleted threads are gone from servers after 30 days.

  • Debate Over Data Storage Ethics: Users engaged in a philosophical discussion about the implications of aggregated search data for privacy, with varying opinions on the value and potential misuse of such data.

  • Community Tackles Technical Support Together: The community, including Perplexity representatives, came together to assist users with technical issues, ranging from recovering bookmarked threads to inquiries about file upload limits.

  • Exploring Perplexity’s Educational and Professional Applications: In the #sharing channel, users shared experiences using Perplexity in diverse fields such as content creation, Smartsheet learning, and astronomy education, with one user emphasizing Perplexity’s superiority over Google Search and OpenAI for content generation, as highlighted in a YouTube video.

  • Technical Discrepancies Between Perplexity’s Website and API: The #pplx-api channel included discussions on the performance differences between the website and API versions of a tool, with a user seeking clarifications on labs versus API parameters and another user reporting an unresolved double-charge issue to the support email.


HuggingFace Discord Summary

  • HuggingFace’s Community Expands and Innovates: The community is experiencing growth with more creators joining, highlighted by the launch of the Lightweight Open LLM Leaderboard and discussions around AI interpretability in a survey by H. Luo and L. Specia. The innovative AI Alignment model, TenyxChat-8x7B-v1, was also introduced, showcasing preference tuning with a high MT-Bench score. Practical utility projects like CheXRay and ZeroGPU were spotlighted, indicating the diverse explorations within HuggingFace’s community.

  • Real-World AI Discussions Focus on Practicalities: Community members shared insights ranging from selecting models for RTX 3080 GPUs to feature extraction techniques and the realities of model pretraining. The conversation also touched on the essentials of data management for training and evaluation, alongside the benefits of leveraging community blog posts for wider outreach on projects such as the upcoming wav2vec2-bert model release.

  • Dataset Evaluation Frameworks and Data Quality: The community raised the need for comprehensive frameworks to evaluate datasets used in training LLMs, suggesting an area ripe for development given the current focus on model rather than dataset evaluation.

  • Flutter and OpenAI Innovations Lead Cool Finds: Fascination with HuggingFace’s dataset documentation analysis and a detailed study titled “Binoculars” for detecting LLM-generated text were among captivating discoveries. The integration efforts of ONNX runtime with Flutter and a specific Flutter SDK for HuggingFace Inference APIs underscore the community’s efforts to bridge AI with application development.

  • WhisperSpeech and AI Development Highlights: The WhisperSpeech space signals a leap in multilingual TTS, while an error in the CheXRay demo pointed to challenges in deploying AI in medical imaging. There’s anticipation for a Nemo Model blog post and a call for broader engagement through community-contributed content.

  • Artificial Intelligence and Machine Learning Breakthroughs: Google’s Lumiere showcased an advanced approach in T2V models, while a Medium article on benchmarking received praise. In NLP, the TorToiSe TTS from Coqui demonstrated both quality and potential speed improvements.

  • Technical Insights from #diffusion-discussions and #computer-vision: Questions regarding loading LoRA models locally and the inquiries about computational vision techniques emphasized the technical depth of discussions in the HuggingFace community.

  • Gradio 4.16 Launch Boosts Developer Experience: The latest Gradio release champions features aimed at streamlining ML app development, like native support for Polars Dataframe, enhancing interactivity and performance in ML deployment.

Each bullet encapsulates thematic discussions and announcements spanning various HuggingFace channels, reflecting the guild’s vibrant engagement with cutting-edge AI technologies and community-driven initiatives.


LAION Discord Summary

  • LAION2b-en Aesthetic Scores Go Dark: The LAION2b-en aesthetics scores dataset sought by @ppwwyyxx is currently unavailable on Hugging Face, as it was disabled by the author’s request, with the community encouraged to stay tuned for updates.

  • A Leap Toward Open Source Voice Chat Innovations: A new voice chat interface demo featuring Whisper and WhisperSpeech along with an open-source LLM (Dolphin 2.6 Phi-2) was shared by @jpcl_, promising lower latency and more natural conversation flow. Interested parties are invited to collaborate to further enhance the project, details of which are available on Hacker News.

  • VA’s AI Tech Sprint Calls for Competitors: The VA’s AI Tech Sprint focuses on Ambient Dictation for Clinical Encounter Notes, with a $300K first prize, eyeing advancements in AI-driven healthcare. U.S. residents are encouraged to participate, with more information available on Challenge.Gov.

  • Byte-Level Transformers Show Promising Future: Byte-level transformers may herald a new frontier in AI capabilities, supported by cautiously optimistic insights from @marianbasti after a review of a pertinent research paper.

  • Innovative Developments in AI Image Generation: Breakthrough projects like RPG-DiffusionMaster and InstantID set new standards in text-to-image diffusion and ID-preserving generation, illustrating the rapid pace of advancement in this area. Additionally, a discussion highlighted the non-diminishing potential for growth in AI image generation, referencing the continual emergence of new papers and tools in the field.


LlamaIndex Discord Summary

  • LLMCompiler Webinar Promises Parallel Function Calling Revolution: A last-minute webinar featuring authors Sehoon Kim and Amir Gholami, discussing the LLMCompiler, aims to highlight its superiority over sequential frameworks like ReAct, focusing on long-term planning and parallelization capabilities. Relevant resources include the LLMCompiler paper, LlamaPack, and a GitHub notebook.

  • LlamaIndex’s OSS and Partnerships Expand Its Ecosystem: LlamaIndex launches an Open Source Software (OSS) repository with a guide on building a Slack bot, partners with Zilliz Universe to enhance its platform with a scalable retrieval service, and announces Day 0 support for OpenAI’s latest embedding models. Additional offerings include guidance on effective prompting and a new TypeScript release (LlamaIndex.TS version 0.1.0), featuring ease of use and broad support capabilities.

  • Dynamic Discussions Shape LlamaIndex’s Community: Community members probe the availability of TextGenerationInference LLM for LlamaIndex, explore customizing the Chat Engine with the similarity_top_k parameter (see the sketch after this list), address challenges in implementing insurance domain queries, and express enthusiasm for new embedding models and API updates from OpenAI. Guidance on custom prompts for extending context in queries signals the community’s active engagement in refining LlamaIndex’s usability.

  • Cutting-Edge AI Discussions Flourish Amongst Peers: Zep emerges as a noteworthy tool for enhancing chatbots with production-grade capabilities and entity extraction, while LlamaIndex is positioned as a flexible data orchestration tool comparable to Amazon Kendra. A Substack article showcases a self-learning RAG that utilizes recursive retrieval, automated knowledge graph creation, and memory/multi-hop reasoning, pushing boundaries in AI applications.

  • Continuous Integration and Learning in AI Applications: The sharing of resources, queries, and solutions within the LlamaIndex guild underscores a vibrant community dedicated to ongoing learning, integration of cutting-edge tools, and exploration of advanced AI techniques. These discussions reveal a concerted effort to harness the full potential of AI technologies like LlamaIndex and Zep, in various domains including bots, data retrieval, and self-learning knowledge graphs.
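
A minimal sketch of the similarity_top_k customization discussed above, against the LlamaIndex Python API of the time; the data directory, question, and exact kwarg forwarding (which varies by version) are assumptions:

```python
from llama_index import SimpleDirectoryReader, VectorStoreIndex

# Build an index over local documents ("data/" is a placeholder path).
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# similarity_top_k controls how many nodes the retriever returns per turn;
# as_chat_engine forwards retriever kwargs like this one.
chat_engine = index.as_chat_engine(similarity_top_k=5)
response = chat_engine.chat("What does this policy cover for water damage?")
print(response)
```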


Latent Space Discord Summary

  • MORPHEUS-1 Dreams Big: Prophetic AI announces MORPHEUS-1, the first multi-modal generative ultrasonic transformer for lucid dream induction, targeting a beta release in Spring 2024. The reveal tweet promises groundbreaking capabilities.
  • Rapid YAML Custom Tags Experimentation: go-go-golems showed off an ambitious project milestone, achieving 5k lines of code in just 4 days, as part of an experiment with yaml-custom-tags. Their progress can be tracked on GitHub.
  • Martian’s LLM Evaluation Tool Launches: Martian introduced a tool to dynamically evaluate which LLM inference product to use based on cost, throughput, and time to first token (TTFT), across providers like Anyscale and Fireworks. The launch was detailed here, with accompanying evaluation methodology.
  • LLM Paper Club Asia Edition Kickoff: The LLM Paper Club expands to Asia with a session on the seminal paper “Attention Is All You Need”. Those interested are encouraged to sign up and engage via a specific Discord link.
  • Pythia Paper Next on US Club’s Agenda: The US paper club’s next discussion will focus on the Pythia paper, which explores metrics for analyzing Large Language Models (LLMs) throughout training and scaling. The paper and authors are detailed in this link.

DiscoResearch Discord Summary

  • Mixtral Finetuning Gets a Deep Dive: In mergekit’s GitHub issue, @philipmay initiated a discussion on the finetuning potential of Mixtral models, exploring the “hidden” and “random” options. The conversation underscored auxiliary loss as crucial for effective MoE training (a sketch of the standard load-balancing loss follows this list).

  • Rethinking Data Quality for AI Models: A new study highlighted in the community argues against traditional data quality filtering methods, suggesting a model-aware approach for data selection could lead to better performance.

  • Kahneman-Tversky Optimisation (KTO) Explored: The community discussed KTO, comparing it with Direct Preference Optimization. This method, simplified via binary good and bad training examples, was evaluated for its implementation viability in contextual AI and production model updates, with resources like Hugging Face’s documentation serving as guides for potential adoption.

  • OpenAI’s Strategic Model and Pricing Updates: Significant updates were shared regarding OpenAI’s launch of an updated GPT-4 Turbo and a price reduction for GPT-3.5 Turbo, as detailed in an official tweet. These changes signal strategic shifts in embedding options and pricing, indicating an evolving landscape for AI model accessibility.

  • Embedding Innovations and German AI Model Enhancements: The community discussed the upcoming release of a new German Jina model for ranking tasks and shared insights into using Mixtral for advanced question generation and embedding development. A highlight was the new embedding models and API updates from OpenAI, hinting at expanded capabilities in multilingual support. Further, the Genie method, suggested by a shared paper, promises a novel approach for generating high-quality data, potentially advancing the creation of more effective question-answer pairs or summaries.
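
As flagged in the Mixtral finetuning item above, an auxiliary loss keeps MoE routing balanced during training. Here is a sketch of the standard Switch-Transformer-style load-balancing loss, assuming this is the auxiliary loss the thread had in mind:

```python
import torch

def load_balancing_loss(router_probs: torch.Tensor,
                        expert_mask: torch.Tensor,
                        alpha: float = 0.01) -> torch.Tensor:
    """Switch-Transformer-style auxiliary loss: alpha * N * sum_i(f_i * P_i).

    router_probs: [num_tokens, num_experts] softmax outputs of the router.
    expert_mask:  [num_tokens, num_experts] one-hot top-1 routing decisions.
    Minimized when tokens are spread uniformly across experts.
    """
    num_experts = router_probs.shape[-1]
    f = expert_mask.float().mean(dim=0)  # fraction of tokens routed to each expert
    p = router_probs.mean(dim=0)         # mean router probability per expert
    return alpha * num_experts * torch.sum(f * p)
```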


LLM Perf Enthusiasts AI Discord Summary

  • OpenAI Revamps with Fresh Embedding Models and API Tools: OpenAI has launched a new wave of embedding models, an updated GPT-4 Turbo, and moderation tools, aiming at improved performance and cost efficiency for developers. This suite includes two new embedding models, enhancements to GPT-4 Turbo, a novel text moderation model, and forthcoming lower prices for GPT-3.5 Turbo, with a clear stance on not using customer data for model training.

  • Community Buzzes Over New Embedding Capabilities: The introduction of new embedding models and updates to GPT-4 Turbo stirred discussions among community members, with specific excitement around the new support for shortened embeddings. Additionally, comparison of the new large embedding model against bge-large revealed a slight performance edge, underscoring OpenAI’s advancements.

  • Upgrade Conversations Sparked by OpenAI Updates: With OpenAI’s latest offerings, community members are considering system upgrades to leverage the new embedding models (see the sketch after this list). The debate weighs potential cost savings from the newer, larger embedding model’s dimension shortening, comparing embedding expenses against savings on vector database storage.

  • Efficiency and Cost Implications Discussed: Conversations delved into the efficiency and potential cost benefits of OpenAI’s newest embedding model, weighing the performance against storage and API costs. The dialogue highlighted a leaning towards the v3-large model for its improved performance and the promise of reduced storage costs, mirroring a broader interest in maximizing both operational efficiency and cost-effectiveness.

  • Misplacement of Announcement Draws Minor Attention: A notable mention was a suggestion to redirect the discussion on the new embedding models and API updates to a more appropriate channel, a minor hiccup in communication flow within the community. This served as a procedural note on channel relevance, highlighting the importance of targeting the right audience within communal platforms.
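
A short sketch of the dimension-shortening capability referenced above, using the announced embeddings endpoint; the input string and target dimension are arbitrary choices:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# text-embedding-3-large natively returns 3072 dimensions; the new
# `dimensions` parameter shortens the vector (with renormalization),
# trading a little quality for cheaper vector-database storage.
resp = client.embeddings.create(
    model="text-embedding-3-large",
    input="The quick brown fox jumps over the lazy dog.",
    dimensions=256,
)
vector = resp.data[0].embedding
print(len(vector))  # 256
```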


LangChain AI Discord Summary

  • LangChain Powers Up Projects and Personalities: @quarknova, an ENS student, is exploring LangChain for a project, raising questions about GitHub versus commercial versions. Meanwhile, creating AI personalities like ā€œan Elon Musk AIā€ involves finetuning, with a short course recommended for practical insights.

  • Rapid Chatbot and Data Generation Developments: Users celebrated the speed and ease of building a web-search chatbot with LangChain and Streamlit, while also discussing the role of LLMs in generating synthetic data for machine learning training and RAG generation. For PARQUET file enthusiasts, the solution involves pandas and DataFrameLoader for integration into LangChain (see the loader sketch after this list).

  • Diving Deeper with LangServe: LangServe enthusiasts are directed to examples on GitHub, including tips for constructing custom tools or agents with an emphasis on LCEL and LangGraph for added expressiveness, available here and here. Also, troubleshooting stream response issues in agent implementations highlighted ongoing discussions.

  • Exploring Context Awareness and SQL Integration: dejoma raised curiosity about LangChain AI’s use of webpage context, indicating interest in the AI’s context-awareness capabilities. johnny2x2 shared a breakthrough using LangChain to automate analysis of late customer orders from ERP SQL data in manufacturing, implemented with Mixtral 7B v2 5Q 32K for effective database management and specialized task loops.
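
A minimal sketch of the Parquet-to-LangChain path mentioned above; the file path and column name are placeholders, and the loader’s import path varies across LangChain versions:

```python
import pandas as pd
from langchain_community.document_loaders import DataFrameLoader

df = pd.read_parquet("articles.parquet")  # placeholder path
# page_content_column names the column that becomes document text;
# the remaining columns travel along as metadata.
loader = DataFrameLoader(df, page_content_column="text")
docs = loader.load()
```

And a hedged sketch of the ERP-SQL pattern from the last bullet, using LangChain’s SQL utilities; the connection string, schema, and model choice here are assumptions, not details from the discussion:

```python
from langchain.chains import create_sql_query_chain
from langchain_community.llms import Ollama
from langchain_community.utilities import SQLDatabase

db = SQLDatabase.from_uri("sqlite:///erp.db")  # placeholder ERP connection
llm = Ollama(model="mixtral")                  # any capable local or hosted LLM

# create_sql_query_chain turns a natural-language question into a SQL query;
# executing it and summarizing results is left to the surrounding task loop.
chain = create_sql_query_chain(llm, db)
sql = chain.invoke({"question": "Which customer orders are past their due date?"})
print(sql)
```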


Datasette - LLM (@SimonW) Discord Summary

  • Major LLM Upgrade on the Horizon: @simonw announced an upcoming LLM update that promises substantial improvements to the openai library, with comprehensive testing instructions listed on GitHub.
  • Sneak Peek into LLM’s Future Developments: For a detailed view of what lies ahead, @simonw shared insights through the 0.13 Milestone on GitHub, highlighting the planned enhancements for the command-line access to large language models.
  • Community Call to Tackle Readline Bug: @simonw is calling for community support to fix a readline bug in LLM chat that causes arrow keys to produce ANSI codes rather than navigating text, with more information available on GitHub.

Skunkworks AI Discord Summary

  • Curiosity Stirs in Skunkworks: Arielnlee inquires whether there’s ongoing work on bakklava-2, sparking discussion of future developments in the bakklava-1 channel.
  • Mystery Link Dropped: Pradeep1148 shares an enigmatic YouTube video in the #off-topic channel, content and context unspecified.

PART 2: Detailed by-Channel summaries and links

TheBloke ▷ #general (1212 messages🔥🔥🔥):

  • UnSloth Heading Towards Multi-GPU Support: UnSloth’s development is actively discussed, with a hint that limited multi-GPU support may be introduced months later. The OSS version is notably targeted at Google Colab beginners, sparking discussions on its ease of use for newcomers in AI research.

  • AI-Developed Discord Bots and Fine-Tuning Chatter: Members, including @flail_., discuss their adventures into Discord bot making with Python, acknowledging the convenience of Python libraries despite personal preferences for other languages. There’s also a light-hearted debate on the professional credentials one could claim from fine-tuning language models.

  • Academicat’s Use in Research and Ethical Funny Side: @kaltcit shares insights on using mouse fingers for PCR in biomedical research, humorously noting the ability to crack jokes post-amputation. This opens an ethical and scientific discussion on research methodologies and the use of animals.

  • Concerns About the Open LLM Leaderboard: Discussions reveal skepticism towards the Open LLM Leaderboard, particularly regarding the quality of models it promotes. @fiddlenator criticizes the leaderboard for endorsing “trash merge models,” sparking a conversation on the validity and evaluation of such models.

  • Tiny Llama’s Nintendo Switch Execution Sparks Interest: Members express astonishment and curiosity as @kalomaze shares links to videos purportedly showing AI models like Tiny Llama and Mistral running on a Nintendo Switch. This unique execution leads to discussions on the potential for lightweight model deployment on unconventional hardware.

Links mentioned:


TheBloke ▷ #characters-roleplay-stories (74 messages🔥🔥):

  • Technical Inquiries on Model Configurations: Users in the channel discussed the configurations of various models like exllama, bagel, and dolphin, focusing on aspects like rope_theta and whether to use a sliding_window. For example, @dreamgen pointed out differences in configuration files for models, highlighting discrepancies in rope_theta values and the use of sliding_window (Hugging Face Bagel Config, Hugging Face Dolphin Config).

  • Discussions on Role-Playing Models: Various users shared and sought advice on the best role-playing models with 7 billion parameters, with @kalomaze recommending models like Kunoichi DPO v2 and Fett-uccine (Fett-uccine on Hugging Face) while also discussing suitable quantization options for different VRAM capacities.

  • Concerns About Sensitive Content Models: The conversation briefly touched upon the use of models for generating content that could be considered sensitive, with a general consensus towards responsible use and the application of existing models through careful prompting, highlighted by @c.gato and @superking__.

  • Clarifications and Corrections: Users like @jondurbin provided clarifications regarding their use of model configurations inherited from base models, adding context to the discussion around customizing models for specific purposes (Mistral base model config).

  • Token Streaming and Text-to-Speech (TTS) Integration Efforts: @mrdragonfox and @stoop poops shared their progress and challenges in integrating token streaming and TTS capabilities into bots, aiming for real-time implementation and exploring efficient APIs and engines like realtimeTTS with coqui engine.

Links mentioned:


TheBloke ▷ #training-and-fine-tuning (20 messages🔥):

  • Crafting a Character for Chatbots: @superking__ suggested writing examples of how a character would respond to different user inputs and training the model on this dataset. They further mentioned the possibility of asking ChatGPT to help generate this dataset.

  • Quantity Matters in Training Chatbots: In a discussion about the number of examples needed for training a chatbot, @amogus2432 mentioned that between 10 and 100 examples should suffice for style transfer in training a qlora. However, @dirtytigerx countered that for Samantha-level performance, one would need around 6k multiturn conversation samples, suggesting that with only 100 samples, one might end up overfitting.

  • Personal Touch in Chatbot Conversations: @lordofthegoons expressed a desire to create a chatbot with a specific persona and consistent conversation style, akin to the Samantha model. They voiced concerns about generating a synthetic dataset that achieves this without seeming repetitive or limited.

  • Exploring Cheaper Alternatives for Training Chatbots: Discussing the potential cost of using GPT-4 API for generating a dataset, @dirtytigerx recommended exploring local Large Language Models (LLMs) as a more economical option. They mentioned using platforms like runpod for experimenting could be cheaper and more efficient than relying solely on services like ChatGPT, which may encounter rate limits.

  • Venturing into Financial Advisor Chatbots: @VR inquired about creating a personal financial investment advisor chatbot, contemplating the use of prompt tuning and fine-tuning on financial documents and datasets. They sought advice on how to effectively deploy a model within the constraints of a 24GB GPU and leverage the latest stock prices, summaries, trends, and expert analyses.


TheBloke ▷ #model-merging (19 messages🔥):

  • Optimal Weight Hypothesis Discussed: @sanjiwatsuki shared a hypothesis that maintaining a weight slightly over 1.0 might be optimal for model performance due to the TIES resolution process causing some of the effective weight to drop. The idea is to compensate for this anticipated drop.
  • Uncertainty Over Negative Weights: When @kquant asked if negative numbers break the script, @sanjiwatsuki expressed uncertainty but suggested that the code might handle it without issues based on their rough memory.
  • Exploration of Censored Model Assimilation: @kquant mentioned a curiosity about experimenting with super censored models and whether it’s possible to assimilate models selectively, taking mostly desired features.
  • Selective Merging with DARE and SLERP: @kquant proposed using techniques like DARE or SLERP for merging models with high performance in different areas, such as ARC and MMLU, to optimize for multiple strengths without simply averaging scores (see the SLERP sketch after this list).
  • Intrigue Over Successful Overfit Model Merging: @kquant shared surprise at two overfit models maintaining their test positions after being merged through SLERP, questioning the expected negative impact of overfitting on such operations.
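
As referenced in the DARE/SLERP item above, SLERP interpolates along the arc between two weight vectors rather than averaging them linearly. A minimal NumPy sketch of the operation, which real merge tools apply tensor by tensor:

```python
import numpy as np

def slerp(w_a: np.ndarray, w_b: np.ndarray, t: float) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight tensors."""
    a = w_a / np.linalg.norm(w_a)
    b = w_b / np.linalg.norm(w_b)
    omega = np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return (1 - t) * w_a + t * w_b  # near-parallel vectors: plain lerp
    return (np.sin((1 - t) * omega) * w_a + np.sin(t * omega) * w_b) / np.sin(omega)

# t=0.5 blends two checkpoints evenly along the sphere.
merged = slerp(np.random.randn(4096), np.random.randn(4096), t=0.5)
```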

TheBloke ▷ #coding (1 message):

  • LangChain Fine-Tuning Query by Newbie: @nandavikas inquired about fine-tuning Llama2 using LangChain for extracting specific information from PDFs as a dictionary. They shared their experience with PyTorch and sought guidance on a similar process or useful documentation within LangChain’s ecosystem.

OpenAI ▷ #ai-discussions (35 messages🔥):

  • GPT-4 Speed Discrepancy Concerns: @romansh2302 expressed frustration with the long processing times of GPT-4-1106-preview for document analysis through the API, experiencing delays of 3-7 minutes compared to faster speeds on the web version. @rendo1 and @lugui explained that performance variances could be due to API usage peaks or server resource allocation, suggesting that there isn’t a straightforward fix beyond possibly upgrading service plans for dedicated capacity.

  • Queries on GPT-4 Turbo Stability: When asked whether a stable GPT-4 Turbo release would address the speed issues, @elektronisade clarified that GPT-4 Turbo doesn’t inherently support direct file analysis, indicating a possible misunderstanding of feature availability.

  • Alternatives and Cost Inquiries for Enhanced API Service: @romansh2302 considered using GPT-3.5 Turbo as an alternative for document analysis but found it to be less effective. Dialogues with @lugui led to the revelation that dedicated API capacity might be available but at undefined costs tailored to individual use cases, primarily targeting corporate clients.

  • Unusual Activity and Error Troubleshooting: @hellomyfriend1576 encountered an error message about ā€œunusual activityā€ from their system when using GPT-4, with attempts to resolve through clearing cookies and network changes. @lugui and @og_tort speculated it could be related to VPN use, the nature of prompts, or account sharing practices.

  • Discussion on Dall-E Misinterpretations: @alwayspercipient noted Dall-E’s frequent misspellings in image creations, with @muyfashionista suggesting that this issue might be less pronounced with shorter text inputs and providing links to community discussions and tips for minimizing grammatical errors in generated images.

Links mentioned:

TuringsSolutions/PFAF750 · Datasets at Hugging Face: no description found


OpenAI ▷ #gpt-4-discussions (105 messages🔥🔥):

  • Clarification on “Always Expand Code Output” Feature: @angry_coder inquired about the “Always expand code output” setting, leading to @darthgustav. clarifying that it ensures code blocks are wrapped for easier reading. After testing, @darthgustav. confirmed it pertains to wrapped code blocks, enhancing readability.

  • Exploring Python Code Importation: Users @bambooshoots and @darthgustav. discussed the possibility of importing Python modules by uploading a zip file and adjusting the system path (see the sketch after this list). Despite initial security concerns, they speculated about its potential utility and extended functionality within OpenAI restrictions.

  • CustomGPT Edits Require New Conversations: @elegante94 asked if edits to a CustomGPT would reflect in active sessions, to which @darthgustav. responded that a new conversation is necessary to experience changes. This was a point of confusion for @elegante94 who had been making iterative edits under this misconception.

  • Optimizing Prompt Language for GPT: @elegante94 sought advice on whether attaching images or using complex wording would improve GPT outputs for creative imagery. @darthgustav. advised that GPT cannot interpret images in prompts and recommended precise language for better outputs, while also acknowledging the potential exploration of themes like multi-agent interactions via Microsoft Autogen and CrewAI.

  • Challenges with GPT Bot Updates and Mimicry Restrictions: @rodney.leonardo experienced difficulties saving updates to a GPT bot designed to serve as a product design assistant. @darthgustav. suggested troubleshooting steps and noted restrictions against mimicking living individuals, which led to a realization that naming specific living persons like “Jony Ive” may block the bot from being saved.
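
A sketch of the zip-upload technique from the second bullet, as it might look inside the Python tool; the archive name and module are hypothetical, and /mnt/data is where uploaded files land in the sandbox:

```python
import sys
import zipfile

# Extract an uploaded archive of Python modules, then make it importable.
zipfile.ZipFile("/mnt/data/my_modules.zip").extractall("/mnt/data/my_modules")
sys.path.append("/mnt/data/my_modules")

import my_helper  # hypothetical module shipped inside the zip
```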


OpenAI ▷ #prompt-engineering (558 messages🔥🔥🔥):

  • Exploring Paragraph Chunking and NLP for Sermon Transcription: @doublefelix is working on a project involving the transcription of sermons, seeking to split large blocks of text (up to 50KB files) into manageable paragraphs and correct grammar issues. They explore the use of GPT-3.5 and NLP techniques as potential solutions, facing challenges with input size limits and finding adequate strategies for effective paragraph chunking.

  • The Path Towards an Efficient Workflow: After multiple trials with GPT-3.5, including varied prompt strategies and attempts at leveraging API functionalities, @doublefelix encounters obstacles in ensuring that the AI properly acknowledges and processes the entirety of the input without hallucinations or significant errors.

  • Considering GPT-4’s Capabilities for Enhanced Context Management: @darthgustav suggests utilizing GPT-4 Turbo, which can handle larger context windows and leverage Python tools for semantic analysis and paragraphing, thereby potentially bypassing the limitations encountered with GPT-3.5. This approach may offer a more efficient workflow for processing and splitting the transcription files.

  • Reflecting on Project Costs and Constraints: Concerns arise regarding the increased costs associated with moving to GPT-4 Turbo for processing the transcriptions. @doublefelix expresses a desire to find a balance between functionality and affordability, considering both the manual labor involved and the financial implications of using a more advanced AI model.

  • Exploration of Alternative NLP Tools and Final Thoughts: As @doublefelix weighs their options, including a potential fallback plan using GPT-4 Turbo, they also plan to explore NLP packages such as semantic-text-splitter for a possible solution to their text processing needs. Gratitude is expressed for the discussion and insights provided, highlighting the complexities of prompt engineering and the nuances of leveraging AI models for specific project goals.

Links mentioned:

How do you maintain historical context in repeat API calls?: Each time I make a call to the API it starts off with no prior context, unlike the chat.openai.com scenario. Is there a way to maintain state of the model during a session? response = openai.Completi…


OpenAI ▷ #api-discussions (558 messages🔥🔥🔥):

  • Exploring GPT’s Limitations on Large Texts: @doublefelix ventured into a discussion on how to effectively split a large block of text into smaller, manageable segments for paragraphing and grammar correction using GPT. Initially attempted with GPT-3.5, they faced challenges with the AI ignoring parts of the text or hallucinating content.

  • Semantic Clustering and Paragraphing Strategies: @darthgustav suggested multiple strategies to aid @doublefelix in his endeavor, including semantic clustering for paragraph breaks and leveraging Custom GPT or GPT-4 Turbo’s advanced capabilities and Python Tool for more accurate text processing.

  • The Cost of Processing Large Texts: The conversation also touched upon the cost implications of processing large volumes of text. @doublefelix discovered that the processing costs with GPT-3.5 were higher than initially anticipated, prompting a consideration of GPT-4 Turbo despite its higher cost.

  • NLP vs AI in Text Segmentation: There’s an ongoing debate whether Natural Language Processing (NLP) tools or AI (specifically GPT models with enhanced context windows and Python capabilities) would serve better in segmenting and correcting large transcription files. @doublefelix is considering NLP tools, found a potential package on PyPI, and remains open to utilizing GPT-4 with structured instructions for improved efficiency (a rule-based chunking sketch follows this list).

  • A Quest for Automated Text Structuring: Towards the end, @doublefelix reflected on the journey, acknowledging the challenges faced and the insights gained from interacting with GPT models and structured instructions for segmenting large text files. Though some progress was made with GPT-3.5, there remains a pull towards exploring GPT-4’s capabilities or NLP tools for a more automated approach.
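
For reference, a minimal rule-based sketch of the paragraph-chunking step debated above, in plain Python rather than any particular NLP package; the character limit is an arbitrary assumption:

```python
import re

def chunk_paragraphs(text: str, max_chars: int = 2000) -> list[str]:
    """Greedily pack sentences into chunks no longer than max_chars."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks

# Each chunk can then be sent to the model separately for grammar cleanup.
```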

Links mentioned:

How do you maintain historical context in repeat API calls?: Each time I make a call to the API it starts off with no prior context, unlike the chat.openai.com scenario. Is there a way to maintain state of the model during a session? response = openai.Completi…


Nous Research AI ▷ #ctx-length-research (7 messages):

  • Exploring Best Solutions for Extending Context Capabilities: @cryptossssun inquired about the best current methods for enhancing context capabilities, stirring a constructive dialogue among members. @_cherrry recommended fine-tuning based on a discussed paper as a viable method.

  • Debate on Mistral’s Strategy Changes: Raising questions about Mistral Instruct v0.2, @dreamgen highlighted the disabling of the sliding window technique in favor of scaling rope_theta, sharing the config.json to underline the shift (the relevant fields are excerpted after this list). This prompted speculation on whether sliding-window attention is less effective for long contexts.

  • MistralLite Mimics Mistral Strategy: Further inquiries by @dreamgen led to observations that amazon/MistralLite employs a similar approach to Mistral regarding context window tactics, keeping the discussion focused on evolving model configurations.

  • Impressive Context Window Extension Achievements: @stellaathena showcased an extraordinary feat: extending LLaMA-2-7B-Chat’s context window to 16,384 with minimal samples and training steps. The method boasts remarkable efficiency, drawing amazed reactions.

  • SelfExtend Suggested as Fine-Tune-Free Option: When @cryptossssun asked for advice on extending context capabilities, @leontello pitched SelfExtend as an innovative solution for those opting not to fine-tune. This introduces another angle to the ongoing discussion of enhancing model performance without intensive training.
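
For reference, the two config.json fields at issue, shown as a Python dict; values are per the linked v0.2 file, while the v0.1 comparison is from memory and worth double-checking:

```python
# Relevant fields from mistralai/Mistral-7B-Instruct-v0.2 config.json:
mistral_v02_fields = {
    "rope_theta": 1000000.0,  # scaled up from 10000.0 in v0.1
    "sliding_window": None,   # "null" in the JSON: sliding-window attention disabled
}
```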

Links mentioned:

config.json · mistralai/Mistral-7B-Instruct-v0.2 at main: no description found


Nous Research AI ▷ #off-topic (5 messages):

  • Polarized Effects of Technology: @ldj shared the view that advancement in technology might polarize society further, with the most incapable becoming more degenerate and the most capable engaging in even more self-improvement. This reflects the ongoing discussion about the socio-technological impact on different demographic strata.
  • Unexpected Laughter with Swimming Cat: A sudden humorous turn was introduced by @Error.PDF with a Cat Swimming GIF, lightening the mood amidst more serious tech discussions.
  • Twitter’s Mysterious Post Acceleration: @fullstack6209 observed a significant increase in the rate of new posts appearing on Twitter, from 2-3 every 10 minutes to about 70 a minute, raising questions about changes in the platform’s content delivery algorithms.
  • Speculations on Twitter Slowing Down AI: Following up, @fullstack6209 shared suspicions about Twitter intentionally slowing down AI to manage or control content flow, reflecting concerns about the interaction between social media platforms and artificial intelligence technologies.
  • One-Click Quantization of LLMs to GGUF: @pradeep1148 shared a YouTube video demonstrating how to quantize any HF LLM to the GGUF format in a single click, marking notable progress in making model quantization accessible.

Links mentioned:


Nous Research AI ▷ #benchmarks-log (2 messages):

  • Benchmarking Request for Everyone Coder 33B Base - GGUF: @231912337869635584 was asked by @benxh to conduct a human eval benchmark on the Everyone Coder 33B Base - GGUF model. This model, created by rombo dawg, was quantised using Massed Compute and supports the new GGUF format, a replacement for GGML.

  • Interest in Hermes Mixtral Performance: @teknium expressed a desire to see performance benchmarks for Hermes Mixtral but provided no further details or links.

Links mentioned:

TheBloke/Everyone-Coder-33B-Base-GGUF · Hugging Face: no description found


  • OpenAI Announces New Embeddings Models and API Updates: @tsunemoto shared a link detailing OpenAI’s launch of new embedding models, alongside updates to the GPT-4 Turbo and moderation models. There will be lower pricing for GPT-3.5 Turbo, new API usage management tools, and a commitment that data sent to OpenAI’s API will not be used for training their models.

  • Introducing Genie for Content-grounded Data Generation: @metaldragon01 highlighted a recently published paper on Genie, a novel method aiming to overcome the shortage of high-quality data for content-grounded generation tasks. Genie uses a three-stage process (Content Preparation, Generation, and a Filtering mechanism) to create task-specific examples that are natural and high-quality, notably for Long-Form Question-Answering (LFQA), summarization, and information extraction.

Links mentioned:


Nous Research AI ▷ #general (361 messages🔥🔥):

  • GPT-2 Inference on WebGL Explored: @_3sphere and @n8programs discussed implementing GPT-2 inference in WebGL, with n8programs sharing a detailed kernel for vector similarity computation using ThreeJS and GLSL. The conversation highlighted the potential for machine learning models to run directly in browser graphics pipelines.

  • LLaMA Cloning Attempts and Discussion: @balnazzar3047 and @euclaise discussed creating a Mixtral model similar to LLaMA but encountered challenges understanding the model-building process. Helpful resources and explanations were shared, including a Colab notebook by @qnguyen3 for initializing new models.

  • Debate on Model Efficiency: @carsonpoole shared updates on fine-tuning the phi2 model with RMSNorm and plans for further optimization, hinting at potential efficiency gains over traditional Transformer models. The approach sparked interest in comparing parallel residual architectures for inference and training speed.

  • Word2Vec on Steroids with User Feedback: @everyoneisgross innovated on Word2Vec by introducing a method to refine and expand corpus data based on user inputs, utilizing Mistral Instruct for additional context. This approach, though simple, was praised for its effectiveness and potential in lightweight NLP tasks.

  • Harnessing GPU for Advanced ML Computations: @n8programs detailed the use of WebGL for machine learning, explaining how 3D textures and spatial thinking enable computations beyond typical GPU limits. This method, seen as “haxing on the back of video games,” showcases creative leveraging of gaming hardware for scientific computing.

Links mentioned:


Nous Research AI ▷ #ask-about-llms (35 messages🔥):

  • Specs for Finetuning CodeLlama 34B Debated: @ganymede123 inquired about the ideal workstation specs for finetuning CodeLlama 34B, considering 4xA6000 GPUs. @teknium responded that this setup would only suffice for a qlora, suggesting a full DGX might be necessary for complete finetuning.

  • Troubleshooting T5 Fine-Tuning Issues: @maxpappa discussed problems with aligning a fine-tuned version of T5, experiencing deterministic outputs and steady reward-accuracies. Suggestions from @locutusque and @carsonpoole included avoiding paged 8bit Adam, considering numerical instability in T5, and clamping infs, especially in the encoder.

  • Exploring LLMs for Offensive Cyber: @useewhynot sought recommendations for LLMs suitable for offensive cyber operations or Capture The Flag (CTF) competitions. @kenakafrosty and @georgejrjrjr recommended WhiteRabbitNeo models available on HuggingFace, highlighting their focus on such tasks.

  • Libraries for LLM Fine-Tuning: In response to @moconna asking about preferred libraries for LLM fine-tuning, @kenakafrosty mentioned using trl and expressed a preference for axolotl, indicating its benefits for certain applications.

  • Choosing the Best Coding LLM and Fine-Tuning Advice: @findmyke queried about the current best LLM for coding, with @.ben.com directing them to the EvalPlus leaderboard for comparison. Moreover, @moconna inquired about essential hyperparameters for fine-tuning Mistral with Llama Factory, indicating a focus on learning rate and seeking suggestions for defaults or templates.

Links mentioned:


Nous Research AI ▷ #project-obsidian (3 messages):

  • Seeking Simple 3b Obsidian Python Script: vic49. is in search of a simple Python script that utilizes the transformers library, with remote code execution enabled, for working with the 3b Obsidian model.
  • Code Refactoring in Progress: qnguyen3 responded by promising a refactoring of the entire code to ensure compatibility with the latest llava repo, addressing vic49.’s request.

OpenAccess AI Collective (axolotl) ▷ #general (219 messages🔥🔥):

  • Fine-tuning Mistral for Norwegian: @henriklied is fine-tuning Mistral on a collection of 100k articles for title generation in Norwegian, with specific model parameters shared. However, @le_mess suggests stopping at 4 epochs to avoid overfitting and advises that 10 epochs might be too many iterations for a limited dataset.

  • shareGPT Shortcomings in Handling Long Conversations: @c.gato raises concerns about shareGPT’s handling of conversations that exceed a predefined context length, indicating that long conversations are dropped rather than trimmed to maintain context length. @le_mess confirms this behavior is by design in Axolotl for anything other than completion tasks.

  • Discussion on Effective Chat Representations: @suikamelon and @c.gato debate the efficacy of ChatML versus raw chat formats for managing conversation contexts in AI models, with preferences varying based on token efficiency and flexibility. Additionally, @dreamgen and @c.gato discuss the potential benefits of alternative representations like BlockML and IRC formatting.

  • Training Model Considerations: @dreamgen inquires about single-GPU performance for H100 vs A100 in the context of training models like QLoRA, with @c.gato noting the cost-efficiency of H100 for a specific task. Discussions also touched on changing rope theta and its relation to scaling techniques in model training.

  • Challenges with Deploying AI Models: @dangfutures seeks assistance for hosting their model, highlighting difficulties with serverless deployment and discussing potential solutions. Conversations highlight the complexities involved in selecting and deploying the appropriate hosting solutions for AI models.

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (1 message):

  • Active Development Leads to Message Deletion: @gahdnah deleted their previous message after noticing the area of discussion is still under active development, following a check of the latest commits. No specific details about the development area or commits were provided.

  • A Grant Celebration: The axolotl-dev community celebrated receiving a grant with enthusiastic emojis. The message contained no details on the grant’s purpose or the grantor.


OpenAccess AI Collective (axolotl) ▷ #general-help (47 messages🔥):

  • Solutions for Training QLoRA Without bitsandbytes: @matanvetzler inquired about training QLoRA models without bitsandbytes for compatibility with vLLM. @le_mess and @stefangliga suggested alternatives like using fp16 or AutoAWQ for quantization, and merging trained QLoRA adapters into the base model (a merge sketch follows this list). Additionally, @stefangliga shared a GitHub link for merging and a tweet by Tim Dettmers advocating model quantization before merging.

  • Discussion on Model Merging Strategies: @nanobitz and @c.gato engaged in discussions about the efficacy and potential issues of merging models with different quantization adapters, indicating a slight improvement in performance but also highlighting complexities in merging multiple adapters. @c.gato mentions a 1-2% improvement but advises caution due to potential issues.

  • Training and Evaluating Models with Custom Prompts: @sadaisystems sought advice on configuring custom prompts and dataset formats for model training. Concerns were raised about unusually low training losses, to which @caseus_ and others suggested this might be expected behavior for deterministic tasks like SQL queries. The conversation emphasizes the importance and challenges of interpreting model training performance.

  • Continuous Pretraining and Benchmark Evaluation on Mistral: Queries about continuing pretraining on Mistral and evaluating models during training were raised by @nickbro0355 and @sadaisystems. @caseus_ provided clues on using continuous pretraining and introduced benchmark evaluation during training with datasets like dharma-1 on HuggingFace, recommending to set do_bench_eval: true and suggesting bench_dataset: pharaouk/dharma-1/dharma_1_full.json for relative performance improvement checks.

  • Documentation and Configuration Tips for Axolotl Users: The absence of certain parameters (e.g., bench_dataset) from axolotl’s main README was noted, indicating potential areas for documentation improvement. The discussions underscore the benefits of these configurations for test runs and the iterative nature of model development within the community.
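
A hedged sketch of the adapter-merge step referenced in the first bullet, using the PEFT API; the model and adapter paths are placeholders:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the full-precision base model, attach the trained QLoRA adapter,
# then fold the adapter weights into the base so runtimes without
# bitsandbytes (e.g. vLLM) can serve the merged result directly.
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.float16
)
model = PeftModel.from_pretrained(base, "path/to/qlora-adapter")
merged = model.merge_and_unload()
merged.save_pretrained("merged-model")
```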

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #datasets (7 messages):

  • New DPO Dataset Announced: dangfutures shared a link to a new dataset on Hugging Face, used for training the Snorkel model with emphasis on not using external LLM responses, but only prompts from UltraFeedback. The methodology involves generating 5 response variations for each prompt, using LLM for response reranking, and applying Direct Preference Optimization (DPO).

  • Mistral 7B gets a Tune-Up: dangfutures expressed enthusiasm about Mistral 7B being fine-tuned, suggesting significant improvements or updates to the model.

  • Alpaca Eval Numbers Discussed: dangfutures initially mentioned a numerical figure in relation to Alpaca Eval, later clarifying it to be a 34 percent figure, which they deemed an improvement over old GPT-4 metrics.

  • Reaction to Discussion: _dampf responded to the ongoing discussion with a GIF from Tenor, implying a humorous or expressive reaction to the shared information about datasets and model performance.

Links mentioned:


OpenAccess AI Collective (axolotl) ā–· #rlhf (2 messages):

  • Seeking Clarity on DPO Training Plots: User noobmaster29 asked if any resources are available for understanding DPO training plots, implying a need for documentation or guides to interpret data from DPO training sessions.
  • Troubleshooting DPO Dataset Issues: noobmaster29 inquired about the necessary components of a DPO dataset, specifically whether anything beyond a prompt/input and a chosen/rejected pair is required. They mentioned having issues with dataset processing despite including these columns.
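
For reference, a minimal sketch of the columns a DPO dataset conventionally carries, following the prompt/chosen/rejected layout used by trainers such as TRL's DPOTrainer (noobmaster29's exact processing setup is unknown):

```python
from datasets import Dataset

# Each row pairs one prompt with a preferred and a rejected completion.
dpo_data = Dataset.from_list([
    {
        "prompt": "What is the capital of France?",
        "chosen": "The capital of France is Paris.",
        "rejected": "France does not have a capital.",
    },
])
print(dpo_data.column_names)  # ['prompt', 'chosen', 'rejected']
```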

OpenAccess AI Collective (axolotl) ā–· #community-showcase (1 messages):

pradeep1148: https://www.youtube.com/watch?v=wlPxEq_Mtkc


LM Studio ā–· #šŸ’¬-general (118 messagesšŸ”„šŸ”„):

  • Proxy Support Inquiry and Frontier Exploration: Users expressed a need for proxy settings in LM Studio because the model search function is unusable in regions where HuggingFace is blocked, a limitation raised in particular by @laooopooo_02864. @wildcat_aurora shared a GitHub link to LM_Chat_TTS_FrontEnd, which lets users interact with LM Studio models via text-to-speech from a mobile device.

  • LM Studio and Model Loading Challenges: Various users encountered issues with LM Studio, such as @xmiruso and @bagua facing challenges with GPU detection and model loading errors, respectively. Solutions ranged from ensuring GPU offload settings to switching to alternative model versions.

  • Interactive Learning Environments in the Classroom: @fabguy provided insights on setting up LM Studio for classroom use, suggesting a web interface on a server plus an OpenAI-compatible client for student interaction (a minimal client sketch follows this list). They also recommended checking out an Awesome LLM Web UI GitHub repository for frontend options.

  • Parallel Model Operations and Support Concerns: @therealtrebor inquired about running multiple models in parallel, revealing through testing that it’s beneficial for specific use cases, while heyitsyorkie indicated this requires running multiple LM Studio instances. Users, including @Leo - Moon Boyz šŸš€šŸš€šŸ›ø, sought ways to contact support for troubleshooting, directed to use specific error-reporting channels.

  • Silly Tavern Misconceptions and Data Privacy Discussions: The community cleared up misconceptions about Silly Tavern costing money, with @ptable and @technot80 confirming its free status. Discussions on ensuring LM Studio’s data privacy involved suggestions like operating in a docker or VM environment with no external internet access, with @agcobra1 seeking alternative assurances and @bigdave7541 suggesting the use of wireshark for validation.
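
On the OpenAI-compatible client point: LM Studio's local server speaks the OpenAI API, so any stock OpenAI client can drive it. A minimal sketch; the port and model name are assumptions (match whatever the server tab shows):

```python
from openai import OpenAI

# Point the stock OpenAI client at LM Studio's local server (default port 1234).
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

resp = client.chat.completions.create(
    model="local-model",  # LM Studio serves whichever model is currently loaded
    messages=[{"role": "user", "content": "Explain photosynthesis in one sentence."}],
)
print(resp.choices[0].message.content)
```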

Links mentioned:


LM Studio ā–· #šŸ¤–-models-discussion-chat (54 messagesšŸ”„):

  • C++ Redistributables Solve Model Loading Error: @rparada encountered repeated errors when loading various models like Stable Code, Deepseek, and Codellama. The issue was resolved following @heyitsyorkie’s advice to update C++ redistributables, demonstrating the importance of keeping system components up-to-date for model compatibility.

  • No Multimodal Model Rivals GPT-4 Yet: In response to @mudf00t asking for a multimodal model comparable to GPT-4, @heyitsyorkie clarified that there are no comprehensive models yet that combine vision and text generation, highlighting the current limitations in the AI model landscape.

  • Azure OpenAI Service for GPT-4 Usage Discussed: @mickael6102 shared their company’s use of GPT-4 via Azure’s OpenAI service, sparking a conversation on cloud versus local model deployment led by @vbwyrde. Concerns around data privacy, costs, and dependency on third-party cloud services were major discussion points.

  • InternLM Sparks Interest: @vbwyrde highlighted a new model called InternLM, noting its claims of an 11 percent better reasoning ability than GPT-4 and advanced function-calling capabilities. This incited interest as an alternative to other models, with @mickael6102 considering it for future projects.

  • Debate Over Corporate Open Source Contributions: Discussion led by @vbwyrde, @.gumdro, and @fabguy pondered the motivations behind Meta’s release of LLaMA2, speculating on strategic benefits such as creating standards and undercutting competitors. This conversation raises questions about the relationship between open source contributions and corporate strategy in the AI field.

Links mentioned:


LM Studio ā–· #🧠-feedback (4 messages):

  • Bug Alert in MoE Model Setting: @msz_mgs reported a bug where changing from a 4X MoE model to a 2X MoE model with num_experts_used=4 generates an error message prompting a restart of the application, making it impossible to adjust the setting without restarting.
  • Investigation Initiated on MoE Model Issue: In response to @msz_mgs’s report, @yagilb acknowledged the problem and requested more information about the 2x MoE and 4x MoE models involved to aid in resolving the issue.
  • Clarification Offered on MoE Model Configuration: @dagbs provided a tip suggesting that for a 4X MoE model, the number of experts intended to be set is 2, offering insight into possibly intended configurations.
  • Performance Slowdown on 0.2.11 Update: @golangorgohome experiences significant performance issues with the 0.2.11 update on Windows 11, including delayed search icon response and slow search results, despite having a high-speed internet connection.

LM Studio ā–· #šŸŽ›-hardware-discussion (11 messagesšŸ”„):

  • GPU Preferences for Best Price/VRAM Discussed: @gitanos inquires about the efficiency and value of a 4060 Ti with 16 GB of VRAM, questioning if it’s currently the best choice in terms of price per GB of VRAM. @heyitsyorkie suggests considering a used 3090 on eBay instead, as it’s only slightly pricier in the UK and may offer better value.
  • Technical Glitches with 0.2.11: @madan.pandit reports that previously working models have ceased to function, asking whether others have had issues since the 0.2.11 update. @heyitsyorkie noted that GGML models have been deprecated in favor of the GGUF format in recent llama.cpp builds, breaking their compatibility.
  • Memory Issues with GGUF Models Reported: @madan.pandit mentions encountering insufficient-memory errors with GGUF models, indicating potential compatibility or resource-allocation issues following the recent updates.
  • Mac Studio as a Recommended Hardware for LLMs: @heyitsyorkie recommends purchasing a fully loaded M2 Mac Studio for running large language models (LLMs), pointing out its compact size, efficiency, and design aesthetics.
  • Debate Over P40 and M40 GPUs for Multi-GPU Setups: In a discussion about using P40 and M40 GPUs in conjunction, @docorange88 inquires about their collective performance, while @wildcat_aurora and @rugg0064 suggest P40s are a good investment, but deem M40s not worthwhile, highlighting plans to acquire P40s soon.

LM Studio ā–· #🧪-beta-releases-chat (47 messagesšŸ”„):

  • Choosing the Right LLM is an Art: @mmonir outlines criteria for selecting the optimal Large Language Model (LLM), emphasizing the importance of use cases like writing, coding, and summarizing, along with considerations such as quantization, hardware capabilities, and user experience. They also suggest leveraging leaderboards and LLM demos, and seeking user feedback from various platforms, for a more informed decision.

  • VRAM Misread on New Install: @mudf00t reports an issue where LM Studio is displaying 0 VRAM for an RTX 3090 on a new Nobara installation, to which @yagilb provides a workaround link specifically directed at Nvidia users, not applicable for Mac M2 silicon.

  • Troubleshooting Version Compatibility: @pdg faces issues with models not working with the final new version of LM Studio, experiencing unusual system behavior on their MacBook M2. They receive assistance from @yagilb including a link to revert to a previous version, 0.2.10, hinting at potential software regression issues.

  • Model Loading Woes and Workarounds: After downgrading LM Studio versions, @pdg continues to encounter errors with model loading but finds a temporary fix by initiating interactions with shorter sentences. This workaround allows for longer inputs to be processed successfully later on.

  • Context Length and Overflow Policy Insights: @mattjcly_55150 inquires about the context length settings @pdg might have used, suggesting adjustments in context overflow policies could resolve issues related to input length limitations. This implies that initial input length exceeding the set context length could be a root cause of the encountered errors.


LM Studio ā–· #autogen (1 messages):

  • Broken Pin Causes Tears: sunglasses.emoji reported that the pinned link for empty strings in Autogen Studio is broken and asked for help. They seek guidance on how to write a custom agent class in Autogen Studio.

LM Studio ā–· #open-interpreter (10 messagesšŸ”„):

  • Consistent Framework Issues with Model Utility: @pefortin highlights issues with multiple AI frameworks, including Open Interpreter, memGPT, and crewai, specifically regarding the inability of models to appropriately use available tools. They mention using mid to large models like mixtral8x7B and deepseek coder 33B, dismissing size as the problem.

  • Adventures in Model Testing with RTX 3090: @mudf00t shares they’re testing various models, leveraging the power of an RTX 3090 to handle larger models, pointing to the direct experience with hardware advantages in model experimentation.

  • API Knowledge Gap Frustrates Development: @mudf00t expresses frustration over models, including those provided by OpenAI, not being up-to-date with the current API, leading to context issues during app development processes.

  • Integrations and Fine-Tuning on the Horizon: @222gate notes the discontinuation of memgpt integration and shares plans to fine-tune a Mistral model for specific function calls, suggesting this approach may resolve some operative issues.

  • Mistral’s Creative Hallucinations: @mudf00t observes Mistral creating fictitious directory structures and code, underlining the model’s tendency to hallucinate rather complex outputs beyond the expected operational tasks.


Mistral ā–· #general (163 messagesšŸ”„šŸ”„):

  • GPU Rentals for AI Summarization Efforts: @mrdragonfox highlighted that services like runpod, vast, and lambda offer GPUs by the hour for rental, suggesting users leverage these for their AI work. They also mentioned Kaggle as a free resource offering 2 T4 GPUs for 30 hours per week.

  • Subscription Issues and Support Contacts: @glorfsf experienced issues with changing the usage limit options in their subscription, which @mrdragonfox identified as a default limit issue and advised contacting [email protected] for assistance.

  • Mistral’s Use Cases and Integration with Other Frameworks: In a discussion on how Mistral models are utilized, @ethux shared use cases around finetuning for customer support automation and mentioned an interesting Embeddings model, showcasing the diverse applications of Mistral models beyond just competing with GPT-4.

  • Exploring Mistral’s API Limitations and Enhancements: Users discussed difficulties related to API token costs, tokenization questions, and specific AI model performance, with @mrdragonfox and others providing insights on how models work and proposing solutions for better model utilization.

  • Technical Debates on Model Selection and Optimization: An engaging conversation emerged around the effectiveness of GitHub Copilot, with @mrdragonfox and @i_am_dom debating its underlying technology, showcasing the community’s deep involvement in understanding and optimizing AI model performance.

Links mentioned:


Mistral ā–· #ref-implem (9 messagesšŸ”„):

  • Clarifying Purpose between Training and Finetuning: @ethux asks whether a discussion concerns training or finetuning, since system requirements differ substantially between the two.
  • Memory Requirements for Datasets: @ethux notes that while exact requirements are unclear, 64GB of memory should suffice for unknown dataset sizes, underscoring how memory needs scale with dataset size.
  • Memory Footprint for Mistral’s 4-bit Inference: According to @ethux, Mistral requires at least 26GB for 4-bit inference, a significant memory requirement for inference tasks (see the back-of-envelope check after this list).
  • Optimization Tips for Model Efficiency: @mrdragonfox compares exllama with a regular 4-bit transformer, suggesting that exllama is more memory-efficient, highlighting the range of available optimization strategies.
  • Quantization Strategies for Reducing Memory Footprint: @mrdragonfox proposes exllamav2 quantization over the BnB 4-bit approach, suggesting it as the better alternative for memory efficiency.
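
A back-of-envelope check on the 26GB figure, assuming it refers to Mixtral 8x7B (~46.7B parameters) rather than Mistral 7B, whose 4-bit weights would need only about 4 GB:

```python
params = 46.7e9          # Mixtral 8x7B total parameter count (assumption)
bytes_per_param = 0.5    # 4 bits per weight
weights_gb = params * bytes_per_param / 1e9
print(f"{weights_gb:.1f} GB of weights")  # ~23 GB; KV cache and runtime overhead push it toward 26 GB
```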

Mistral ā–· #finetuning (3 messages):

  • Translation Metrics Inadequate for LLM Evaluation: @adrienbufort states that BLEU and ROUGE, metrics commonly used to evaluate translation quality, are not useful for evaluating large language models (LLMs) or instruction-tuned LLMs.

  • ELO-like Evaluation Tops for LLMs: @adrienbufort recommends an Elo-like rating system, akin to chess rankings, as the closest match to human preference for LLM evaluation, although it requires human input. They describe it as the best method currently available (a minimal rating-update sketch follows this list).

  • Multiple Choice and LLM Evaluation Techniques: They also mention MMLU (Massive Multitask Language Understanding, a multiple-choice benchmark) and Alpaca Eval as other methods for LLM evaluation. MMLU allows for clear right-or-wrong answers, while Alpaca Eval has one LLM evaluate another’s answers.

  • Normalized Alpaca Eval Available: @akshay_1 mentions that a normalized version of Alpaca Eval is now available, indicating advancements in methods for LLM evaluation.
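
To make the Elo analogy concrete, here is a minimal sketch of the standard rating update applied to pairwise model comparisons; K=32 and the 400-point scale are conventional chess choices, not values from the discussion:

```python
def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """score_a is 1.0 if model A wins the comparison, 0.5 for a tie, 0.0 if it loses."""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    delta = k * (score_a - expected_a)
    return r_a + delta, r_b - delta

# Two equally rated models; the human judge prefers model A.
print(elo_update(1000.0, 1000.0, 1.0))  # (1016.0, 984.0)
```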


Mistral ā–· #showcase (1 messages):

  • Innovative AI Browser Queries Introduced: User @sublimatorniq showcased a tool that allows AI to respond with DOM node references, making browser queries more context-aware. This tool is compatible with MistralAI and aims to enhance the interaction with web content.

Links mentioned:



Mistral ā–· #random (8 messagesšŸ”„):

  • New Member Seeks RAG Application Insight: @xrdg, a newly joined member from šŸ‡¬šŸ‡¹, asked about prompt structure for their RAG application and whether a dedicated support channel exists.
  • DSPy Suggested for Prompt Optimization: @akshay_1 recommended DSPy as a tool to optimize the prompt structure, which was appreciated by @xrdg.
  • Details on RAG Stack Provided: Upon further query by @akshay_1, @xrdg disclosed their RAG stack consists of langchain, chroma, and Mistral 7B and shared a link to a guide on prompting Mistral 7B with examples, tips, and relevant reading materials.
  • RAG Stack Potential for Optimization Highlighted: @akshay_1 observed that @xrdg’s RAG stack could be further optimized, inquiring if the project was a hobby project or planned for production.

Links mentioned:

Prompt Engineering Guide: A Comprehensive Overview of Prompt Engineering


Mistral ā–· #la-plateforme (35 messagesšŸ”„):

  • Early Stopping Predicament Persists: Users @digitalphotographer and @sublimatorniq exchanged experiences on an early stopping issue with Mistral, noting the problem occurs even without control tokens in square brackets. @digitalphotographer further clarified that their prompts are plain strings, lacking control sequences or special characters.

  • Technical Support Suggested for Early Stopping: @mrdragonfox advised @digitalphotographer to contact Mistral support ([email protected]) and provide logs/full examples for further assistance. @sophiamyang also offered help, asking for reproducible examples for the team to investigate.

  • Billing Page Bug Detected: User @ewanhc and others discovered a bug where monthly usage limits on the billing page revert to €150 upon refresh, despite attempts to set a different limit. @ethux and @fersingb confirmed experiencing the same issue, and @sophiamyang advised contacting support with details.

  • Mistral API Hosting Location Inquiry: @loicboutet inquired about where the Mistral API is hosted. It was clarified by @mrdragonfox and further confirmed by @loicboutet through the privacy page that the hosting is done in Europe, specifically on Azure in Sweden.

  • Bug Identified with ā€œmax_tokensā€ Parameter: @mrxavierx reported a bug with the Mistral API endpoint /completion when the max_tokens field is set to 1, which leads to a 500 internal server error rather than a 1 token response or a meaningful validation error. An issue was created for this on GitHub: BUG: API /completion endpoint returns 500 (server error) when sending ā€œmax_tokenā€ = 1.
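
A hypothetical reproduction of the report, written against Mistral's documented chat completions endpoint (the /completion path in the issue title may differ):

```python
import os
import requests

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-tiny",
        "messages": [{"role": "user", "content": "Hi"}],
        "max_tokens": 1,  # reportedly returns HTTP 500 instead of a one-token reply
    },
)
print(resp.status_code)
```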

Links mentioned:


Eleuther ā–· #general (29 messagesšŸ”„):

  • Seeking Fine-tuning Code for Meta’s SAM: @the_alt_man asked for a fine-tuning codebase for Meta’s SAM (Segment Anything) model but ended up using AutoGluon due to the lack of such functionality in the original codebase. They highlighted that AutoGluon uses Lightning, which works for GPUs but not for TPUs.

  • Exploring Multi-node Training Without Infiniband: @elyxlz inquired about the feasibility of multi-node training without Infiniband, contemplating a merging strategy every few training steps. The discussion led to a shared paper on distributed optimization (DiLoCo) that potentially addresses this concern.

  • The Pile Dataset Access Quest: @sk5544 sought guidance on accessing The Pile dataset, being directed by @stellaathena and @elyxlz through private discussions and links, which highlights a community ready to support information sharing.

  • Analog Clock Dataset Challenge: @stellaathena proposed a creative project to @pinconefish, suggesting the creation of a dataset featuring analog clocks showing diverse times for a study in out-of-domain generalization. The particularity of the dataset focusing on clocks showing 10:10 piqued interest as a starting point for training a text-to-image (T2I) model.

  • Exploring Training Quality and Data Defects Through Synthetic Datasets: Following the analog clock discussion, @wonkothesensible proposed exploring potential models’ mode collapses through checkpoints analysis, suggesting the creation of a synthetic dataset of clocks showing various times for broader training and defect identification purposes.

Links mentioned:

DiLoCo: Distributed Low-Communication Training of Language Models: Large language models (LLM) have become a critical component in many applications of machine learning. However, standard approaches to training LLM require a large number of tightly interconnected acc…


Eleuther ā–· #research (125 messagesšŸ”„šŸ”„):

  • Byte-Level Transformers Spark Debate: A discussion initiated by @main.ai regarding the efficiency of byte-level transformers like ByT5, sparked further conversation with @the_random_lurker about unfair comparisons in sequence length and the real-world benefits of additional context in models like Mamba. The topic touched upon in the MambaByte paper transitioned into the effectiveness and fairness of comparing token to byte sequence lengths.

  • Exploring Proxy-Tuning for LLMs: @digthatdata shared a new approach called proxy-tuning, a lightweight decoding-time algorithm that tunes large language models (LLMs) using a smaller model as a proxy. The method aims to close the performance gap between directly tuned models and their black-box counterparts without requiring access to model weights (a sketch of the logit arithmetic follows this list).

  • Mamba’s Rejection Raises Eyebrows: Discussion around the rejection of the Mamba paper with scores of 8/8/6/3 out of 10 led @stellaathena and others to critique the meta review process. The rejection has sparked dialogue about the implications for the academic community and the pursuit of state of the art (SOTA) benchmarks.

  • Chess Engine Superiority and AI Advances: Talks revolved around the current SOTA in various fields, including chess where @clockrelativity2003 noted Stockfish remains a top contender. @stellaathena challenged members to name recent SOTAs across multiple domains, highlighting gaps in awareness within the ML community regarding advancements.

  • AI-Driven Evolution of Chess: A blog post discussed by @clockrelativity2003 imagines chess in 2024, redefined by AI advancements, featuring new engines like Pandora and augmented by digital tools for strategic enhancements. This conversation piece reflects on the interface between traditional games and next-gen AI technology.
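
On the proxy-tuning item above: the method is decoding-time logit arithmetic, steering a large base model with the difference between a small tuned expert and its small untuned counterpart. A minimal sketch of the combination step as the paper describes it (the function name is ours):

```python
import torch

def proxy_tuned_logits(large_base: torch.Tensor,
                       small_tuned: torch.Tensor,
                       small_base: torch.Tensor) -> torch.Tensor:
    # All three models must share the same tokenizer/vocabulary.
    return large_base + (small_tuned - small_base)

# At each decoding step, sample the next token from
# softmax(proxy_tuned_logits(...)) instead of the large model's own logits.
```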

Links mentioned:


Eleuther ā–· #lm-thunderdome (2 messages):

  • Possible Merge for a Fix: @hailey_schoelkopf mentioned considering merging a fix after discovering surprising behavior in their implementation. They plan to try this out when they have a chance.
  • Adding Weights and Biases Support to lm-evaluation-harness: @hailey_schoelkopf shared a GitHub pull request #1339 by @ayulockin for adding Weights and Biases support to the lm-evaluation-harness, questioning the best location for the wandb.py file.

Links mentioned:

feat: Add Weights and Biases support by ayulockin Ā· Pull Request #1339 Ā· EleutherAI/lm-evaluation-harness: In #359 @parambharat proposed adding support for W&B logging. However, it was done before the big refactor that got in. As a user of both lm-evaluation-harness and wandb, I have opened this PR …


Eleuther ā–· #gpt-neox-dev (16 messagesšŸ”„):

  • QLoRA Tuning Confusion Cleared: @kenakafrosty inquired about issues tuning neoX 20b with QLoRA but was informed by @stellaathena that the GPT-NeoX library does not support QLoRA. Kenakafrosty was actually using a combination of trl, transformers, and peft libraries.
  • Clarifying the Right Help Channels: Following @stellaathena’s advice, @kenakafrosty realized the mistake and acknowledged the confusion. Stellaathena suggested opening an issue on the relevant GitHub for help with those libraries.
  • Catboy_slim_ Addresses Testing Challenges: @catboy_slim_ discussed the challenges of testing substantial updates like Python, PyTorch, and CUDA versions. They highlighted the inability to manually test every branch, emphasizing the need for functional tests.
  • Seeking Solutions for PyTorch Testing Issues: In addressing the challenges with running pytest on torch code, @catboy_slim_ opened an issue on GitHub regarding tests failing when run with pytest --forked, suggesting this might be a widespread issue beyond their control.
  • Efforts to Facilitate Validation: @tastybucketofrice offered compute access to <@337128969059172353> and invited @catboy_slim_ to DM for access as well, aiming to support further testing of the changes made. Additionally, responded to concerns about pytest issues by suggesting DeepSpeed as a potential model for successfully integrating CUDA with pytest in forked processes.

Links mentioned:

Tests fail when run with pytest --forked Ā· Issue #1132 Ā· EleutherAI/gpt-neox: Describe the bug When tests are run with pytest --forked per the instructions in /test/README.md, a large number of tests fail with the error: RuntimeError: Cannot re-initialize CUDA in forked subp…


Perplexity AI ā–· #general (87 messagesšŸ”„šŸ”„):

  • Perplexity Pro’s Exclusive Features Unveiled: Users such as @icelavaman and @mares1317 discussed the benefits of Perplexity Pro over the regular version, highlighting features like practically unlimited Copilot queries, the ability to attach images and files for exploration with models like Claude 2.1 and GPT-4, and access to powerful AI models. This info was supported with a link to the Perplexity Pro FAQ.

  • Privacy Concerns Over Data Retention Stir Discussion: @emisaurus_hex and @firesonwires raised concerns about Perplexity’s privacy policy, specifically the retention of search history and personal data until account deletion. Expert @icelavaman clarified that deleted threads are removed from servers after 30 days, yet users remain cautious about the implications for privacy.

  • Clarification Sought on Thread Deletion Policy: Amidst confusion over data retention, @icelavaman, a self-identified Perplexity expert, reassured users like @emisaurus_hex that deleted threads are indeed deleted from servers after 30 days. However, users expressed a desire for clearer privacy policies on the Perplexity website.

  • Debating the Ethics of Data Storage: The discussion turned philosophical as users like @yellephen pondered the value of aggregated search data, while others like @firesonwires expressed discomfort with the potential for invasive profiling based on search history. This conversation underscores the complexity of privacy in the digital age.

  • Technical Support Queries Engage Community: Users reached out for support on diverse topics, including recovering old bookmarked threads (@skyhunz), file upload limits for Pro users (@lukas8a), and applying Perplexity Pro credit codes (@odobostudio). The community and Perplexity representatives like @icelavaman and @danielagmz888 provided guidance, demonstrating the active engagement of the support team within the platform.

Links mentioned:

  • What data does Perplexity collect about me?: Explore Perplexity’s blog for articles, announcements, product updates, and tips to optimize your experience. Stay informed and make the most of Perplexity.
  • What is Perplexity Pro?: Explore Perplexity’s blog for articles, announcements, product updates, and tips to optimize your experience. Stay informed and make the most of Perplexity.
  • Perplexity Blog: Explore Perplexity’s blog for articles, announcements, product updates, and tips to optimize your experience. Stay informed and make the most of Perplexity.
  • Perplexity - AI Companion: Ask anything while you browse

Perplexity AI ā–· #sharing (4 messages):

  • Perplexity Bridges Google Search and OpenAI: jsudaniel highlighted the CEO of Perplexity’s unique position, merging the capabilities of Google Search and OpenAI. They shared a YouTube video titled ā€œI use Perplexity MORE than Google and ChatGPTā€, shedding light on why Perplexity is preferred for content creation over other tools.
  • Perplexity as a Learning Tool for Smartsheet: nicknalbach shared their experience of using Perplexity for learning Smartsheet, mentioning that despite being well-versed in Excel, transitioning to Smartsheet was challenging. They found Perplexity exceptionally helpful in providing answers to every problem encountered, aiding in building their Smartsheet.
  • Astronomy Teaching Aide via Perplexity: coloradocomplex noted using Perplexity to help explain concepts in their astronomy class, showcasing Perplexity’s utility in the educational sector. They also shared a link to a specific concept explanation on Perplexity, although the specific concept discussed was not mentioned.

Links mentioned:

I use Perplexity MORE than Google and ChatGPT: Main Takeaways From this Video: ā€œI use Perplexity more than ChatGPT, BARD, and Microsoft Copilots for five main reasons, including its use in content creation…


Perplexity AI ā–· #pplx-api (5 messages):

  • Website version outperforms API: User @benhirap mentioned that the website version of a certain tool codes significantly better than its API counterpart.

  • Curiosity about labs vs. API differences: @stijntratsaert_01927 is looking for the default parameters the labs environment uses, in order to achieve similar results through the API.

  • Double charge issue goes unresolved: @aiagileguy detailed an experience with [email protected] regarding a double charge. Despite having reached out more than 1-2 business days ago, they have yet to receive a resolution.


HuggingFace ā–· #announcements (3 messages):

  • Community Highlights #42 Echoes Interest: @lunarflu pointed out the increase in content creators and posts on HuggingFace, highlighting the platform’s focus on Machine Learning and its quieter nature compared to Twitter or LinkedIn. They mentioned the possibility of adding more members to their organization, implying an invitation to those interested in AI and ML. Join the org and explore more.
  • Lightweight Open LLM Leaderboard Unveiled: The leaderboard covers various aspects including weight types, precisions, licenses, parameters, architectures, and even specifies exclusions like merged, flagged, or MoE models. Dive into the cosmos-arena.
  • LM Interpretability Survey Highlights: H. Luo and L. Specia’s survey on explainability for LLMs categorizes current approaches and discusses their applications, signaling a significant stride towards understanding and leveraging the interpretability of pre-trained Transformer-based LMs. Read the full post.
  • Innovative Projects by Content Creators: From the CheXRay space for testing the /StanfordAIMI/CheXagent-8b model to generating image embeddings for datasets, the content showcases practical utilities and advancements in the field. Similarly, projects like ZeroGPU for the whisperspeech model, a Python module for adding steering vectors, and the Text2SQL model for DuckDB exemplify the diverse explorations within the community.
  • AI Alignment with TenyxChat-8x7B-v1: Introduced as a part of the TenyxChat series, this model represents a fusion of preference tuning and advanced fine-tuning techniques to serve as a useful assistant, sporting a score of 8.4 on MT-Bench. Explore TenyxChat.

Links mentioned:

  • ā€œI launched my first competition! Goal: Use AI toā€¦ā€ (https://huggingface.co/posts/Tonic/783827682062088): no description found
  • ā€œWell, yes, if the models areā€¦ā€ (https://huggingface.co/posts/vicgalle/320544784279721): no description found


HuggingFace ā–· #general (40 messagesšŸ”„):

  • GPU Choices and Model Pretraining Realities: @sachinkannaujiya1998 seeks advice on which Hugging Face model to use with an RTX 3080 GPU, while @b1gb4ng contemplates pretraining a 7b parameter model, only to reconsider due to the significant resources required as detailed by @asprtnl_50418. @asprtnl_50418 suggests fine-tuning existing models as a cost-effective alternative, highlighting LoRA/QLoRA adapters and Unsloth for efficiency.

  • Feature Extraction Techniques Explained: @vipitis clarifies that feature extraction often involves creating sequence embeddings with encoder-only models like BERT and its derivatives (a minimal sketch follows this list). The MTEB leaderboard is suggested for exploring state-of-the-art models.

  • LLM Training and Eval Data Strategy: @enka55 raises a practical question about splitting new, unique data between training and evaluation sets when teaching a language model about ā€œsoldering,ā€ prompting advice from @the_aureo on ensuring data diversity and considering approaches like RAG to supplement learning.

  • Inquiry on HuggingFace’s Free Plan and UI Integrations: @Ali_k inquires about the capabilities of Hugging Face’s free plan concerning AI model training. Meanwhile, @Sebastian seeks experiences with chat-ui for integrating text-generation WebUI endpoints.

  • Seeking Model Reviews Beyond Benchmarks: @green_eye voices frustration over navigating model choices based on benchmarks alone, expressing a desire for reviews that provide insight into model strengths, weaknesses, and contextual performance beyond mere numbers.
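
The sketch promised above: sequence embeddings from an encoder-only model via sentence-transformers. The model choice is illustrative (a strong MTEB entry), not one named in chat:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-large-en-v1.5")
embeddings = model.encode([
    "How do I solder a through-hole resistor?",
    "What temperature should my soldering iron be?",
])
print(embeddings.shape)  # (2, 1024)
```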

Links mentioned:


HuggingFace ā–· #today-im-learning (1 messages):

  • The Quest for a Data Set Evaluation Framework: @rosebei3ngan3g discussed the lack of comprehensive frameworks for evaluating datasets used in training large language models, questioning the best approach to take for such evaluations. This highlights a gap in current methodologies focusing mainly on model evaluation.

HuggingFace ā–· #cool-finds (7 messages):

  • Diving Deep into HuggingFace’s Dataset Cards: @andysingal shared an intriguing GitHub project analyzing HuggingFace dataset cards. The project aims for a large-scale analysis of dataset documentations, contributing to the AI community’s understanding of dataset narratives and usage.

  • Learning UD and Audio ML Adventures: @pacificvoltage embarked on a self-tutorial journey with an introductory book on universal design (UD) learning and found a fascinating Machine Learning Street Talk interview with Chomsky on YouTube, discussing the utilization of Deep Fake technology for interview repair.

  • Novel LLM Text Detector Unveiled: @tea3200 introduced a groundbreaking preprint titled ā€œBinocularsā€, which proposes a new large language model (LLM) detector that achieves impressive accuracy at identifying machine-generated text through contrast scoring. This technique showcases over 90% detection rate of generated samples at a negligible false positive rate.

  • Flutter Pushes Into AI with ONNX: @akindelemichael highlighted a notable GitHub repository aiming to bridge ONNX runtime with Flutter, enabling the integration of ONNX models in Flutter apps across various platforms. This repo complements a growing trend of Flutter’s use in AI applications, as noted by @osanseviero in a follow-up.

  • Flutter SDK for HuggingFace Inference APIs: Following the mention of Flutter’s expanding role in AI, @osanseviero detailed a new project, the Flutter SDK for HuggingFace Inference APIs, which supports NLP APIs and underscores Flutter’s potential for cross-platform AI application development. This development marks a significant step towards making AI more accessible and open source.

Links mentioned:


HuggingFace ā–· #i-made-this (12 messagesšŸ”„):

  • WhisperSpeech launches multilingual TTS on HuggingFace: @tonic_1 shared a new HuggingFace Space for WhisperSpeech, a demo allowing multi-language text to speech and voice print creation with minimal audio input. Expect more examples to arrive soon in this draft version.
  • Nemo Model Project Blog Post in the Works: Following @tonic_1’s interest in launching a Nemo model project, @not_lain committed to writing a detailed blog post ASAP. This will cover using containers, providing a much-needed example with details for the community.
  • CheXRay Demos Facing a Runtime Error: @tonic_1 reported a runtime error in their HuggingFace Space CheXRay, indicating work in progress on analyzing Chest X-Rays. The error highlights ongoing development in medical imaging AI applications.
  • Call for Increased Outreach via Community Blog Posts: @lunarflu suggests that sharing work through HuggingFace community blog posts could help in increasing reach, pointing @mateomd_dev towards a potential opportunity for detailed technical sharing. The conversation hints at the growing importance of detailed community-driven content on HuggingFace.
  • Anticipation Builds for wav2vec2-bert Model Release: @yehors announced the publication of a new model, wav2vec2-bert, based on Common Voice 10, set to release tomorrow. This model promises enhancements in voice-based AI technologies.

Links mentioned:


HuggingFace ā–· #reading-group (3 messages):

  • Google’s Lumiere Boasts Impressive T2V Capabilities: @fishie22 highlights Google’s Lumiere for its groundbreaking approach to T2V (Text-to-Video) models. Using a Space-Time UNet that downsamples the signal in both space and time, Lumiere can generate 80 frames at 16fps, presenting potentially unparalleled temporal consistency in video generation. Here’s the comprehensive study detailing their method and results: Read the study.

  • No Rush, Take Your Time!: @lunarflu conveys a message of patience and support towards Isamu, emphasizing a non-pressured, encouraging community vibe.

  • Medium Article Earns Praise for Benchmarking: @starsupernova shared their enthusiasm for a Medium article related to benchmarking, describing it as ā€œSuper greatā€. Details on the contents of the article or its particular focus were not provided.

Links mentioned:

Lumiere: A Space-Time Diffusion Model for Video Generation: We introduce Lumiere — a text-to-video diffusion model designed for synthesizing videos that portray realistic, diverse and coherent motion — a pivotal challenge in video synthesis. To this end, we …


HuggingFace ā–· #diffusion-discussions (1 messages):

spikespiegel5112: How to load LoRA model in local?


HuggingFace ā–· #computer-vision (5 messages):

  • Exploring Advanced LMM Techniques: User besiktas inquired whether there were specific reasons for choosing idefics/flamingo resampler/cross-attention over linear projection/pretrained vision encoder in the training of a new LMM. No direct response was provided in the chat.

  • Swift API for Vision AI Introduced: ahmed3ibrahim shared his experience using the Swift API for vision AI, noting its ability to handle multiple images in one request. He provided a link to the API: Gemini Pro Vision AI and highlighted its features including 8.5/10 popularity, 5,846ms latency, 99% service level, and 100% health check.

  • Inquiry About CVPR2024 Submissions: User iloveh8 asked how to access all papers (both accepted and rejected) submitted for CVPR2024, and also inquired if anyone in the chat had submitted a paper. There were no responses provided in the chat regarding this question.

Links mentioned:

Gemini Pro Vision AI API Documentation (swift-api-swift-api-default) | RapidAPI: no description found


HuggingFace ā–· #NLP (15 messagesšŸ”„):

  • TorToiSe TTS Achieves New Heights in Quality: @mr_nilq highlighted TorToiSe TTS, available via Coqui, for its exceptional quality and consistency, despite being hindered by slow performance due to its composite architecture. A modified version achieving a 5x speed increase can be found here.
  • Choosing the Right Tools for Training AI on QA Binaries: @ysk.dev is exploring options for training an AI with approximately 10,000 question-and-answer pairs, contemplating between Amazon Lex and VDS. Concerns were raised about the adequacy of Colab Pro Plus for handling long answers and queries about suitable machine specifications for running the backend server.
  • Troubleshooting TFTrainer ImportError in Transformers: @srovnbh faced an ImportError with TFTrainer from the transformers package. Attempts to resolve the issue by switching between versions 4.36.2 and 4.37.1 of transformers proved futile, even with community assistance.
  • Upcoming Talk on Trusting ā€œBlack Boxā€ Models: @vipitis shared a link to an upcoming talk about the trustworthiness of evaluating ā€œblack boxā€ models, which are not openly accessible for inspection. The talk details are available here.
  • Compatibility Issues with Bits and Bytes on Windows: @kingpoki encountered issues using certain functionalities, ultimately discovering that the root cause was incompatibility with Windows. This serves as a reminder of the operating system’s impact on software applicability and functionality.

Links mentioned:

talks.cam : Replicating and auditing black-box Language Models.: no description found



HuggingFace ā–· #gradio-announcements (1 messages):

  • Gradio 4.16 Released with Exciting Features: @abidlabs announced the release of gradio 4.16, highlighting major updates including native support for Polars Dataframe, the ability to use the Gallery component as an input, faster streaming for low-latency chatbots, and auto-generated docs for custom components. This major update aims at enhancing the Gradio experience, and more details can be found in the changelog.
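
A minimal sketch of the new Gallery-as-input capability; the exact value format the handler receives is an assumption here, so check the changelog before relying on it:

```python
import gradio as gr

def count_images(gallery):
    # Assumption: a Gallery input arrives as a list of (image, caption) pairs.
    return f"{len(gallery)} image(s) received"

demo = gr.Interface(fn=count_images, inputs=gr.Gallery(), outputs="text")
demo.launch()
```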

Links mentioned:

gradio/CHANGELOG.md at main · gradio-app/gradio: Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! - gradio-app/gradio


LAION ā–· #general (47 messagesšŸ”„):

  • LAION2b-en Aesthetics Scores Unavailable: @ppwwyyxx asked where to download the LAION2b-en aesthetics scores, linking to a Hugging Face dataset that was disabled on the author’s request. @chad_in_the_house confirmed the dataset is currently down and advised checking announcements for further updates.
  • The Quest for a Better Open Source Voice Chat Interface: @jpcl_ shared news about releasing a demo for a complete voice chat interface utilizing Whisper and WhisperSpeech alongside an Open Source LLM, aiming for lower latency and a more natural conversation flow. They expressed a desire to improve the current LLM used (Dolphin 2.6 Phi-2) and invited collaboration to enhance the project, with details posted on Hacker News.
  • Looking for Teammates for VA’s AI Tech Sprint: @ninjaa2377 is seeking individuals or a small team to join for the VA’s AI Tech Sprint, with the challenge focusing on Ambient Dictation for Clinical Encounter Notes, offering a $300K first prize. The competition aims to push AI capabilities in healthcare, and participants must be U.S. persons (AI Tech Sprint Challenge).
  • Debate Over Copyright Infringement and AI Models: @pseudoterminalx discussed operating in a copyright infringement haven, critiquing entities that underestimate local government autonomy and resilience to foreign influence. They shared insights into local practices, including the use of pirated U.S. channels by cable companies, without mentioning a specific location.
  • Discussion on Utilization of LLMs Beyond Entertainment: @SegmentationFault lamented the prevalent use of local LLMs for entertainment rather than productive means, sparking a discussion on the dominance of OpenAI and the desire for more practical applications. The conversation touched on historical human behavior and the future of LLM usage.

Links mentioned:


LAION ā–· #research (38 messagesšŸ”„):

  • Byte-Level Transformers Garner Optimism: @marianbasti expressed cautious optimism for byte-level transformers after reviewing a research paper, potentially indicating a significant shift in transformer model capabilities.
  • Innovations in Text-to-Image Diffusion and ID Preservation: @vrus0188 shared two groundbreaking projects: RPG-DiffusionMaster for mastering text-to-image diffusion and InstantID for state-of-the-art tuning-free ID-Preserving generation, showcasing rapid advancements in AI image generation techniques.
  • AI Image Generation Far From Limits: According to @vrus0188, the weekly emergence of 4-6 new papers or tools demonstrates that AI image generation technology is nowhere near its theoretical limits, debunking any skepticism regarding the field’s potential for growth.
  • Challenges and Costs in Entering AI Fields: @chad_in_the_house and others discussed the evolving barriers to entry in fields like AI image generation, from the practicality of undertaking such ventures to the significant costs associated with training models such as Stable Diffusion, with estimates ranging between €500 to €180,000.
  • Exploration of Biologically-Inspired Simulated Language Acquisition: A recent paper discussed by @chad_in_the_house presents a biologically plausible language organ model that learns language without backpropagation, signifying potential shifts away from traditional machine learning approaches toward more efficient and biologically-inspired methodologies.

Links mentioned:


LlamaIndex ā–· #announcements (1 messages):

  • Webinar Alert on LLMCompiler: @jerryjliu0 announces a last-minute webinar with authors Sehoon Kim and Amir Gholami discussing the LLMCompiler, a new framework for parallel function calling in agents. The webinar aims to cover the benefits of this compiler over previous sequential reasoning frameworks like ReAct, emphasizing long-term planning and parallelization capabilities. Links to the LLMCompiler paper and related resources were shared: LLMCompiler paper, LlamaPack, and a GitHub notebook.

Links mentioned:

LlamaIndex Webinar: Efficient Parallel Function Calling Agents with LLMCompiler Ā· Zoom Ā· Luma: LLMs are great at reasoning and taking actions. But previous frameworks for agentic reasoning (e.g. ReAct) were primarily focused on sequential reasoning, leading to higher…


LlamaIndex ā–· #blog (7 messages):

  • Building a Slack Bot with @seldo’s Guide: A new Open Source Software (OSS) repository was announced, featuring a step-by-step guide by @seldo on how to build a @SlackHQ bot that learns from conversations. The guide can be found in a tweet by LlamaIndex.

  • LlamaIndex Partners with Zilliz Universe: LlamaIndex announced a partnership with @zilliz_universe to integrate the Zilliz Cloud Pipeline into LlamaIndex, offering a scalable retrieval service with multi-tenancy support. Read more in their guest post.

  • Day 0 Support for @OpenAI’s Embedding Models Announced: LlamaIndex released version v0.9.38 with Day 0 support for @OpenAI’s latest embedding models. Details about this release were shared in a tweet.

  • LlamaIndex Promises Effortless Prompting: LlamaIndex emphasizes their feature of providing effective prompting so users don’t have to struggle with customization, although customization is still an option. Further details can be found in a recent tweet.

  • TypeScript Support with LlamaIndex.TS: A new TypeScript version, LlamaIndex.TS version 0.1.0, was announced with support for @OpenAI’s latest embeddings, thanking @yi_ding for the quick implementation. Additionally, @qdrant_engine support is included in this update as detailed in a follow-up tweet.

Links mentioned:


LlamaIndex ā–· #general (38 messagesšŸ”„):

  • No LLM for TextGenerationInference in LlamaIndex: User @wizboar asked whether LlamaIndex has an LLM class for the TextGenerationInference server, to which @cheesyfishes confirmed that there isn’t one available currently, suggesting the langchain wrapper works with it instead.

  • Customizing Chat Engine with similarity_top_k: @richard1861 inquired about the possibility of defining a specific parameter (similarity_top_k=3) for the Chat Engine in LlamaIndex, and @whitefang_jr provided a code snippet demonstrating how to configure it directly on the engine (a minimal sketch follows this list).

  • Implementation Challenges in Insurance Domain Queries: User @lancerninja detailed a complex use case involving query rewriting in the insurance domain when similar terms are used instead of exact matches, while @cheesyfishes suggested using an llm for query rewriting but noted the absence of an offline solution for their specific scenario.

  • Excitement Over OpenAI’s New Embedding Models: @ayfri shared a link to OpenAI’s announcement of new embedding models and API updates, expressing anticipation for upcoming support, to which @cheesyfishes reassured that support would be released shortly, highlighting the community’s enthusiasm for continuous improvements.

  • Guidance on Custom Prompts for Extending Context in Queries: @shri_j sought advice for querying information not included in provided context documents using LlamaIndex and OpenAI, with @cheesyfishes recommending the modification of default prompts to instruct the model to consider or extend beyond the given context, linking to specific documentation for assistance.
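
A minimal sketch in the spirit of the similarity_top_k snippet, using the pre-v0.10 LlamaIndex import style current at the time; the chat mode and data folder are assumptions:

```python
from llama_index import SimpleDirectoryReader, VectorStoreIndex

index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("data").load_data()  # hypothetical document folder
)
chat_engine = index.as_chat_engine(
    chat_mode="condense_question",
    similarity_top_k=3,  # retrieve the 3 most similar chunks per turn
)
print(chat_engine.chat("What does the policy cover?"))
```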

Links mentioned:


LlamaIndex ā–· #ai-discussion (5 messages):

  • Zep Enhances Chatbots with Production-grade Tools: User @yoursaviorjesus highlighted Zep, showcasing its capabilities for production-grade chat history memory, vector search, data enrichment, and more. They specifically questioned the efficacy of its entity extraction capabilities.

  • LlamaIndex Versus Amazon Kendra: @zeekg_46676 asked whether LlamaIndex is a vector store or functions more like Amazon Kendra which performs natural language search.

  • LlamaIndex: A Flexible Data Orchestration Tool: In response to @zeekg_46676, @cheesyfishes clarified that LlamaIndex is closer to Amazon Kendra, capable of utilizing any vector store, LLM, or embedding model to manage data ingestion, retrieval, and response synthesis.

  • Self-learning RAG with Automated Knowledge Graph Creation: @chiajy shared a link to a Substack article (Harry Potter and the Self-Learning Knowledge Graph RAG Workflow) detailing a demo from WhyHow.AI showcasing recursive retrieval, automated knowledge graph creation, and memory/multi-hop reasoning using RAG on Harry Potter book chapters. This demo aimed to improve RAG accuracy, reduce time to production, and show one of the first examples of a self-learning RAG.

Links mentioned:


Latent Space ā–· #ai-general-chat (36 messagesšŸ”„):

  • LLM Paper Club Recording Policies: @kbal11 mentioned that the LLM Paper Club sessions have not been recorded to allow participants to share more about their work without fear of internet exposure. Thus, no replay is available for missed sessions.
  • In Search of RAG Evaluation Tools: @joejoej0e is looking for tools for experiment tracking and hand-rating the results of RAG pipelines, aimed at improving information retrieval products by human evaluation of relevance.
  • Introduction of MORPHEUS-1 by Prophetic AI: @shivdinho shared a link announcing MORPHEUS-1, claimed as the world’s first multi-modal generative ultrasonic transformer for inducing and stabilizing lucid dreams, set for beta release in Spring 2024.
  • Rapid Development at go-go-golems: @slono announced writing 5k lines of code in just 4 days as part of a yaml-custom-tags experiment, showcasing the project’s rapid pace of development.
  • Martian Launches LLM Evaluation Tool: @cute_hamster_07119 introduced a new tool by Martian, launching today, that evaluates live which LLM inference product to use based on cost, throughput, and TTFT (time to first token), covering providers like Anyscale, Together, Lepton, and Fireworks.

Links mentioned:


Latent Space ā–· #ai-event-announcements (1 messages):

  • LLM Paper Club Asia Launches: @ivanleomk announced the kick-off of the first LLM Paper Club session in Asia focused on discussing the seminal paper ā€œAttention Is All You Needā€. Interested participants can sign up for future events here and join today’s session via Discord.

  • Stay Updated with Latent.Space Events: Participants were encouraged to stay informed about new Latent.Space events by clicking the RSS logo just above the calendar on the right (it shows ā€œAdd iCal Subscriptionā€ on hover) to add events to their own calendar, ensuring they don’t miss future gatherings.

Links mentioned:


Latent Space ā–· #llm-paper-club (8 messagesšŸ”„):

  • Asia Paper Club Scheduled a New Session: @ivanleomk announced that next week’s Asia paper club will tentatively cover the Self-Rewarding Language Models Paper. They encouraged members to suggest different papers or express interest in presenting.

  • A Call for Feedback: @aimuggle thanked participants for joining the session and invited feedback to improve the still beta-phase program.

  • Question About Self-Reward: @stealthgnome inquired whether self instruct is the input for self-reward in the context of the upcoming discussion.

  • US Paper Club’s Next Feature: In response to @ivanleomk’s query about the US paper club’s next agenda, @eugeneyan shared that they will discuss the Pythia paper with a comprehensive list of contributing authors highlighted.

Links mentioned:

Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling: How do large language models (LLMs) develop and evolve over the course of training? How do these patterns change as models scale? To answer these questions, we introduce \textit{Pythia}, a suite of 16…


DiscoResearch ā–· #mixtral_implementation (2 messages):

  • Mergekit Options for Mixtral Discussed: @philipmay shared a GitHub issue link from mergekit’s author clarifying options for finetuning Mixtral models post-merge, questioning the effectiveness of the ā€œhiddenā€ and ā€œrandomā€ options for later finetuning.
  • Auxiliary Loss in MoE Training Highlighted: @bjoernp responded positively to the shared information, emphasizing that getting the auxiliary loss right is a crucial aspect of MoE training.

Links mentioned:

Mixtral branch: What option should I choose when I want to do some finetuning after the merge? Ā· Issue #116 Ā· cg123/mergekit: The parameter description of ā€œhiddenā€ and ā€œrandomā€ does not exactly explain what to do when I want to finetune later. Is it even useful (possible) to finetune after merging with &q…


DiscoResearch ā–· #general (23 messagesšŸ”„):

  • Quality Data Selection Dilemma: @bjoernp shared an interesting paper suggesting that filtering pretraining data for ā€œqualityā€ might not always enhance model performance. The paper proposes a framework for dataset selection that optimizes model performance rather than adhering to conventional notions of data quality.

  • Exploring KTO Training: @bjoernp and @hammadkhan discussed Kahneman-Tversky Optimisation (KTO), comparing it to Direct Preference Optimization in that it trains on binary signals of desirable vs. undesirable completions. They highlighted benefits and implementation aspects, including Hugging Face’s documentation and the suitability of Axolotl for KTO via trl (a dataset sketch follows this list).

  • Contextual AI Approaches with KTO: @hammadkhan introduced Kahneman-Tversky Optimisation (KTO) as a novel method using good and bad examples (e.g., šŸ‘ or šŸ‘Ž) for training. This method, detailed in a blog post, is positioned as a simpler and potentially more effective strategy for updating chat models in production environments.

  • Debate on KTO’s Applicability: @rasdani and @hammadkhan debated the efficacy of KTO in scenarios requiring comparisons between instruction-led good and bad answers. The discussion pivoted around whether labels need to be directly correlated or if KTO’s flexibility allows for broader usage compared to Direct Preference Optimization (DPO).

  • GPT4-Turbo and GPT3.5 Pricing Update: @rasdani highlighted the latest updates including the launch of Embedding V3 models, GPT-4 Turbo, and a significant price reduction for GPT-3.5 Turbo. Details were shared in an official tweet by @OfficialLoganK and the OpenAI blog.
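
The KTO dataset sketch mentioned above: unlike DPO’s paired preferences, KTO consumes single completions with a binary label. The prompt/completion/label column layout follows TRL’s KTO support and is an assumption here; check the trl docs for the version in use:

```python
from datasets import Dataset

kto_data = Dataset.from_list([
    {"prompt": "Summarize the ticket: ...", "completion": "Clear, correct summary.", "label": True},  # šŸ‘
    {"prompt": "Summarize the ticket: ...", "completion": "Off-topic rambling.", "label": False},      # šŸ‘Ž
])
print(kto_data.column_names)  # ['prompt', 'completion', 'label']
```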

Links mentioned:

  • DsDm: Model-Aware Dataset Selection with Datamodels: When selecting data for training large-scale models, standard practice is to filter for examples that match human notions of data quality. Such filtering yields qualitatively clean datapoints that int…
  • DPO Trainer: no description found
  • Tweet from Logan.GPT (@OfficialLoganK): Great news for @OpenAIDevs, we are launching: - Embedding V3 models (small & large) - Updated GPT-4 Turbo preview - Updated GPT-3.5 Turbo (*next week + with 50% price cut on Input tokens / 25% price …

DiscoResearch ā–· #embedding_dev (12 messagesšŸ”„):

  • German Jina Model Announcement: @sebastian.bodza highlighted the upcoming release of the ā€œjinaai/jina-embeddings-v2-base-deā€ model on Hugging Face, potentially benefiting ranking tasks. No specific release date was mentioned, but it is expected ā€œtomorrow.ā€

  • Question Generation Examples Shared: @sebastian.bodza shared examples of question generation on GitHub, indicating the commencement of work in this area.

  • Mixtral and LLM Usage for Embeddings: In response to @philipmay’s inquiry, @sebastian.bodza mentioned using Mixtral in 4-bit GPTQ with VLLM for their project, focusing on innovative question generation and embedding development.

  • New OpenAI Embeddings Highlighted: Bjoernp brought to attention new embedding models and API updates from OpenAI, which could be advantageous for multilingual support and potentially influence future development strategies.

  • Genie Method for High-Quality Data Generation: Bjoernp shared an arXiv paper proposing Genie, a novel method for generating high-quality, content-grounded data, suggesting its potential utility in improving question-answer pairs or summaries through automated filtering mechanisms for quality assurance.

Links mentioned:


DiscoResearch ā–· #discolm_german (5 messages):

  • Effective Finetuning Achieved: @thomasrenkert reported successfully finetuning DiscoLM German 7B v1 with unsloth. They are excited about a future DiscoLM German version based on Mixtral-Instruct, highlighting the significant impact of MoE.
  • Unique Dataset for Translation: In response to @hammadkhan’s query, @thomasrenkert shared that they finetuned the model on their own dataset for translating Middle High German to Modern German.
  • Community Support and Interest: @bjoernp showed support and appreciation for @thomasrenkert’s update on their finetuning success and dataset, indicating a positive community interaction and interest in novel applications of DiscoLM.
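
A minimal unsloth LoRA-finetuning sketch in the spirit of that run (the dataset file, column name, and hyperparameters are placeholders, not @thomasrenkert’s actual setup):

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="DiscoResearch/DiscoLM_German_7b_v1",
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Hypothetical JSONL of Middle High German -> Modern German pairs, one "text" field per row.
dataset = load_dataset("json", data_files="mhd_to_nhd.jsonl")["train"]

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        output_dir="discolm-mhd-lora",
        per_device_train_batch_size=2,
        num_train_epochs=1,
        learning_rate=2e-4,
    ),
)
trainer.train()
```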

LLM Perf Enthusiasts AI ā–· #embeddings (2 messages):

  • OpenAI Unveils New Embedding Models and API Enhancements: User @potrock shared OpenAI’s blog post announcing two new embedding models, updated GPT-4 Turbo and GPT-3.5 Turbo models, a new text moderation model, new API usage management tools, and upcoming lower pricing on GPT-3.5 Turbo. The announcement also reiterated that data sent to the OpenAI API will not be used to train their models.

  • Link Redirection Suggestion: @shacrw suggested the announcement about the new embedding models and API updates belonged in a more appropriate channel, providing a link to redirect the conversation; no further context on the ensuing discussion was given.

Links mentioned:

New embedding models and API updates: We are launching a new generation of embedding models, new GPT-4 Turbo and moderation models, new API usage management tools, and soon, lower pricing on GPT-3.5 Turbo.


LLM Perf Enthusiasts AI ā–· #announcements (1 message):

mat_mto: Thanks Jeff! love all the work you’re doing so far


LLM Perf Enthusiasts AI ā–· #openai (16 messagesšŸ”„):

  • OpenAI Launches New Models and Tools: @potrock shared a blog post from OpenAI announcing new embedding models, updates to GPT-4 Turbo, new moderation models, tools for managing API usage, and upcoming lower pricing for GPT-3.5 Turbo. The updates aim to offer better performance and cost-efficiency for developers.

  • Community Excitement Around Embeddings Update: Following the announcement, @nosa_. expressed intrigue with a ā€œwell well wellā€, and @potrock highlighted the appeal of the shortened embeddings feature, emphasizing its cool factor.

  • Comparison of Embedding Models: @potrock noted that the new large embedding model slightly surpasses the performance of bge-large, indicating subtle but notable improvements in the latest iteration from OpenAI.

  • Advantages of Upgrading to OpenAI’s New Offerings: @res6969 shared plans to upgrade their system to the newly announced embedding models, saying they no longer see a need to switch to open-source options given the simplicity and effectiveness of staying with OpenAI.

  • Exploring Cost-Efficiency of New Embeddings: @shacrw and @michelcarroll discussed the potential cost benefits of using the newer, larger embedding model (v3-large) with dimension shortening, weighing its performance against storage and API costs. Balancing embedding costs against savings on vector database storage, @michelcarroll leaned towards v3-large with shortened dimensions for better performance at lower storage cost (see the sketch below).
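
A short sketch of the dimension-shortening trade-off being discussed, using the `dimensions` parameter of the v3 embedding models (the input text is arbitrary):

```python
from openai import OpenAI

client = OpenAI()
resp = client.embeddings.create(
    model="text-embedding-3-large",
    input="Q4 late-orders summary",
    dimensions=256,  # native size is 3072; the API shortens the vector natively
)
vector = resp.data[0].embedding
print(len(vector))  # 256 floats instead of 3072, i.e. ~12x less vector-DB storage
```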

Links mentioned:

New embedding models and API updates: We are launching a new generation of embedding models, new GPT-4 Turbo and moderation models, new API usage management tools, and soon, lower pricing on GPT-3.5 Turbo.


LangChain AI ā–· #general (12 messagesšŸ”„):

  • Welcome Aboard, quarknova!: User @quarknova, an ENS student interning at INRIA, is looking into using LangChain for their project. They are curious about whether the GitHub version will suffice or if the commercial version is necessary.
  • Creating AI Personalities Explored: @jstansbe inquired about creating ā€œAI Personalitiesā€ like an Elon Musk AI without relying on external APIs. @ksolo responded by mentioning the process is known as finetuning and recommended a short course on finetuning large language models for practical guidance.
  • Rapid Web-Search Chatbot Development with LangChain and Streamlit: @johnnucleus shared excitement over quickly creating a chatbot capable of web searches using LangChain and Streamlit, highlighting the efficiency and ease of use.
  • Synthetic Data Generation for ML Training with LLMs: @rajib2189 and @johnny2x2 discussed the application of Large Language Models (LLMs) for generating synthetic data for traditional machine learning training and RAG generation, respectively.
  • PARQUET File Loading Into LangChain: @benjaminbascary sought advice on loading PARQUET files as documents within LangChain; @johnny2x2 suggested reading the file with pandas and loading it through the DataFrameLoader from the LangChain community document loaders (sketched after this list).
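
A minimal sketch of that pandas + DataFrameLoader route (file and column names are assumptions):

```python
import pandas as pd
from langchain_community.document_loaders import DataFrameLoader

df = pd.read_parquet("documents.parquet")                 # hypothetical file
loader = DataFrameLoader(df, page_content_column="text")  # assumed text column
docs = loader.load()  # one Document per row; the remaining columns become metadata
print(docs[0].page_content, docs[0].metadata)
```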

Links mentioned:

Finetuning Large Language Models: no description found


LangChain AI ā–· #langserve (3 messages):

  • Exploring LangServe Examples: @veryboldbagel directed users to explore LangServe examples on GitHub, highlighting two agent examples in the readme and a third not listed but available at another link. Those interested can find more information and the examples here and here.

  • Guidance on Constructing Custom Agents: @veryboldbagel provided detailed advice for users looking to define custom tools or create custom agents, noting that an off-the-shelf OpenAI tools agent is sufficient for custom tools (a minimal sketch follows this list). For more complex needs, @veryboldbagel recommended LCEL and LangGraph for more expressive power, with further instructions available here and here.

  • Issue with Agent and Stream Responses: @hiranga.g hit an issue when implementing an agent with history: they did not receive a streamed response as expected, whereas the playground delivered a JSON object. Following suggestions tied to a known bug with agents and LangServe, they tried chain.streamLog(), but no outcome was reported.
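
A minimal sketch of the ā€œoff-the-shelf OpenAI tools agent with a custom toolā€ pattern mentioned above (the tool itself is illustrative):

```python
from langchain import hub
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

llm = ChatOpenAI(model="gpt-3.5-turbo")
prompt = hub.pull("hwchase17/openai-tools-agent")  # published community prompt
agent = create_openai_tools_agent(llm, [word_count], prompt)
executor = AgentExecutor(agent=agent, tools=[word_count])

print(executor.invoke({"input": "How many words are in 'hello brave new world'?"}))
```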


LangChain AI ā–· #share-your-work (2 messages):

  • Querying about Context Awareness: User dejoma inquired if LangChain AI uses the context from the currently visited webpage, indicating curiosity about the AI’s context awareness capabilities.

  • Innovative SQL Use with LangChain for Manufacturing: johnny2x2 shared their experience implementing LLM automation in a manufacturing setting, analyzing late customer orders from ERP SQL data with Mixtral 7B v2 5Q 32K. Two techniques worked well: creating curated views in the database so the SQL chain can handle a large schema, and exposing SQL queries as tools inside a specialized task loop for more effective results (sketched below).
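
A sketch of that ā€œcurated view as a toolā€ pattern; the database URI, view, and column names are invented for illustration:

```python
from langchain_community.utilities import SQLDatabase
from langchain_core.tools import tool

db = SQLDatabase.from_uri("sqlite:///erp_extract.db")  # hypothetical ERP extract

@tool
def late_orders(customer: str) -> str:
    """Return open, past-due order lines for one customer from a curated view."""
    # v_late_orders is a pre-built view that hides the wide ERP schema from the model.
    # In production code, parameterize the query instead of interpolating strings.
    return db.run(
        "SELECT order_no, due_date, qty_open FROM v_late_orders "
        f"WHERE customer = '{customer}'"
    )
```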


Datasette - LLM (@SimonW) ā–· #llm (3 messages):

  • Major LLM Release Announced: @simonw revealed a forthcoming LLM update aiming to significantly upgrade the underlying openai library. Enthusiasts are encouraged to test the new version, with instructions available on GitHub.

  • Peek into the Future with 0.13 Milestone: For those curious about what the update entails, @simonw shared a link to the 0.13 Milestone on GitHub, providing insights into upcoming enhancements.

  • Call for Help on Readline Issues: @simonw is seeking assistance for resolving a bug related to readline problems in LLM chat, where arrow keys output ANSI codes instead of navigating text. Contributors can find more details on this issue at GitHub.
