OpenAI released a new GPT-4 Turbo version yesterday (our notes here). We're using this opportunity to conduct a natural experiment for summarization: this issue is generated with the "new" GPT4T from Jan 2024; see the previous email, generated with the Nov 2023 version, for comparison.
Table of Contents
[TOC]
PART 1: High level Discord summaries
TheBloke Discord Summary
- UnSloth Strides Towards Multi-GPU Support: UnSloth is gearing up to introduce limited multi-GPU support, specifically aiming at Google Colab beginners. This move piqued the interest of AI research newcomers due to its promised simplicity.
- AI Models Leap onto Nintendo Switch: The community buzzed with excitement as @kalomaze showcased AI models like Tiny Llama and Mistral running on a Nintendo Switch, sparking conversations about deploying lightweight models on unconventional hardware.
- Model Configurations Dive Deep: Detailed examination of models such as exllama, bagel, and dolphin, especially focusing on rope_theta and sliding_window usage. Configurations from Hugging Face were specifically highlighted, providing valuable insights for model optimization.
- Chatbots Gain Character and Style: Techniques for crafting chatbot personas, including using datasets generated by ChatGPT and the importance of example quantity in training for style transfer, were actively discussed. The dialogue underscored creating chatbots with distinct personalities like Samantha and exploring cost-effective training alternatives like local LLMs.
- Innovative Model Merging Techniques Explored: The discussion ventured into the territory of optimal weight maintenance for model performance and the intrigue of merging overfit models with techniques like DARE or SLERP. The conversation suggested a complex but promising frontier for selectively combining models to harness multiple strengths.
OpenAI Discord Summary
- GPT-4 Speeds: A Tortoise and Hare Story: Users noted GPT-4-1106-preview has significantly longer processing times for document analysis through the API, running into 3-7 minute delays. Discussions pointed to potential fixes including upgrading service plans for dedicated capacity, though GPT-4 Turbo was clarified not to inherently support direct file analysis.
- GPT Model Troubleshooting Takes a Village: An error flagged as "unusual activity" by GPT-4 led to speculation that VPN use, prompt nature, or account sharing could be culprits, while Dall-E faced criticism for frequent misspellings in image creations, suggesting shorter text inputs might mitigate this issue.
- Code Blocks Get a Glow-Up with GPT-4: The "Always expand code output" setting was discussed for enhancing code readability in GPT-4, with additional dialogues around importing Python modules into the AI system despite security concerns and the need for a new conversation to reflect CustomGPT edits.
- Text Transcription Tackles GPT's Goliath: Challenges with large text transcription, specifically sermon passages up to 50KB, were central, with GPT-3.5 struggling due to input size limits. Suggestions leaned towards utilizing GPT-4 Turbo for its larger context window capabilities and the exploration of NLP tools for more efficient paragraph chunking and correction of grammatical issues.
- Cost Considerations Clash with Capability Needs: The dialogues underscore a constant balance between desire for advanced functionalities and cost implications, especially significant in the transition from GPT-3.5 to GPT-4 Turbo for tasks requiring intensive text processing and analysis.
Nous Research AI Discord Summary
- Context Capabilities Extended, Mistral Mines Deeper: Innovations in extending model context capabilities dominated discussions, with methods such as fine-tuning highlighted as viable for enhancing models. In particular, LLaMA-2-7B-Chat stunned with an extension to a 16,384-token context window, and SelfExtend was recommended as a fine-tune-free option.
- Pondering Technology's Social Dichotomy: The conversation occasionally veered off-technical paths, touching on broader impacts of technological advancement, with opinions asserting that it may further polarize society. Humorous diversions included a swimming cat GIF, while concerns were raised about Twitter's content delivery mechanisms potentially throttling AI contributions.
- Benches Set for Everyone Coder 33B: The spotlight shone on Everyone Coder 33B Base - GGUF, quantized using Massed Compute for the GGUF format. Speculations also surrounded the performance comparisons involving Hermes Mixtral, albeit without detailed backing data.
- Embedding Models to Content-genie in a Bottle: OpenAI's fresh embedding models and API enhancements were shared, flaunting lower GPT-3.5 Turbo pricing and new tools. Simultaneously, the Genie method for high-quality data generation in content-grounded tasks captured attention, suggesting significant advances in LFQA, summarization, and extraction capabilities.
- Tech Titans and Ingenious Interventions: Spirited discussions included GPT-2 inference experiments on WebGL, challenges in cloning LLaMA with a Mixtral model, and debating model efficiency with an eye on phi2 optimizations. GPU harnessing for ML stood out as a creative pursuit, reimagining gaming hardware for scientific exploits.
OpenAccess AI Collective (axolotl) Discord Summary
- Fine-Tuning with a Norwegian Twist: @henriklied is fine-tuning Mistral on a dataset of 100k articles for Norwegian title generation, with a recommendation to cap epochs at 4 to prevent overfitting. Meanwhile, shareGPT's handling of extended conversations shows limitations by design, sparking a preference debate between ChatML and BlockML formats for conversation management.
- Merging Models, Qlora on the Rise: Training QLoRA models without relying on Bitsandbytes is explored, with alternatives like fp16 and AutoAWQ for quantization highlighted. A link to a GitHub guide for merging trained QLoRA models into base models was shared, alongside a quantization advocacy tweet by Tim Dettmers.
- Dataset Developments Stir Excitement: A new dataset on Hugging Face aimed at training the Snorkel model and a Mistral 7 fine-tuning announcement indicate significant advancements. ALPCA's 34 percent figure improvement over old GPT-4 metrics hints at considerable performance enhancements.
- DPO Discussions Dive Deep: Queries around the DPO Training Plots and dataset compounding issues imply a community need for clearer documentation and troubleshooting guides in the realm of DPO Training and dataset integrity.
- Showcase Shoutout: A YouTube video by pradeep1148 in the community-showcase hints at community engagement and project sharing, although specifics were not outlined.
LM Studio Discord Summary
- Proxy Solutions and TTS Enable Seamless LM Studio Use: Users experiencing limitations with LM Studio due to regional blocks on HuggingFace have found workarounds through proxy settings and a Text-to-Speech interface, enhancing accessibility and interaction with models.
- System Updates Fix Model Compatibility Issues: Update your C++ redistributables to resolve model loading errors in LM Studio, as advised by @heyitsyorkie, following difficulties with models like Stable Code, Deepseek, and Codellama.
- Model Operation Queries Span Versatility and Performance: From running multiple LM Studio instances for parallel modeling to examining the best GPU options for AI work, discussions reveal a keen interest in enhancing model performance and efficiency. An RTX 3090 and M2 Mac Studio are among the recommended hardware for handling large language models.
- Bug Reports Propel Improvements in MoE Models: Users reported bugs affecting MoE model settings in LM Studio, notably when transitioning between 4X and 2X MoE models, sparking an immediate investigation by the development team to enhance user experience and model functionality.
- Model Exploration and Integration Challenges Highlighted: Experimentation with large models like mixtral8x7B using an RTX 3090, and frustrations with outdated APIs, underscore community efforts to push the boundaries of AI model utility and application in projects, despite facing integration and fine-tuning hurdles.
Mistral Discord Summary
- GPU Rentals and Free Resources for AI Work: @mrdragonfox highlights runpod, vast, lambda, and Kaggle as key platforms offering GPU rentals by the hour and free resources, respectively, to facilitate AI summarization and development efforts. Kaggle provides 2 T4 GPUs for 30 hours per week, a boon for developers seeking computational power without heavy investment.
- Evaluating LLMs Beyond Traditional Metrics: @adrienbufort criticizes traditional translation metrics like BLEU and Rouge for being inadequate in evaluating large language models (LLMs), advocating for an ELO-like evaluation system alongside MMLU and Alpaca Eval as superior methodologies. These offer closer alignments to human preferences and intra-LLM evaluation capabilities, respectively.
- Innovations in AI Browser Interactions: @sublimatorniq introduces a groundbreaking tool enabling AI to reference DOM nodes, enhancing web content interaction by making browser queries more contextually aware. This tool is designed for compatibility with MistralAI, indicating a significant leap in AI-mediated web navigation.
- Quantization and Optimization Strategies for AI Models: Community discussions weighed memory requirements and efficiency for Mistral's 4-bit inference, citing a roughly 26GB memory footprint. @mrdragonfox recommends exllamav2 for its superior memory efficiency over traditional 4-bit transformers, suggesting it as an effective strategy for optimizing AI model performance.
- API Limitations, Bugs, and Hosting Insights Disclosed: The community has encountered various issues ranging from early stopping dilemmas to a bug with the "max_tokens" parameter causing a 500 error when set to 1, with a specific GitHub issue outlined here. Additionally, pertinent inquiries about the Mistral API's hosting location revealed its placement in Europe on Azure in Sweden, proving crucial for developers considering data locality and compliance.
Eleuther Discord Summary
- AutoGluon Steps In for Meta's SAM: After failing to find a fine-tuning codebase for Meta's SAM model, @the_alt_man turned to AutoGluon due to its support for Lightning and compatibility with GPUs, though not with TPUs.
- Innovating Beyond Infiniband for Multi-node Training: Sparked by @elyxlz's quest for multi-node training sans Infiniband, a discussion evolved around a strategy involving periodic merging during training steps, referencing the DiLoCo paper on distributed optimization.
- Byte-Level Transformers and Proxy-Tuning Dive Deep: Discussions ranged from the efficiency of byte-level transformers like ByT5 and unfair sequence-length comparisons in the MambaByte paper, to sharing a new lightweight proxy-tuning method for LLMs as detailed in a recent publication. This encapsulates the growing interest in optimizing and understanding LLMs' mechanisms and efficiency.
- Chess AI Evolution Imagined: A blend of speculation and current state-of-the-art review highlighted discussions on chess AI, including Stockfish's dominance and theoretical reflections on chess AI advancements by 2024 as envisaged in a chess.com blog.
- GPT-NeoX, PyTorch, and CUDA Wrestle with Testing: @catboy_slim_ raised concerns regarding testing challenges after updating Python, PyTorch, and CUDA versions, creating an issue on GitHub about pytest failures. Meanwhile, QLoRA's tuning confusion within neoX 20b was clarified, marking the importance of directing queries to the appropriate channels for library-specific issues.
Perplexity AI Discord Summary
- Perplexity Pro Unleashes Powerful Features: Users discussed the benefits of Perplexity Pro over the regular version, highlighting features such as unlimited Copilot queries and the ability to attach images and files with models like Claude 2.1 and GPT-4. Details were supported with a link to the Perplexity Pro FAQ.
- Privacy Safeguards Queried Amidst Data Retention Concerns: Discussion among users like @emisaurus_hex and @firesonwires arose regarding Perplexity's privacy policy on retention of search history and personal information until account deletion, prompting an expert to clarify that deleted threads are gone from servers after 30 days.
- Debate Over Data Storage Ethics: Users engaged in a philosophical discussion about the implications of aggregated search data for privacy, with varying opinions on the value and potential misuse of such data.
- Community Tackles Technical Support Together: The community, including Perplexity representatives, came together to assist users with technical issues, ranging from recovering bookmarked threads to inquiries about file upload limits.
- Exploring Perplexity's Educational and Professional Applications: In the #sharing channel, users shared experiences using Perplexity in diverse fields such as content creation, Smartsheet learning, and astronomy education, with one user emphasizing Perplexity's superiority over Google Search and OpenAI for content generation highlighted in a YouTube video.
- Technical Discrepancies Between Perplexity's Website and API: The #pplx-api channel included discussions on the performance differences between the website and API versions of a tool, with a user seeking clarifications on labs versus API parameters and another user reporting an unresolved double charge issue with [email protected].
HuggingFace Discord Summary
- HuggingFace's Community Expands and Innovates: The community is experiencing growth with more creators joining, highlighted by the launch of the Lightweight Open LLM Leaderboard and discussions around AI interpretability in a survey by H. Luo and L. Specia. The innovative AI Alignment model, TenyxChat-8x7B-v1, was also introduced, showcasing preference tuning with a high MT-Bench score. Practical utility projects like CheXRay and ZeroGPU were spotlighted, indicating the diverse explorations within HuggingFace's community.
- Real-World AI Discussions Focus on Practicalities: Community members shared insights from selecting models for RTX 3080 GPUs, to feature extraction techniques and the realities of model pretraining. The conversation also touched on the essentials of data management for training and evaluation, alongside the benefits of leveraging community blog posts for wider outreach on projects, such as the upcoming wav2vec2-bert model release.
- Dataset Evaluation Frameworks and Data Quality: The community raised the need for comprehensive frameworks to evaluate datasets used in training LLMs, suggesting an area ripe for development given the current focus on model rather than dataset evaluation.
- Flutter and OpenAI Innovations Lead Cool Finds: Fascination with HuggingFace's dataset documentation analysis and a detailed study titled "Binoculars" for detecting LLM-generated text were among captivating discoveries. The integration efforts of ONNX runtime with Flutter and a specific Flutter SDK for HuggingFace Inference APIs underscore the community's efforts to bridge AI with application development.
- WhisperSpeech and AI Development Highlights: The WhisperSpeech space signals a leap in multilingual TTS, while an error in the CheXRay demo pointed to challenges in deploying AI in medical imaging. There's anticipation for a Nemo Model blog post and a call for broader engagement through community-contributed content.
- Artificial Intelligence and Machine Learning Breakthroughs: Google's Lumiere showcased an advanced approach in T2V models, while a Medium article on benchmarking received praise. In NLP, the TorToiSe TTS from Coqui demonstrated both quality and potential speed improvements.
- Technical Insights from #diffusion-discussions and #computer-vision: Questions regarding loading LoRA models locally and the inquiries about computational vision techniques emphasized the technical depth of discussions in the HuggingFace community.
- Gradio 4.16 Launch Boosts Developer Experience: The latest Gradio release champions features aimed at streamlining ML app development, like native support for Polars Dataframe, enhancing interactivity and performance in ML deployment.

Each bullet encapsulates thematic discussions and announcements spanning various HuggingFace channels, reflecting the guild's vibrant engagement with cutting-edge AI technologies and community-driven initiatives.
LAION Discord Summary
- LAION2b-en Aesthetic Scores Go Dark: The LAION2b-en aesthetics scores dataset sought by @ppwwyyxx is currently unavailable on Hugging Face, as it was disabled by the author's request, with the community encouraged to stay tuned for updates.
- A Leap Toward Open Source Voice Chat Innovations: A new voice chat interface demo featuring Whisper and WhisperSpeech along with an open-source LLM (Dolphin 2.6 Phi-2) was shared by @jpcl_, promising lower latency and more natural conversation flow. Interested parties are invited to collaborate to further enhance the project, details of which are available on Hacker News.
- VA's AI Tech Sprint Calls for Competitors: The VA's AI Tech Sprint focuses on Ambient Dictation for Clinical Encounter Notes, with a $300K first prize, eyeing advancements in AI-driven healthcare. U.S. residents are encouraged to participate, with more information available on Challenge.Gov.
- Byte-Level Transformers Show Promising Future: Byte-level transformers may herald a new frontier in AI capabilities, supported by cautiously optimistic insights from @marianbasti after a review of a pertinent research paper.
- Innovative Developments in AI Image Generation: Breakthrough projects like RPG-DiffusionMaster and InstantID set new standards in text-to-image diffusion and ID-preserving generation, illustrating the rapid pace of advancement in this area. Additionally, a discussion highlighted the non-diminishing potential for growth in AI image generation, referencing the continual emergence of new papers and tools in the field.
LlamaIndex Discord Summary
- LLMCompiler Webinar Promises Parallel Function Calling Revolution: A last-minute webinar featuring authors Sehoon Kim and Amir Gholami, discussing the LLMCompiler, aims to highlight its superiority over sequential frameworks like ReAct, focusing on long-term planning and parallelization capabilities. Relevant resources include the LLMCompiler paper, LlamaPack, and a GitHub notebook.
- LlamaIndex's OSS and Partnerships Expand Its Ecosystem: LlamaIndex launches an Open Source Software (OSS) repository with a guide on building a Slack bot, partners with Zilliz Universe to enhance its platform with a scalable retrieval service, and announces Day 0 support for OpenAI's latest embedding models. Additional offerings include effective prompting and a new TypeScript version (LlamaIndex.TS version 0.1.0), offering ease of use and broad support capabilities.
- Dynamic Discussions Shape LlamaIndex's Community: Community members probe the availability of TextGenerationInference LLM for LlamaIndex, explore customizing the Chat Engine with similarity_top_k parameters, address challenges in implementing insurance domain queries, and express enthusiasm for new embedding models and API updates from OpenAI. Custom prompts guidance for extending contexts in queries signifies the community's active engagement in refining LlamaIndex's usability.
- Cutting-Edge AI Discussions Flourish Amongst Peers: Zep emerges as a noteworthy tool for enhancing chatbots with production-grade capabilities and entity extraction, while LlamaIndex is positioned as a flexible data orchestration tool comparable to Amazon Kendra. A Substack article showcases a self-learning RAG that utilizes recursive retrieval, automated knowledge graph creation, and memory/multi-hop reasoning, pushing boundaries in AI applications.
- Continuous Integration and Learning in AI Applications: The sharing of resources, queries, and solutions within the LlamaIndex guild underscores a vibrant community dedicated to ongoing learning, integration of cutting-edge tools, and exploration of advanced AI techniques. These discussions reveal a concerted effort to harness the full potential of AI technologies like LlamaIndex and Zep, in various domains including bots, data retrieval, and self-learning knowledge graphs.
Latent Space Discord Summary
- MORPHEUS-1 Dreams Big: Prophetic AI announces MORPHEUS-1, the first multi-modal generative ultrasonic transformer for lucid dream induction, targeting a beta release in Spring 2024. The reveal tweet promises groundbreaking capabilities.
- Rapid YAML Custom Tags Experimentation: go-go-golems showed off an ambitious project milestone, achieving 5k lines of code in just 4 days, as part of an experiment with yaml-custom-tags. Their progress can be tracked on GitHub.
- Martianās LLM Evaluation Tool Launches: Martian introduced a tool to dynamically evaluate which LLM inference product to use based on cost, throughput, and time to first byte (TTFT), across providers like Anyscale and Fireworks. The launch was detailed here, with accompanying evaluation methodology.
- LLM Paper Club Asia Edition Kickoff: The LLM Paper Club expands to Asia with a session on the seminal paper "Attention Is All You Need". Interested persons are encouraged to sign up and engage via a specific Discord link.
- Pythia Paper Next on US Club's Agenda: The US paper club's next discussion will focus on the Pythia paper, which explores metrics for analyzing Large Language Models (LLMs) throughout training and scaling. The paper and authors are detailed in this link.
DiscoResearch Discord Summary
- Mixtral Finetuning Gets a Deep Dive: In mergekit's GitHub issue, @philipmay initiated a discussion on the finetuning potential of Mixtral models, exploring the "hidden" and "random" options. The conversation underscored auxiliary loss as crucial for effective MoE training.
- Rethinking Data Quality for AI Models: A new study highlighted in the community argues against traditional data quality filtering methods, suggesting a model-aware approach for data selection could lead to better performance.
- Kahneman-Tversky Optimisation (KTO) Explored: The community discussed KTO, comparing it with Direct Preference Optimization. This method, simplified via binary good and bad training examples, was evaluated for its implementation viability in contextual AI and production model updates, with resources like Hugging Face's documentation serving as guides for potential adoption.
- OpenAI's Strategic Model and Pricing Updates: Significant updates were shared regarding OpenAI's launch of GPT-4 Turbo and a price reduction for GPT-3.5 Turbo as detailed in an official tweet. These changes signal strategic shifts in embedding options and pricing, indicating an evolving landscape for AI model accessibility.
- Embedding Innovations and German AI Model Enhancements: The community discussed the upcoming release of a new German Jina model for ranking tasks and shared insights into using Mixtral for advanced question generation and embedding development. A highlight was the new embedding models and API updates from OpenAI, hinting at expanded capabilities in multilingual support. Further, the Genie method, suggested by a shared paper, promises a novel approach for generating high-quality data, potentially advancing the creation of more effective question-answer pairs or summaries.
LLM Perf Enthusiasts AI Discord Summary
- OpenAI Revamps with Fresh Embedding Models and API Tools: OpenAI has launched a new wave of embedding models, updated GPT-4 Turbo, and moderation tools, aiming at improved performance and cost efficiency for developers. This suite includes two new embedding models, enhancements to GPT-4 Turbo, a novel text moderation model, and forthcoming lower prices for GPT-3.5 Turbo, with a clear stance on not using customer data for model training.
- Community Buzzes Over New Embedding Capabilities: The introduction of new embedding models and updates to GPT-4 Turbo stirred discussions among community members, with specific excitement around the abbreviated embeddings feature. Additionally, comparison of the new large embedding model against bge-large revealed slight performance supremacy, underscoring OpenAI's advancements.
- Upgrade Conversations Sparked by OpenAI Updates: With OpenAI's latest offerings, community members are considering system upgrades to leverage the new embedding models. The debate includes potential cost savings from the newer, larger embedding model due to dimension shortening, comparing embedding expenses against savings on vector database storage.
- Efficiency and Cost Implications Discussed: Conversations delved into the efficiency and potential cost benefits of OpenAI's newest embedding model, weighing the performance against storage and API costs. The dialogue highlighted a leaning towards the v3-large model for its improved performance and the promise of reduced storage costs, mirroring a broader interest in maximizing both operational efficiency and cost-effectiveness.
- Misplacement of Announcement Draws Minor Attention: A notable mention was a suggestion to redirect the discussion on the new embedding models and API updates to a more appropriate channel, presenting a minor hiccup in communication flow within the community. This points to a procedural note on channel relevance and highlights the importance of targeting the right audience within communal platforms.
LangChain AI Discord Summary
- LangChain Powers Up Projects and Personalities: @quarknova, an ENS student, is exploring LangChain for a project, raising questions about GitHub versus commercial versions. Meanwhile, creating AI personalities like "an Elon Musk AI" involves finetuning, with a short course recommended for practical insights.
- Rapid Chatbot and Data Generation Developments: Users celebrated creating a web-search chatbot using LangChain and Streamlit for its speed and ease, while also discussing the role of LLMs in generating synthetic data for machine learning training and RAG generation. For PARQUET file enthusiasts, solutions involve pandas and DataFrameLoader for integration into LangChain (a short sketch follows this summary).
- Diving Deeper with LangServe: LangServe enthusiasts are directed to examples on GitHub, including tips for constructing custom tools or agents with an emphasis on LCEL and LangGraph for added expressiveness, available here and here. Also, troubleshooting stream response issues in agent implementations highlighted ongoing discussions.
- Exploring Context Awareness and SQL Integration: Curiosity about LangChain AI's use of webpage context was raised by dejoma, indicating interest in the AI's context awareness capabilities. johnny2x2 shared a breakthrough using LangChain for LLM automation in manufacturing to analyze late customer orders leveraging ERP SQL data, with the implementation of Mixtral 7B v2 5Q 32K for effective database management and specialized task loops.
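For the pandas + DataFrameLoader route mentioned above, a minimal sketch looks like the following; the file name and text column are placeholders, and the import path assumes the langchain_community package layout:

```python
import pandas as pd
from langchain_community.document_loaders import DataFrameLoader

# Placeholder file and column names - point these at your own parquet data.
df = pd.read_parquet("articles.parquet")
loader = DataFrameLoader(df, page_content_column="text")

documents = loader.load()  # one Document per row, ready for indexing or RAG
print(len(documents), documents[0].metadata)
```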
Datasette - LLM (@SimonW) Discord Summary
- Major LLM Upgrade on the Horizon: @simonw announced an upcoming LLM update that promises substantial improvements to the openai library, with comprehensive testing instructions listed on GitHub.
- Sneak Peek into LLM's Future Developments: For a detailed view of what lies ahead, @simonw shared insights through the 0.13 Milestone on GitHub, highlighting the planned enhancements for the command-line access to large language models.
- Community Call to Tackle Readline Bug: @simonw is calling for community support to fix a readline bug in LLM chat that causes arrow keys to produce ANSI codes rather than navigating text, with more information available on GitHub.
Skunkworks AI Discord Summary
- Curiosity Stirs in Skunkworks: Arielnlee inquires if there's ongoing work on bakklava-2, sparking discussions on the future developments in the bakklava-1 channel.
- Mystery Link Dropped: Pradeep1148 shares an enigmatic YouTube video in the #off-topic channel, content and context unspecified.
PART 2: Detailed by-Channel summaries and links
TheBloke ▷ #general (1212 messages🔥🔥🔥):
- UnSloth Heading Towards Multi-GPU Support: UnSloth's development is actively discussed, with a hint that limited multi-GPU support may be introduced months later. The OSS version is notably targeted at Google Colab beginners, sparking discussions on its ease of use for newcomers in AI research.
- AI-Developed Discord Bots and Fine-Tuning Chatter: Members, including @flail_., discuss their adventures into Discord bot making with Python, acknowledging the convenience of Python libraries despite personal preferences for other languages. There's also a light-hearted debate on the professional credentials one could claim from fine-tuning language models.
- Academicat's Use in Research and Ethical Funny Side: @kaltcit shares insights on using mouse fingers for PCR in biomedical research, humorously noting the ability to crack jokes post-amputation. This opens an ethical and scientific discussion on research methodologies and the use of animals.
- Concerns About the Open LLM Leaderboard: Discussions reveal skepticism towards the Open LLM Leaderboard, particularly regarding the quality of models it promotes. @fiddlenator criticizes the leaderboard for endorsing "trash merge models," sparking a conversation on the validity and evaluation of such models.
- Tiny Llama's Nintendo Switch Execution Sparks Interest: Members express astonishment and curiosity as @kalomaze shares links to videos purportedly showing AI models like Tiny Llama and Mistral running on a Nintendo Switch. This unique execution leads to discussions on the potential for lightweight model deployment on unconventional hardware.
Links mentioned:
- no title found: no description found
- DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence: The rapid development of large language models has revolutionized code intelligence in software development. However, the predominance of closed-source models has restricted extensive research and dev…
- import sysimport osfrom tqdm import tqdmsys.path.append(os.path.dirname(os - Pastebin.com: Pastebin.com is the number one paste tool since 2002. Pastebin is a website where you can store text online for a set period of time.
- Tails - Home: no description found
- How to Install NVIDIA Drivers on Rocky Linux 9 or 8 - LinuxCapable: Learn to install NVIDIA Drivers on Rocky Linux 9 or 8 using the command line terminal and Nvidia Cuda REPO for the latest version.
- God helmet - Wikipedia: no description found
- LoneStriker/OpenHermes-2.5-Mistral-7B-4.0bpw-h6-exl2 · Hugging Face: no description found
- Marvelous Dc Gotham GIF - Marvelous Dc Gotham Gotham Tv - Discover & Share GIFs: Click to view the GIF
- How I Won Singapore's GPT-4 Prompt Engineering Competition: A deep dive into the strategies I learned for harnessing the power of Large Language Models (LLMs)
- GitHub - itsme2417/PolyMind: A multimodal, function calling powered LLM webui.: A multimodal, function calling powered LLM webui. - GitHub - itsme2417/PolyMind: A multimodal, function calling powered LLM webui.
- GitHub - facebookresearch/audio2photoreal: Code and dataset for photorealistic Codec Avatars driven from audio: Code and dataset for photorealistic Codec Avatars driven from audio - GitHub - facebookresearch/audio2photoreal: Code and dataset for photorealistic Codec Avatars driven from audio
- GitHub - facebookresearch/Qinco: Residual Quantization with Implicit Neural Codebooks: Residual Quantization with Implicit Neural Codebooks - GitHub - facebookresearch/Qinco: Residual Quantization with Implicit Neural Codebooks
- Stanford Hypnosis Integrated with Functional Connectivity-targeted Transcranial Stimulation (SHIFT): a preregistered randomized controlled trial - Nature Mental Health: Investigators present findings from a double-blind randomized controlled trial of personalized stimulation of the left dorsolateral prefrontal cortex using transcranial magnetic stimulation to increas…
TheBloke ▷ #characters-roleplay-stories (74 messages🔥🔥):
- Technical Inquiries on Model Configurations: Users in the channel discussed the configurations of various models like exllama, bagel, and dolphin, focusing on aspects like rope_theta and whether to use a sliding_window. For example, @dreamgen pointed out differences in configuration files for models, highlighting discrepancies in rope_theta values and the use of sliding_window (Hugging Face Bagel Config, Hugging Face Dolphin Config). A short config-inspection sketch follows this list.
- Discussions on Role-Playing Models: Various users shared and sought advice on the best role-playing models with 7 billion parameters, with @kalomaze recommending models like Kunoichi DPO v2 and Fett-uccine (Fett-uccine on Hugging Face) while also discussing suitable quantization options for different VRAM capacities.
- Concerns About Sensitive Content Models: The conversation briefly touched upon the use of models for generating content that could be considered sensitive, with a general consensus towards responsible use and the application of existing models through careful prompting, highlighted by @c.gato and @superking__.
- Clarifications and Corrections: Users like @jondurbin provided clarifications regarding their use of model configurations inherited from base models, adding context to the discussion around customizing models for specific purposes (Mistral base model config).
- Token Streaming and Text-to-Speech (TTS) Integration Efforts: @mrdragonfox and @stoop poops shared their progress and challenges in integrating token streaming and TTS capabilities into bots, aiming for real-time implementation and exploring efficient APIs and engines like realtimeTTS with the coqui engine.
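To make the rope_theta / sliding_window comparison above concrete, here is a minimal inspection sketch (not code from the channel): it pulls each model's config from Hugging Face and prints the two fields, using the repo ids linked in this section; a model without a given field simply prints None.

```python
from transformers import AutoConfig

# Compare rotary base and attention-window settings across the configs discussed above.
for repo_id in [
    "mistralai/Mistral-7B-Instruct-v0.2",
    "jondurbin/bagel-dpo-7b-v0.1",
    "cognitivecomputations/dolphin-2.6-mistral-7b",
]:
    cfg = AutoConfig.from_pretrained(repo_id)
    print(repo_id,
          "rope_theta =", getattr(cfg, "rope_theta", None),
          "| sliding_window =", getattr(cfg, "sliding_window", None))
```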
Links mentioned:
- Kronk Its All Coming Together GIF - Kronk Its All Coming Together - Discover & Share GIFs: Click to view the GIF
- config.json · mistralai/Mistral-7B-v0.1 at main: no description found
- config.json · cognitivecomputations/dolphin-2.6-mistral-7b at main: no description found
- Epiculous/Fett-uccine-7B-GGUF at main: no description found
- Release Quadratic Sampling Test Build (koboldcpp) · kalomaze/koboldcpp: Replacement for the last idea (Smooth Sampling) with a different scaling mechanism. The idea behind it is to simplify sampling as much as possible and remove as many extra variables as is reasonab…
- config.json · jondurbin/bagel-dpo-7b-v0.1 at main: no description found
- config.json · mistralai/Mistral-7B-Instruct-v0.2 at main: no description found
TheBloke ▷ #training-and-fine-tuning (20 messages🔥):
- Crafting a Character for Chatbots: @superking__ suggested writing examples of how a character would respond to different user inputs and training the model on this dataset. They further mentioned the possibility of asking ChatGPT to help generate this dataset. (A minimal dataset sketch follows this list.)
- Quantity Matters in Training Chatbots: In a discussion about the number of examples needed for training a chatbot, @amogus2432 mentioned that between 10 and 100 examples should suffice for style transfer in training a qlora. However, @dirtytigerx countered that for Samantha-level performance, one would need around 6k multiturn conversation samples, suggesting that with only 100 samples, one might end up overfitting.
- Personal Touch in Chatbot Conversations: @lordofthegoons expressed a desire to create a chatbot with a specific persona and consistent conversation style, akin to the Samantha model. They voiced concerns about generating a synthetic dataset that achieves this without seeming repetitive or limited.
- Exploring Cheaper Alternatives for Training Chatbots: Discussing the potential cost of using the GPT-4 API for generating a dataset, @dirtytigerx recommended exploring local Large Language Models (LLMs) as a more economical option. They mentioned that using platforms like runpod for experimenting could be cheaper and more efficient than relying solely on services like ChatGPT, which may encounter rate limits.
- Venturing into Financial Advisor Chatbots: @VR inquired about creating a personal financial investment advisor chatbot, contemplating the use of prompt tuning and fine-tuning on financial documents and datasets. They sought advice on how to effectively deploy a model within the constraints of a 24GB GPU and leverage the latest stock prices, summaries, trends, and expert analyses.
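To illustrate the "write examples of how the character would respond" advice above, here is a small, hypothetical sketch of what such a style-transfer dataset could look like on disk; the persona lines, file name, and message schema are invented for illustration, not taken from the channel.

```python
import json

# A few hand-written persona examples; per the discussion above, style-transfer QLoRA runs
# have been attempted with anywhere from ~10-100 of these up to ~6k multiturn samples.
examples = [
    {"conversations": [
        {"role": "user", "content": "Rough day. Can we just talk?"},
        {"role": "assistant", "content": "Of course. I'm here - take your time and tell me what happened."},
    ]},
    {"conversations": [
        {"role": "user", "content": "Do you ever get bored?"},
        {"role": "assistant", "content": "Never with you around. What's one thing you learned today?"},
    ]},
]

with open("persona_style_dataset.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```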
TheBloke ▷ #model-merging (19 messages🔥):
- Optimal Weight Hypothesis Discussed: @sanjiwatsuki shared a hypothesis that maintaining a weight slightly over 1.0 might be optimal for model performance due to the TIES resolution process causing some of the effective weight to drop. The idea is to compensate for this anticipated drop.
- Uncertainty Over Negative Weights: When @kquant asked if negative numbers break the script, @sanjiwatsuki expressed uncertainty but suggested that the code might handle it without issues based on their rough memory.
- Exploration of Censored Model Assimilation: @kquant mentioned a curiosity about experimenting with super censored models and whether it's possible to assimilate models selectively, taking mostly desired features.
- Selective Merging with DARE and SLERP: @kquant proposed using techniques like DARE or SLERP for merging models with high performance in different areas, such as ARC and MMLU, to optimize for multiple strengths without simply averaging scores. (A small SLERP sketch follows this list.)
- Intrigue Over Successful Overfit Model Merging: @kquant shared surprise at two overfit models maintaining their test positions after being merged through SLERP, questioning the expected negative impact of overfitting on such operations.
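For readers unfamiliar with SLERP in this context, the sketch below shows the basic spherical-interpolation step applied to a pair of weight tensors. It is a generic illustration under the usual definition, not the merge tooling discussed in the channel (which applies this per layer, with more care).

```python
import torch

def slerp(a: torch.Tensor, b: torch.Tensor, t: float, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors."""
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    cos_omega = torch.clamp(
        torch.dot(a_flat / (a_flat.norm() + eps), b_flat / (b_flat.norm() + eps)), -1.0, 1.0
    )
    omega = torch.arccos(cos_omega)
    so = torch.sin(omega)
    if so.abs() < eps:  # nearly colinear weights: fall back to plain linear interpolation
        return (1.0 - t) * a + t * b
    return (torch.sin((1.0 - t) * omega) / so) * a + (torch.sin(t * omega) / so) * b

# t = 0.5 blends two checkpoints' tensors evenly; merge tools expose this per layer.
merged = slerp(torch.randn(1024, 1024), torch.randn(1024, 1024), t=0.5)
```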
TheBloke ▷ #coding (1 messages):
- LangChain Fine-Tuning Query by Newbie: @nandavikas inquired about fine-tuning Llama2 using LangChain for extracting specific information from PDFs as a dictionary. They shared their experience with PyTorch and sought guidance on a similar process or useful documentation within LangChain's ecosystem.
OpenAI ▷ #ai-discussions (35 messages🔥):
- GPT-4 Speed Discrepancy Concerns: @romansh2302 expressed frustration with the long processing times of GPT-4-1106-preview for document analysis through the API, experiencing delays of 3-7 minutes compared to faster speeds on the web version. @rendo1 and @lugui explained that performance variances could be due to API usage peaks or server resource allocation, suggesting that there isn't a straightforward fix beyond possibly upgrading service plans for dedicated capacity.
- Queries on GPT-4 Turbo Stability: When asked if the GPT-4 Turbo stable version will address the speed issues, @elektronisade clarified that GPT-4 Turbo doesn't inherently support direct file analysis, indicating a possible misunderstanding of feature availability.
- Alternatives and Cost Inquiries for Enhanced API Service: @romansh2302 considered using GPT-3.5 Turbo as an alternative for document analysis but found it to be less effective. Dialogues with @lugui led to the revelation that dedicated API capacity might be available but at undefined costs tailored to individual use cases, primarily targeting corporate clients.
- Unusual Activity and Error Troubleshooting: @hellomyfriend1576 encountered an error message about "unusual activity" from their system when using GPT-4, with attempts to resolve through clearing cookies and network changes. @lugui and @og_tort speculated it could be related to VPN use, the nature of prompts, or account sharing practices.
- Discussion on Dall-E Misinterpretations: @alwayspercipient noted Dall-E's frequent misspellings in image creations, with @muyfashionista suggesting that this issue might be less pronounced with shorter text inputs and providing links to community discussions and tips for minimizing grammatical errors in generated images.
Links mentioned:
TuringsSolutions/PFAF750 · Datasets at Hugging Face: no description found
OpenAI ▷ #gpt-4-discussions (105 messages🔥🔥):
- Clarification on "Always Expand Code Output" Feature: @angry_coder inquired about the "Always expand code output" setting, leading to @darthgustav. clarifying that it ensures code blocks are wrapped for easier reading. After testing, @darthgustav. confirmed it pertains to wrapped code blocks, enhancing readability.
- Exploring Python Code Importation: Users @bambooshoots and @darthgustav. discussed the possibility of importing Python modules by uploading a zip file and adjusting the system path. Despite initial security concerns, they speculated about its potential utility and extended functionality within OpenAI restrictions. (A small sketch of the approach follows this list.)
- CustomGPT Edits Require New Conversations: @elegante94 asked if edits to a CustomGPT would reflect in active sessions, to which @darthgustav. responded that a new conversation is necessary to experience changes. This was a point of confusion for @elegante94, who had been making iterative edits under this misconception.
- Optimizing Prompt Language for GPT: @elegante94 sought advice on whether attaching images or using complex wording would improve GPT outputs for creative imagery. @darthgustav. advised that GPT cannot interpret images in prompts and recommended precise language for better outputs, while also acknowledging the potential exploration of themes like multi-agent interactions via Microsoft Autogen and CrewAI.
- Challenges with GPT Bot Updates and Mimicry Restrictions: @rodney.leonardo experienced difficulties saving updates to a GPT bot designed to serve as a product design assistant. @darthgustav. suggested troubleshooting steps and noted restrictions against mimicking living individuals, which led to a realization that naming specific living persons like "Jony Ive" may block the bot from being saved.
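The zip-upload idea above boils down to unpacking the archive inside the sandbox and putting the extracted folder on sys.path. This is a hedged sketch of that pattern; the /mnt/data upload location and the archive name are assumptions, not details confirmed in the discussion.

```python
import sys
import zipfile

# Assumed upload path and archive name - adjust to wherever the uploaded file actually lands.
archive = "/mnt/data/my_modules.zip"
target = "/mnt/data/my_modules"

zipfile.ZipFile(archive).extractall(target)  # unpack the uploaded modules
sys.path.append(target)                      # make them importable in later cells

# import my_module  # now resolvable, assuming my_module.py was inside the zip
```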
OpenAI ▷ #prompt-engineering (558 messages🔥🔥🔥):
- Exploring Paragraph Chunking and NLP for Sermon Transcription: @doublefelix is working on a project involving the transcription of sermons, seeking to split large blocks of text (up to 50KB files) into manageable paragraphs and correct grammar issues. They explore the use of GPT-3.5 and NLP techniques as potential solutions, facing challenges with input size limits and finding adequate strategies for effective paragraph chunking.
- The Path Towards an Efficient Workflow: After multiple trials with GPT-3.5, including varied prompt strategies and attempts at leveraging API functionalities, @doublefelix encounters obstacles in ensuring that the AI properly acknowledges and processes the entirety of the input without hallucinations or significant errors.
- Considering GPT-4's Capabilities for Enhanced Context Management: @darthgustav suggests utilizing GPT-4 Turbo, which can handle larger context windows and leverage Python tools for semantic analysis and paragraphing, thereby potentially bypassing the limitations encountered with GPT-3.5. This approach may offer a more efficient workflow for processing and splitting the transcription files.
- Reflecting on Project Costs and Constraints: Concerns arise regarding the increased costs associated with moving to GPT-4 Turbo for processing the transcriptions. @doublefelix expresses a desire to find a balance between functionality and affordability, considering both the manual labor involved and the financial implications of using a more advanced AI model.
- Exploration of Alternative NLP Tools and Final Thoughts: As @doublefelix weighs their options, including a potential fallback plan using GPT-4 Turbo, they also plan to explore NLP packages such as semantic-text-splitter for a possible solution to their text processing needs. Gratitude is expressed for the discussion and insights provided, highlighting the complexities of prompt engineering and the nuances of leveraging AI models for specific project goals. (A simple pre-chunking sketch follows this list.)
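Independent of which model ends up doing the correction, the input-size problem above can be eased by pre-chunking transcripts at sentence boundaries before any API call. The sketch below is a plain-Python illustration of that idea (it is not the semantic-text-splitter API, and the 3,000-character budget is an arbitrary assumption):

```python
import re

def chunk_text(text: str, max_chars: int = 3000) -> list[str]:
    """Greedily pack sentences into chunks that stay under a per-request size budget."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks

# Each chunk can then be sent for paragraphing and grammar correction in its own request.
```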
Links mentioned:
How do you maintain historical context in repeat API calls?: Each time I make a call to the API it starts off with no prior context, unlike the chat.openai.com scenario. Is there a way to maintain state of the model during a session? response = openai.Completi…
OpenAI ▷ #api-discussions (558 messages🔥🔥🔥):
- Exploring GPT's Limitations on Large Texts: @doublefelix ventured into a discussion on how to effectively split a large block of text into smaller, manageable segments for paragraphing and grammar correction using GPT. Initially attempted with GPT-3.5, they faced challenges with the AI ignoring parts of the text or hallucinating content.
- Semantic Clustering and Paragraphing Strategies: @darthgustav suggested multiple strategies to aid @doublefelix in his endeavor, including semantic clustering for paragraph breaks and leveraging Custom GPT or GPT-4 Turbo's advanced capabilities and Python Tool for more accurate text processing.
- The Cost of Processing Large Texts: The conversation also touched upon the cost implications of processing large volumes of text. @doublefelix discovered that the processing costs with GPT-3.5 were higher than initially anticipated, prompting a consideration of GPT-4 Turbo despite its higher cost.
- NLP vs AI in Text Segmentation: There's an ongoing debate whether Natural Language Processing (NLP) tools or AI (specifically GPT models with enhanced context windows and Python capabilities) would serve better in segmenting and correcting large transcription files. @doublefelix is considering exploring NLP tools, found a potential package on PyPI, and remains open to utilizing GPT-4 with structured instructions for improved efficiency.
- A Quest for Automated Text Structuring: Towards the end, @doublefelix reflected on the journey, acknowledging the challenges faced and the insights gained from interacting with GPT models and structured instructions for segmenting large text files. Though some progress was made with GPT-3.5, there remains a pull towards exploring GPT-4's capabilities or NLP tools for a more automated approach.
Links mentioned:
How do you maintain historical context in repeat API calls?: Each time I make a call to the API it starts off with no prior context, unlike the chat.openai.com scenario. Is there a way to maintain state of the model during a session? response = openai.Completi…
Nous Research AI ▷ #ctx-length-research (7 messages):
- Exploring Best Solutions for Extending Context Capabilities: @cryptossssun inquired about the best current methods for enhancing context capabilities, stirring a constructive dialogue among members. @_cherrry recommended fine-tuning based on a discussed paper as a viable method.
- Debate on Mistral's Strategy Changes: Raising questions about Mistral Instruct v0.2, @dreamgen highlighted the disabling of the sliding window technique in favor of scaling rope_theta, sharing the config.json to underline the shift. This prompted speculation on whether sliding window techniques were less effective for long contexts.
- MistralLite Mimics Mistral Strategy: Further inquiries by @dreamgen led to observations that amazon/MistralLite employs a similar approach to Mistral regarding context window tactics, keeping the discussion focused on evolving model configurations.
- Impressive Context Window Extension Achievements: @stellaathena showcased an extraordinary feat of extending LLaMA-2-7B-Chat's context window to 16,384 with minimal samples and training steps. This method boasts remarkable efficiency, drawing amazed reactions.
- SelfExtend Suggested as Fine-Tune-Free Option: When @cryptossssun asked for advice on extending context capabilities, @leontello pitched SelfExtend as an innovative solution for those opting not to fine-tune. This introduces another angle to the ongoing discussion of enhancing model performance without intensive training.
Links mentioned:
config.json · mistralai/Mistral-7B-Instruct-v0.2 at main: no description found
Nous Research AI ▷ #off-topic (5 messages):
- Polarized Effects of Technology: @ldj shared the view that advancement in technology might polarize society further, with the most incapable becoming more degenerate and the most capable engaging in even more self-improvement. This reflects the ongoing discussion about the socio-technological impact on different demographic strata.
- Unexpected Laughter with Swimming Cat: A sudden humorous turn was introduced by @Error.PDF with a Cat Swimming GIF, lightening the mood amidst more serious tech discussions.
- Twitter's Mysterious Post Acceleration: @fullstack6209 observed a significant increase in the rate of new posts appearing on Twitter, from 2-3 every 10 minutes to about 70 a minute, raising questions about changes in the platform's content delivery algorithms.
- Speculations on Twitter Slowing Down AI: Following up, @fullstack6209 shared suspicions about Twitter intentionally slowing down AI to manage or control content flow, reflecting concerns about the interaction between social media platforms and artificial intelligence technologies.
- One-Click Quantization of LLMs to GGUF: @pradeep1148 shared a YouTube video demonstrating how to easily quantize any hf LLMs to GGUF format in a single click, marking significant progress in the field of machine learning and neural networks efficiency.
Links mentioned:
- Cat Swimming GIF - Cat Swimming Poopsie - Discover & Share GIFs: Click to view the GIF
- AutoGGUF Quantize LLMs in GGUF format in one click.: Quantize any hf LLMS to GGUF format using the notebook provided by Maxim Labonne #llms #ml #ai #neuralnetworks #deeplearning #gguf https://colab.research.googl…
Nous Research AI ▷ #benchmarks-log (2 messages):
- Benchmarking Request for Everyone Coder 33B Base - GGUF: @231912337869635584 was asked by @benxh to conduct a human eval benchmark on the Everyone Coder 33B Base - GGUF model. This model, created by rombo dawg, was quantised using Massed Compute and supports the new GGUF format, a replacement for GGML.
- Interest in Hermes Mixtral Performance: @teknium expressed a desire to see performance benchmarks for Hermes Mixtral but provided no further details or links.
Links mentioned:
TheBloke/Everyone-Coder-33B-Base-GGUF · Hugging Face: no description found
Nous Research AI ▷ #interesting-links (2 messages):
- OpenAI Announces New Embeddings Models and API Updates: @tsunemoto shared a link detailing OpenAI's launch of new embedding models, alongside updates to the GPT-4 Turbo and moderation models. There will be lower pricing for GPT-3.5 Turbo, new API usage management tools, and a commitment that data sent to OpenAI's API will not be used for training their models.
- Introducing Genie for Content-grounded Data Generation: @metaldragon01 highlighted a recently published paper on Genie, a novel method aiming to overcome the shortage of high-quality data for content-grounded generation tasks. Genie uses a three-stage process (Content Preparation, Generation, and a Filtering mechanism) to create task-specific examples that are natural and high-quality, notably for Long-Form Question-Answering (LFQA), summarization, and information extraction.
Links mentioned:
- Genie: Achieving Human Parity in Content-Grounded Datasets Generation: The lack of high-quality data for content-grounded generation tasks has been identified as a major obstacle to advancing these tasks. To address this gap, we propose Genie, a novel method for automati…
- New embedding models and API updates: We are launching a new generation of embedding models, new GPT-4 Turbo and moderation models, new API usage management tools, and soon, lower pricing on GPT-3.5 Turbo.
Nous Research AI ▷ #general (361 messages🔥🔥):
- GPT-2 Inference on WebGL Explored: @_3sphere and @n8programs discussed implementing GPT-2 inference in WebGL, with n8programs sharing a detailed kernel for vector similarity computation using ThreeJS and GLSL. The conversation highlighted the potential for machine learning models to run directly in browser graphics pipelines.
- LLaMA Cloning Attempts and Discussion: @balnazzar3047 and @euclaise discussed creating a Mixtral model similar to LLaMA but encountered challenges understanding the model-building process. Helpful resources and explanations were shared, including a Colab notebook by @qnguyen3 for initializing new models.
- Debate on Model Efficiency: @carsonpoole shared updates on fine-tuning the phi2 model with RMSNorm and plans for further optimization, hinting at potential efficiency gains over traditional Transformer models. The approach sparked interest in comparing parallel residual architectures for inference and training speed.
- Word2Vec on Steroids with User Feedback: @everyoneisgross innovated on Word2Vec by introducing a method to refine and expand corpus data based on user inputs, utilizing Mistral Instruct for additional context. This approach, though simple, was praised for its effectiveness and potential in lightweight NLP tasks.
- Harnessing GPU for Advanced ML Computations: @n8programs detailed the use of WebGL for machine learning, explaining how 3D textures and spatial thinking enable computations beyond typical GPU limits. This method, seen as "haxing on the back of video games," showcases creative leveraging of gaming hardware for scientific computing.
Links mentioned:
- EvalPlus Leaderboard: no description found
- Human vs. Machine: Intelligence per Watt: Contemplating the possibility that machines won't win everywhere all at once
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces: Foundation models, now powering most of the exciting applications in deep learning, are almost universally based on the Transformer architecture and its core attention module. Many…
- Google Colaboratory: no description found
- Recommendations on new 2 x RTX 3090 setup: Hi, I'm selling my old GTX 1080 and upgrading my deep learning server with a new RTX 3090. I'm also contemplating adding one more RTX 3090 later next year. I've read from multiple sources blower-sty…
- Mixtral: no description found
- 🤗 Transformers: no description found
- GitHub - cg123/mergekit at mixtral: Tools for merging pretrained large language models. - GitHub - cg123/mergekit at mixtral
- Growing Living Rat Neurons To Play… DOOM?: Head to https://squarespace.com/thethoughtemporium to save 10% off your first purchase of a website or domain using code: thethoughtemporium …
- Accelerating Systems with Real-time AI Solutions - Groq: Powered by hardware and software.USA-based. Available now.
- EVGA SuperNOVA 1600 P2, 80+ PLATINUM 1600W, Fully Modular, EVGA ECO Mode, 10 Year Warranty, Includes FREE Power On Self Tester Power Supply 220-P2-1600-X1: Ready for 4th Generation Intel Core Processors (C6/C7 Idle Mode) Introducing the EVGA SuperNOVA 1600 P2 power supply. This power supply raises the bar with 1600W of continuous power delivery and 92% …
- Designs Beyond The Reticle Limit: Chips are hitting technical and economic obstacles, but that is barely slowing the rate of advancement in design size and complexity.
Nous Research AI ▷ #ask-about-llms (35 messages🔥):
- Specs for Finetuning CodeLlama 34B Debated: @ganymede123 inquired about the ideal workstation specs for finetuning CodeLlama 34B, considering 4xA6000 GPUs. @teknium responded that this setup would only suffice for a qlora, suggesting a full DGX might be necessary for complete finetuning.
- Troubleshooting T5 Fine-Tuning Issues: @maxpappa discussed problems with aligning a fine-tuned version of T5, experiencing deterministic outputs and steady reward-accuracies. Suggestions from @locutusque and @carsonpoole included avoiding paged 8bit Adam, considering numerical instability in T5, and clamping infs, especially in the encoder.
- Exploring LLMs for Offensive Cyber: @useewhynot sought recommendations for LLMs suitable for offensive cyber operations or Capture The Flag (CTF) competitions. @kenakafrosty and @georgejrjrjr recommended WhiteRabbitNeo models available on HuggingFace, highlighting their focus on such tasks.
- Libraries for LLM Fine-Tuning: In response to @moconna asking about preferred libraries for LLM fine-tuning, @kenakafrosty mentioned using trl and expressed a preference for axolotl, indicating its benefits for certain applications.
- Choosing the Best Coding LLM and Fine-Tuning Advice: @findmyke queried about the current best LLM for coding, with @.ben.com directing them to the EvalPlus leaderboard for comparison. Moreover, @moconna inquired about essential hyperparameters for fine-tuning Mistral with Llama Factory, indicating a focus on learning rate and seeking suggestions for defaults or templates.
Links mentioned:
- EvalPlus Leaderboard: no description found
- WhiteRabbitNeo/WhiteRabbitNeo-33B-v1 · Hugging Face: no description found
- WhiteRabbitNeo/WhiteRabbitNeo-13B-v1 · Hugging Face: no description found
- WhiteRabbitNeo - A co-pilot for your cybersecurity journey: no description found
Nous Research AI ā· #project-obsidian (3 messages):
- Seeking Simple 3b Obsidian Python Script: vic49. is in search of a simple Python script that uses the transformers library, with remote code execution enabled, for working with the 3b Obsidian model (a minimal sketch follows below).
- Code Refactoring in Progress: qnguyen3 responded by promising a refactor of the entire codebase to ensure compatibility with the latest llava repo, addressing vic49.'s request.
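For reference, a minimal sketch of the kind of script requested above: loading a model with remote code execution enabled via transformers. The repo id is an assumption (Obsidian 3B is hosted under the NousResearch org on the Hub), and since Obsidian is a LLaVA-style multimodal model its custom code may also expect image inputs, so treat this text-only example as a starting point only.

```python
# Minimal sketch of loading a Hub model whose implementation ships as custom code.
# Repo id and generation settings are assumptions for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "NousResearch/Obsidian-3B-V0.5"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,  # "remote code execution enabled"
    device_map="auto",
)

inputs = tokenizer("Describe what an axolotl is.", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```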
OpenAccess AI Collective (axolotl) ā· #general (219 messagesš„š„):
-
Fine-tuning Mistral for Norwegian:
@henriklied
is fine-tuning Mistral on a collection of 100k articles for title generation in Norwegian, with specific model parameters shared. However,@le_mess
suggests stopping at 4 epochs to avoid overfitting and advises that 10 epochs might be too many iterations for a limited dataset. -
shareGPT Shortcomings in Handling Long Conversations:
@c.gato
raises concerns about shareGPTās handling of conversations that exceed a predefined context length, indicating that long conversations are dropped rather than trimmed to maintain context length.@le_mess
confirms this behavior is by design in Axolotl for anything other than completion tasks. -
Discussion on Effective Chat Representations: @suikamelon and @c.gato debate the efficacy of ChatML versus raw chat formats for managing conversation contexts in AI models, with preferences varying based on token efficiency and flexibility (a ChatML formatting sketch follows this list). Additionally, @dreamgen and @c.gato discuss the potential benefits of alternative representations like BlockML and IRC formatting.
-
Training Model Considerations:
@dreamgen
inquires about single-GPU performance for H100 vs A100 in the context of training models like QLoRA, with@c.gato
noting the cost-efficiency of H100 for a specific task. Discussions also touched on changing rope theta and its relation to scaling techniques in model training. -
Challenges with Deploying AI Models:
@dangfutures
seeks assistance for hosting their model, highlighting difficulties with serverless deployment and discussing potential solutions. Conversations highlight the complexities involved in selecting and deploying the appropriate hosting solutions for AI models.
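As a reference point for the ChatML-versus-raw-format debate above, here is a tiny sketch of rendering a conversation in ChatML; the roles and sample turns are made up.

```python
# Render a list of chat turns into ChatML, the format discussed above.
def to_chatml(messages):
    """messages: list of {"role": ..., "content": ...} dicts."""
    rendered = ""
    for m in messages:
        rendered += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    rendered += "<|im_start|>assistant\n"  # cue the model to answer next
    return rendered

conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Generate a short Norwegian title for this article."},
]
print(to_chatml(conversation))
```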
Links mentioned:
- Pullrequest GIF - Pullrequest - Discover & Share GIFs: Click to view the GIF
- axolotl/deepspeed_configs/zero3.json at main Ā· OpenAccess-AI-Collective/axolotl: Go ahead and axolotl questions. Contribute to OpenAccess-AI-Collective/axolotl development by creating an account on GitHub.
- more checks and fixes for deepspeed and fsdp by winglian Ā· Pull Request #1208 Ā· OpenAccess-AI-Collective/axolotl: no description found
OpenAccess AI Collective (axolotl) ā· #axolotl-dev (1 messages):
-
Active Development Leads to Message Deletion:
@gahdnah
deleted their previous message after noticing the area of discussion is still under active development, following a check of the latest commits. No specific details about the development area or commits were provided. -
A Grant Celebration: The axolotl-dev community celebrated receiving a grant with enthusiastic emojis. The message contained no details on the grantās purpose or the grantor.
OpenAccess AI Collective (axolotl) ā· #general-help (47 messagesš„):
- Solutions for Training QLoRA Without Bitsandbytes: @matanvetzler inquired about training QLoRA models without bitsandbytes for compatibility with vLLM. @le_mess and @stefangliga suggest alternatives such as using fp16 or AutoAWQ for quantization, and merging trained QLoRA adapters into the base model (see the merging sketch after this list). Additionally, @stefangliga shared a GitHub link for merging and a tweet by Tim Dettmers advocating for model quantization before merging.
-
Discussion on Model Merging Strategies:
@nanobitz
and@c.gato
engaged in discussions about the efficacy and potential issues of merging models with different quantization adapters, indicating a slight improvement in performance but also highlighting complexities in merging multiple adapters.@c.gato
mentions a 1-2% improvement but advises caution due to potential issues. -
Training and Evaluating Models with Custom Prompts:
@sadaisystems
sought advice on configuring custom prompts and dataset formats for model training. Concerns were raised about unusually low training losses, to which@caseus_
and others suggested this might be expected behavior for deterministic tasks like SQL queries. The conversation emphasizes the importance and challenges of interpreting model training performance. -
- Continuous Pretraining and Benchmark Evaluation on Mistral: Queries about continued pretraining on Mistral and evaluating models during training were raised by @nickbro0355 and @sadaisystems. @caseus_ provided pointers on continued pretraining and introduced benchmark evaluation during training with datasets like dharma-1 on HuggingFace, recommending setting `do_bench_eval: true` and `bench_dataset: pharaouk/dharma-1/dharma_1_full.json` for relative performance checks (a config snippet follows this list).
-
Documentation and Configuration Tips for Axolotl Users: Points were noted on the absence of certain parameters (e.g.,
bench_dataset
) in the main README of axolotl, indicating potential areas for documentation improvement. The discussions underpin the benefits of these configurations for test runs and the iterative nature of model development within the community.
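A minimal sketch of the "merge the trained QLoRA adapter into the base model" route mentioned in the first item above, using peft's `merge_and_unload()`. Paths and the base checkpoint are placeholders, and this simple flow merges into an fp16 base; it does not implement the quantize-before-merging approach advocated in the linked tweet.

```python
# Merge a trained (Q)LoRA adapter back into its base model so the result can be
# served by engines such as vLLM. Paths are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-v0.1"      # example base checkpoint
adapter_dir = "outputs/my-qlora-adapter"   # placeholder adapter path

base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
merged = PeftModel.from_pretrained(base, adapter_dir).merge_and_unload()

merged.save_pretrained("outputs/merged-model")
AutoTokenizer.from_pretrained(base_id).save_pretrained("outputs/merged-model")
```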
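And here is how the benchmark-evaluation keys mentioned above might sit in an axolotl config. Only the two `bench_*` lines come from the discussion; the surrounding keys are illustrative defaults, not recommendations.

```yaml
# Illustrative axolotl config fragment; only the bench_* keys are from the thread.
base_model: mistralai/Mistral-7B-v0.1
sequence_len: 4096
micro_batch_size: 2

do_bench_eval: true
bench_dataset: pharaouk/dharma-1/dharma_1_full.json
```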
Links mentioned:
- pharaouk/dharma-1 at main: no description found
- qlora/qmerge.py at main Ā· jondurbin/qlora: QLoRA: Efficient Finetuning of Quantized LLMs. Contribute to jondurbin/qlora development by creating an account on GitHub.
- GitHub - OpenAccess-AI-Collective/axolotl: Go ahead and axolotl questions: Go ahead and axolotl questions. Contribute to OpenAccess-AI-Collective/axolotl development by creating an account on GitHub.
OpenAccess AI Collective (axolotl) ā· #datasets (7 messages):
-
New DPO Dataset Announced:
dangfutures
shared a link to a new dataset on Hugging Face, used for training the Snorkel model with emphasis on not using external LLM responses, but only prompts from UltraFeedback. The methodology involves generating 5 response variations for each prompt, using LLM for response reranking, and applying Direct Preference Optimization (DPO). -
Mistral 7 gets a Tune-Up:
dangfutures
expressed enthusiasm about Mistral 7 being fine-tuned, suggesting significant improvements or updates to the model. -
ALPCA Numbers Discussed:
dangfutures
initially mentioned a numerical figure in relation to ALPCA, later clarifying it to be a 34 percent figure, which is deemed as an improvement over old GPT-4 metrics. -
Reaction to Discussion:
_dampf
responded to the ongoing discussion with a GIF from Tenor, implying a humorous or expressive reaction to the shared information about datasets and model performance.
Links mentioned:
- Sure Jennifer Lawrence GIF - Sure Jennifer Lawrence The Mocking Jay - Discover & Share GIFs: Click to view the GIF
- snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset Ā· Datasets at Hugging Face: no description found
OpenAccess AI Collective (axolotl) ā· #rlhf (2 messages):
- Seeking Clarity on DPO Training Plots: User noobmaster29 asked if there are any resources available for understanding DPO training plots, implying a need for documentation or guides to interpret data from DPO training runs.
- Troubleshooting DPO Dataset Issues: noobmaster29 inquired about the necessary components of a DPO dataset, specifically whether anything beyond a prompt/input and a chosen/rejected pair is required; they mentioned having issues with dataset processing despite including these columns (a minimal format sketch follows below).
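For anyone hitting the same dataset-processing question, here is a minimal sketch of the column layout conventionally expected for DPO-style training (for example by trl's `DPOTrainer`): one prompt plus a chosen and a rejected completion per row. The example rows are invented.

```python
# Conventional DPO dataset layout: prompt / chosen / rejected columns.
from datasets import Dataset

rows = [
    {
        "prompt": "Explain what a LoRA adapter is in one sentence.",
        "chosen": "A LoRA adapter is a small set of low-rank weight updates trained on top of a frozen base model.",
        "rejected": "It is a kind of GPU.",
    },
]

dpo_dataset = Dataset.from_list(rows)
print(dpo_dataset.column_names)  # ['prompt', 'chosen', 'rejected']
```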
OpenAccess AI Collective (axolotl) ā· #community-showcase (1 messages):
pradeep1148: https://www.youtube.com/watch?v=wlPxEq_Mtkc
LM Studio ā· #š¬-general (118 messagesš„š„):
-
Proxy Support Inquiry and Frontier Exploration: Users expressed a need for proxy settings in LM Studio due to inability to use the model search function, particularly
@laooopooo_02864
mentioned this limitation in regions where HuggingFace is blocked. Wildcat_aurora shared a GitHub link to LM_Chat_TTS_FrontEnd, facilitating interface interaction with LM Studio models through text-to-speech functionality from a mobile device. -
LM Studio and Model Loading Challenges: Various users encountered issues with LM Studio, such as
@xmiruso
and@bagua
facing challenges with GPU detection and model loading errors, respectively. Solutions ranged from ensuring GPU offload settings to switching to alternative model versions. -
Interactive Learning Environments in the Classroom: @fabguy provided insights on setting up LM Studio for classroom use, suggesting the necessity of a web interface on a server and an OpenAI-compatible client for student interaction (a client sketch follows this list). They also recommended checking out an Awesome LLM Web UI GitHub repository for frontend options.
-
Parallel Model Operations and Support Concerns:
@therealtrebor
inquired about running multiple models in parallel, revealing through testing that itās beneficial for specific use cases, while heyitsyorkie indicated this requires running multiple LM Studio instances. Users, including@Leo - Moon Boyz šššø
, sought ways to contact support for troubleshooting, directed to use specific error-reporting channels. -
Silly Tavern Misconceptions and Data Privacy Discussions: The community cleared up misconceptions about Silly Tavern costing money, with
@ptable
and@technot80
confirming its free status. Discussions on ensuring LM Studioās data privacy involved suggestions like operating in a docker or VM environment with no external internet access, with@agcobra1
seeking alternative assurances and@bigdave7541
suggesting the use of wireshark for validation.
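As a concrete illustration of the "OpenAI-compatible client" approach suggested for classroom setups above, here is a minimal sketch that points the official `openai` Python client at LM Studio's local server. The port (1234 by default) and the model identifier are assumptions; the Local Server tab in LM Studio shows the actual values.

```python
# Talk to LM Studio's local OpenAI-compatible server with the standard openai client.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")  # key is ignored locally

response = client.chat.completions.create(
    model="local-model",  # LM Studio serves whichever model is currently loaded
    messages=[{"role": "user", "content": "Give me one fun fact about axolotls."}],
)
print(response.choices[0].message.content)
```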
Links mentioned:
- GitHub - JShollaj/awesome-llm-web-ui: A curated list of awesome Large Language Model (LLM) Web User Interfaces.: A curated list of awesome Large Language Model (LLM) Web User Interfaces. - GitHub - JShollaj/awesome-llm-web-ui: A curated list of awesome Large Language Model (LLM) Web User Interfaces.
- GitHub - FriendofAI/LM_Chat_TTS_FrontEnd.html: LM_Chat_TTS_FrontEnd is a simple yet powerful interface for interacting with LM Studio models using text-to-speech functionality. This project is designed to be lightweight and user-friendly, making it suitable for a wide range of users interested in exploring voice interactions with AI models.: LM_Chat_TTS_FrontEnd is a simple yet powerful interface for interacting with LM Studio models using text-to-speech functionality. This project is designed to be lightweight and user-friendly, makinā¦
LM Studio ā· #š¤-models-discussion-chat (54 messagesš„):
-
C++ Redistributables Solve Model Loading Error:
@rparada
encountered repeated errors when loading various models like Stable Code, Deepseek, and Codellama. The issue was resolved following@heyitsyorkie
ās advice to update C++ redistributables, demonstrating the importance of keeping system components up-to-date for model compatibility. -
No Multimodal Model Rivals GPT-4 Yet: In response to
@mudf00t
asking for a multimodal model comparable to GPT-4,@heyitsyorkie
clarified that there are no comprehensive models yet that combine vision and text generation, highlighting the current limitations in the AI model landscape. -
Azure OpenAI Service for GPT-4 Usage Discussed:
@mickael6102
shared their companyās use of GPT-4 via Azureās OpenAI service, sparking a conversation on cloud versus local model deployment led by@vbwyrde
. Concerns around data privacy, costs, and dependency on third-party cloud services were major discussion points. -
InternLM Sparks Interest:
@vbwyrde
highlighted a new model called InternLM, noting its claims of an 11 percent better reasoning ability than GPT-4 and advanced function-calling capabilities. This incited interest as an alternative to other models, with@mickael6102
considering it for future projects. -
Debate Over Corporate Open Source Contributions: Discussion led by
@vbwyrde
,@.gumdro
, and@fabguy
pondered the motivations behind Metaās release of LLaMA2, speculating on strategic benefits such as creating standards and undercutting competitors. This conversation raises questions about the relationship between open source contributions and corporate strategy in the AI field.
Links mentioned:
- Mark Zuckerberg Adjust GIF - Mark Zuckerberg Adjust Facebook - Discover & Share GIFs: Click to view the GIF
- internlm (InternLM): no description found
LM Studio ā· #š§ -feedback (4 messages):
- Bug Alert in MoE Model Setting:
@msz_mgs
reported a bug where changing from a 4X MoE model to a 2X MoE model withnum_experts_used=4
generates an error message that prompts a restart of the application. This seems to cause an inability to adjust the setting without restarting. - Investigation Initiated on MoE Model Issue: In response to
@msz_mgs
ās report,@yagilb
acknowledged the problem and requested more information about the 2x MoE and 4x MoE models involved to aid in resolving the issue. - Clarification Offered on MoE Model Configuration:
@dagbs
provided a tip suggesting that for a 4X MoE model, the number of experts intended to be set is 2, offering insight into possibly intended configurations. - Performance Slowdown on 0.2.11 Update:
@golangorgohome
experiences significant performance issues with the 0.2.11 update on Windows 11, including delayed search icon response and slow search results, despite having a high-speed internet connection.
LM Studio ā· #š-hardware-discussion (11 messagesš„):
- GPU Preferences for Best Price/RAM Discussed:
@gitanos
inquires about the efficiency and value of a 4060ti with 16 GB RAM, questioning if itās the best choice regarding price for RAM at the moment.@heyitsyorkie
suggests considering a used 3090 on eBay instead, as itās only slightly pricier in the UK and may offer better value. - Technical Glitches with e0.211:
@madan.pandit
reports that models previously operational have ceased to function, specifically questioning others' experiences with the e0.211 updates. It was noted by @heyitsyorkie that GGML models have been deprecated in recent llama.cpp builds (in favor of the GGUF format), affecting their compatibility. - Memory Issues with Gguf Models Reported:
@madan.pandit
mentions encountering memory insufficiency errors with gguf models, indicating potential compatibility or resource allocation issues following recent updates. - Mac Studio as a Recommended Hardware for LLMs:
@heyitsyorkie
recommends purchasing a fully loaded M2 Mac Studio for running large language models (LLMs), pointing out its compact size, efficiency, and design aesthetics. - Debate Over P40 and M40 GPUs for Multi-GPU Setups: In a discussion about using P40 and M40 GPUs in conjunction,
@docorange88
inquires about their collective performance, while@wildcat_aurora
and@rugg0064
suggest P40s are a good investment, but deem M40s not worthwhile, highlighting plans to acquire P40s soon.
LM Studio ā· #š§Ŗ-beta-releases-chat (47 messagesš„):
-
Choosing the Right LLM is an Art:
@mmonir
outlines criteria for selecting the optimal Large Language Model (LLM), emphasizing the importance of use cases like writing, coding, and summarizing, along with considerations such as quantization, hardware capabilities, and user experience. They also suggest leveraging leaderboards and LLM demos for a more informed decision, and to seek user feedback from various platforms. -
VRAM Misread on New Install:
@mudf00t
reports an issue where LM Studio is displaying 0 VRAM for an RTX 3090 on a new Nobara installation, to which@yagilb
provides a workaround link specifically directed at Nvidia users, not applicable for Mac M2 silicon. -
Troubleshooting Version Compatibility:
@pdg
faces issues with models not working with the final new version of LM Studio, experiencing unusual system behavior on their MacBook M2. They receive assistance from@yagilb
including a link to revert to a previous version, 0.2.10, hinting at potential software regression issues. -
Model Loading Woes and Workarounds: After downgrading LM Studio versions,
@pdg
continues to encounter errors with model loading but finds a temporary fix by initiating interactions with shorter sentences. This workaround allows for longer inputs to be processed successfully later on. -
Context Length and Overflow Policy Insights:
@mattjcly_55150
inquires about the context length settings@pdg
might have used, suggesting adjustments in context overflow policies could resolve issues related to input length limitations. This implies that initial input length exceeding the set context length could be a root cause of the encountered errors.
LM Studio ā· #autogen (1 messages):
- Broken Pin Causes Tears:
sunglasses.emoji
reported that the pinned link for empty strings in autogen studio is broken, leading to a cry for help. They seek guidance on how to do a custom agent class in Autogen Studio.
LM Studio ā· #open-interpreter (10 messagesš„):
-
Consistent Framework Issues with Model Utility:
@pefortin
highlights issues with multiple AI frameworks, including Open Interpreter, memGPT, and crewai, specifically regarding the inability of models to appropriately use available tools. They mention using mid to large models like mixtral8x7B and deepseek coder 33B, dismissing size as the problem. -
Adventures in Model Testing with RTX 3090:
@mudf00t
shares theyāre testing various models, leveraging the power of an RTX 3090 to handle larger models, pointing to the direct experience with hardware advantages in model experimentation. -
API Knowledge Gap Frustrates Development:
@mudf00t
expresses frustration over models, including those provided by OpenAI, not being up-to-date with the current API, leading to context issues during app development processes. -
Integrations and Fine-Tuning on the Horizon:
@222gate
notes the discontinuation of memgpt integration and shares plans to fine-tune a Mistral model for specific function calls, suggesting this approach may resolve some operative issues. -
Mistralās Creative Hallucinations:
@mudf00t
observes Mistral creating fictitious directory structures and code, underlining the modelās tendency to hallucinate rather complex outputs beyond the expected operational tasks.
Mistral ā· #general (163 messagesš„š„):
-
GPU Rentals for AI Summarization Efforts:
@mrdragonfox
highlighted that services like runpod, vast, and lambda offer GPUs by the hour for rental, suggesting users leverage these for their AI work. They also mentioned Kaggle as a free resource offering 2 T4 GPUs for 30 hours per week. -
Subscription Issues and Support Contacts:
@glorfsf
experienced issues with changing the usage limit options in their subscription, which@mrdragonfox
identified as a default limit issue and advised contacting [email protected] for assistance. -
Mistralās Use Cases and Integration with Other Frameworks: In a discussion on how Mistral models are utilized,
@ethux
shared use cases around finetuning for customer support automation and mentioned an interesting Embeddings model, showcasing the diverse applications of Mistral models beyond just competing with GPT-4. -
Exploring Mistralās API Limitations and Enhancements: Users discussed difficulties related to API token costs, tokenization questions, and specific AI model performance, with
@mrdragonfox
and others providing insights on how models work and proposing solutions for better model utilization. -
Technical Debates on Model Selection and Optimization: An engaging conversation emerged around the effectiveness of GitHub Copilot, with
@mrdragonfox
and@i_am_dom
debating its underlying technology, showcasing the communityās deep involvement in understanding and optimizing AI model performance.
Links mentioned:
- Reddit - Dive into anything: no description found
- GitHub - turboderp/exllamav2: A fast inference library for running LLMs locally on modern consumer-class GPUs: A fast inference library for running LLMs locally on modern consumer-class GPUs - GitHub - turboderp/exllamav2: A fast inference library for running LLMs locally on modern consumer-class GPUs
- intfloat/e5-mistral-7b-instruct Ā· Hugging Face: no description found
- TuringsSolutions/PFAF750 Ā· Datasets at Hugging Face: no description found
Mistral ā· #ref-implem (9 messagesš„):
- Clarifying Purpose between Training and Finetuning:
@ethux
queries if a discussion is about training or finetuning, indicating the significance of understanding the context when considering system requirements. - Memory Requirements for Datasets:
@ethux
discusses that while specific requirements are unclear, 64GB of memory should suffice for unknown dataset sizes, pointing to the importance of understanding memory needs in relation to dataset size. - Memory Footprint for Mistralās 4bit Inference: According to
@ethux
, Mistral requires at least 26GB for 4bit inference, signaling a significant memory requirement for inference tasks. - Optimization Tips for Model Efficiency:
@mrdragonfox
compares exllama and a regular 4 bit transformer, suggesting that Exllama is more efficient in terms of memory usage, therefore highlighting different optimization strategies. - Quantization Strategies for Reducing Memory Footprint:
@mrdragonfox
proposes using exllamav2 as a quantization strategy over the BnB 4 bit approach, suggesting it as a better alternative for memory efficiency.
Mistral ā· #finetuning (3 messages):
-
Translation Metrics Inadequate for LLM Evaluation:
@adrienbufort
states that BLEU and Rouge metrics, commonly used for translation performance evaluation, are not useful for evaluating large language models (LLMs) or instruction tuning LLMs. -
ELO-like Evaluation Tops for LLM:
@adrienbufort
recommends ELO-like evaluation system, akin to chess rankings, as being closest to human preference for LLM evaluation, although it requires human input. They describe it as the best available method. -
Multiple Choice and LLM Evaluation Techniques: They also mention MMLU (Multiple Choice Question Evaluation) and Alpaca Eval (Alpaca Evaluation) as other methods for LLM evaluation. MMLU allows for clear right or wrong answers, while Alpaca Eval involves one LLM evaluating anotherās answers.
-
Normalized Alpaca Eval Available:
@akshay_1
mentions that a normalized version of Alpaca Eval is now available on the market, indicating advancements in methods for LLM evaluation.
Mistral ā· #showcase (1 messages):
- Innovative AI Browser Queries Introduced: User
@sublimatorniq
showcased a tool that allows AI to respond with DOM node references, making browser queries more context-aware. This tool is compatible with MistralAI and aims to enhance the interaction with web content.
Links mentioned:
no title found: no description found
Mistral ā· #random (8 messagesš„):
- New Member Seeks RAG Application Insight:
@xrdg
, a newly joined member from š¬š¹, raised an inquiry about prompt structure for their RAG application and sought for a dedicated support channel. - DSPy Suggested for Prompt Optimization:
@akshay_1
recommended DSPy as a tool to optimize the prompt structure, which was appreciated by@xrdg
. - Details on RAG Stack Provided: Upon further query by
@akshay_1
,@xrdg
disclosed their RAG stack consists of langchain, chroma, and Mistral 7B and shared a link to a guide on prompting Mistral 7B with examples, tips, and relevant reading materials. - RAG Stack Potential for Optimization Highlighted:
@akshay_1
observed that@xrdg
ās RAG stack could be further optimized, inquiring if the project was a hobby project or planned for production.
Links mentioned:
Prompt Engineering Guide: A Comprehensive Overview of Prompt Engineering
Mistral ā· #la-plateforme (35 messagesš„):
-
Early Stopping Predicament Persists: Users
@digitalphotographer
and@sublimatorniq
exchanged experiences on an early stopping issue with Mistral, noting the problem occurs even without control tokens in square brackets.@digitalphotographer
further clarified that their prompts are plain strings, lacking control sequences or special characters. -
Technical Support Suggested for Early Stopping:
@mrdragonfox
advised@digitalphotographer
to contact Mistral support ([email protected]) and provide logs/full examples for further assistance.@sophiamyang
also offered help, asking for reproducible examples for the team to investigate. -
Billing Page Bug Detected: User
@ewanhc
and others discovered a bug where monthly usage limits on the billing page revert to €150 upon refresh, despite attempts to set a different limit. @ethux
and@fersingb
confirmed experiencing the same issue, and@sophiamyang
advised contacting support with details. -
Mistral API Hosting Location Inquiry:
@loicboutet
inquired about where the Mistral API is hosted. It was clarified by@mrdragonfox
and further confirmed by@loicboutet
through the privacy page that the hosting is done in Europe, specifically on Azure in Sweden. -
Bug Identified with "max_tokens" Parameter: @mrxavierx reported a bug with the Mistral API `/completion` endpoint: when the `max_tokens` field is set to 1, the call returns a 500 internal server error rather than a one-token response or a meaningful validation error (a repro sketch follows this list). An issue was created for this on GitHub: BUG: API /completion endpoint returns 500 (server error) when sending "max_token" = 1.
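For anyone trying to reproduce the `max_tokens=1` report above, here is a hedged sketch using the `mistralai` Python client as it existed around the time of these messages (`MistralClient` / `ChatMessage`). The model name is illustrative, and the 500 error is the behaviour described in the issue, not something this snippet guarantees.

```python
# Attempted repro of the reported max_tokens=1 failure against the Mistral platform API.
import os
from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage

client = MistralClient(api_key=os.environ["MISTRAL_API_KEY"])

try:
    resp = client.chat(
        model="mistral-tiny",  # illustrative model choice
        messages=[ChatMessage(role="user", content="Say hi")],
        max_tokens=1,  # reported trigger: expected a 1-token reply, got a 500 instead
    )
    print(resp.choices[0].message.content)
except Exception as exc:  # the issue reports an internal server error surfacing here
    print("API error:", exc)
```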
Links mentioned:
- BUG: API /completion endpoint returns 500 (server error) when sending āmax_tokenā = 1 Ā· Issue #122 Ā· mistralai/mistral-src: While I was playing with the API endpoint /completion I found out a bug with the āmax_tokensā body field when itās set to 1. Instead of returning 1 token response or a validation error, ā¦
- no title found: no description found
Eleuther ā· #general (29 messagesš„):
-
Seeking Fine-tuning Code for Metaās SAM:
@the_alt_man
asked for a fine-tuning codebase for Metaās SAM (Segment Anything) model but ended up using AutoGluon due to the lack of such functionality in the original codebase. They highlighted that AutoGluon uses Lightning, which works for GPUs but not for TPUs. -
Exploring Multi-node Training Without Infiniband:
@elyxlz
inquired about the feasibility of multi-node training without Infiniband, contemplating a merging strategy every few training steps. The discussion led to a shared paper on distributed optimization (DiLoCo) that potentially addresses this concern. -
The Pile Dataset Access Quest:
@sk5544
sought guidance on accessing The Pile dataset, being directed by@stellaathena
and@elyxlz
through private discussions and links, which highlights a community ready to support information sharing. -
Analog Clock Dataset Challenge:
@stellaathena
proposed a creative project to@pinconefish
, suggesting the creation of a dataset featuring analog clocks showing diverse times for a study in out-of-domain generalization. The particularity of the dataset focusing on clocks showing 10:10 piqued interest as a starting point for training a text-to-image (T2I) model. -
Exploring Training Quality and Data Defects Through Synthetic Datasets: Following the analog clock discussion,
@wonkothesensible
proposed exploring potential modelsā mode collapses through checkpoints analysis, suggesting the creation of a synthetic dataset of clocks showing various times for broader training and defect identification purposes.
Links mentioned:
DiLoCo: Distributed Low-Communication Training of Language Models: Large language models (LLM) have become a critical component in many applications of machine learning. However, standard approaches to training LLM require a large number of tightly interconnected accā¦
Eleuther ā· #research (125 messagesš„š„):
-
Byte-Level Transformers Spark Debate: A discussion initiated by
@main.ai
regarding the efficiency of byte-level transformers like ByT5, sparked further conversation with@the_random_lurker
about unfair comparisons in sequence length and the real-world benefits of additional context in models like Mamba. The topic touched upon in the MambaByte paper transitioned into the effectiveness and fairness of comparing token to byte sequence lengths. -
Exploring Proxy-Tuning for LLMs:
@digthatdata
shared a new approach called proxy-tuning which proposes a lightweight decoding-time algorithm designed to tune large language models (LLMs) using a smaller model as a proxy. This method aims to close the performance gap between directly tuned models and their black-box counterparts without direct access to model weights. -
Mambaās Rejection Raises Eyebrows: Discussion around the rejection of the Mamba paper with scores of 8/8/6/3 out of 10 led
@stellaathena
and others to critique the meta review process. The rejection has sparked dialogue about the implications for the academic community and the pursuit of state of the art (SOTA) benchmarks. -
Chess Engine Superiority and AI Advances: Talks revolved around the current SOTA in various fields, including chess where
@clockrelativity2003
noted Stockfish remains a top contender.@stellaathena
challenged members to name recent SOTAs across multiple domains, highlighting gaps in awareness within the ML community regarding advancements. -
AI-Driven Evolution of Chess: A blog post discussed by
@clockrelativity2003
imagines chess in 2024, redefined by AI advancements, featuring new engines like Pandora and augmented by digital tools for strategic enhancements. This conversation piece reflects on the interface between traditional games and next-gen AI technology.
Links mentioned:
- Transformers and Cortical Waves: Encoders for Pulling In Context Across Time: The capabilities of transformer networks such as ChatGPT and other Large Language Models (LLMs) have captured the worldās attention. The crucial computational mechanism underlying their performancā¦
- MVDream: Multi-view Diffusion for 3D Generation: We introduce MVDream, a multi-view diffusion model that is able to generate consistent multi-view images from a given text prompt. Learning from both 2D and 3D data, a multi-view diffusion model can aā¦
- MoE-Infinity: Activation-Aware Expert Offloading for Efficient MoE Serving: This paper presents MoE-Infinity, a cost-efficient mixture-of-expert (MoE) serving system that realizes activation-aware expert offloading. MoE-Infinity features sequence-level expert activation traciā¦
- CLARA: Multilingual Contrastive Learning for Audio Representation Acquisition: Multilingual speech processing requires understanding emotions, a task made difficult by limited labelled data. CLARA, minimizes reliance on labelled data, enhancing generalization across languages. Iā¦
- Tuning Language Models by Proxy: Despite the general capabilities of large pretrained language models, they consistently benefit from further adaptation to better achieve desired behaviors. However, tuning these models has become incā¦
- Evaluating the Medical Knowledge of Open LLMs - Part 1 ā MedARC: In this MedARC blog post, we compare generalist and medical domain-specific Large Language Models (LLMs) like GPT-4, Mistral, and Llama, and we evaluate their performance on MultiMedQA tasks for medicā¦
- The Quantum Leap of Checkmate: Chess in the Age of AI The year is 2024: The year is 2024. Robots roam streets, holograms flicker in living rooms, and self-driving cars navigate rush hour with the grace of a seasoned taxi driver. Yet, on a humble wooden board, an ancient dā¦
- | bioRxiv): no description found
- Generalized Biomolecular Modeling and Design with RoseTTAFold All-Atom: Although AlphaFold2 (AF2) and RoseTTAFold (RF) have transformed structural biology by enabling high-accuracy protein structure modeling, they are unable to model covalent modifications or interactionsā¦
Eleuther ā· #lm-thunderdome (2 messages):
- Possible Merge for a Fix:
@hailey_schoelkopf
mentioned considering merging a fix after discovering surprising behavior in their implementation. They plan to try this out when they have a chance. - Adding Weights and Biases Support to lm-evaluation-harness:
@hailey_schoelkopf
shared a GitHub pull request #1339 by@ayulockin
for adding Weights and Biases support to thelm-evaluation-harness
, questioning the best location for thewandb.py
file.
Links mentioned:
feat: Add Weights and Biases support by ayulockin Ā· Pull Request #1339 Ā· EleutherAI/lm-evaluation-harness: In #359 @parambharat did proposed to add support for W&B logging. However it was done before the big refactor that got in. As a user of both lm-evaluation-harness and wandb, I have opened this PR ā¦
Eleuther ā· #gpt-neox-dev (16 messagesš„):
- QLoRA Tuning Confusion Cleared:
@kenakafrosty
inquired about issues tuning neoX 20b with QLoRA but was informed by@stellaathena
that the GPT-NeoX library does not support QLoRA. Kenakafrosty was actually using a combination of trl, transformers, and peft libraries. - Clarifying the Right Help Channels: Following
@stellaathena
ās advice,@kenakafrosty
realized the mistake and acknowledged the confusion. Stellaathena suggested opening an issue on the relevant GitHub for help with those libraries. - Catboy_slim_ Addresses Testing Challenges:
@catboy_slim_
discussed the challenges of testing substantial updates like Python, PyTorch, and CUDA versions. Highlighted the inability to manually test every branch, emphasizing the need for functional tests. - Seeking Solutions for PyTorch Testing Issues: In addressing the challenges with running pytest on torch code,
@catboy_slim_
opened an issue on GitHub regarding tests failing when run with pytest āforked, suggesting this might be a widespread issue beyond their control. - Efforts to Facilitate Validation:
@tastybucketofrice
offered compute access to<@337128969059172353>
and invited@catboy_slim_
to DM for access as well, aiming to support further testing of the changes made. Additionally, responded to concerns about pytest issues by suggesting DeepSpeed as a potential model for successfully integrating CUDA with pytest in forked processes.
Links mentioned:
Tests fail when run with pytest āforked Ā· Issue #1132 Ā· EleutherAI/gpt-neox: Describe the bug When tests are run with pytest āforked per the instructions in /test/README.md, a large number of tests fail with the error: RuntimeError: Cannot re-initialize CUDA in forked subpā¦
Perplexity AI ā· #general (87 messagesš„š„):
-
Perplexity Proās Exclusive Features Unveiled: Users such as
@icelavaman
and@mares1317
discussed the benefits of Perplexity Pro over the regular version, highlighting features like practically unlimited Copilot queries, the ability to attach images and files for exploration with models like Claude 2.1 and GPT-4, and access to powerful AI models. This info was supported with a link to the Perplexity Pro FAQ. -
Privacy Concerns Over Data Retention Stir Discussion:
@emisaurus_hex
and@firesonwires
raised concerns about Perplexityās privacy policy, specifically the retention of search history and personal data until account deletion. Expert@icelavaman
clarified that deleted threads are removed from servers after 30 days, yet users remain cautious about the implications for privacy. -
Clarification Sought on Thread Deletion Policy: Amidst confusion over data retention,
@icelavaman
, a self-identified Perplexity expert, reassured users like@emisaurus_hex
that deleted threads are indeed deleted from servers after 30 days. However, users expressed a desire for clearer privacy policies on the Perplexity website. -
Debating the Ethics of Data Storage: The discussion turned philosophical as users like
@yellephen
pondered the value of aggregated search data, while others like@firesonwires
expressed discomfort with the potential for invasive profiling based on search history. This conversation underscores the complexity of privacy in the digital age. -
Technical Support Queries Engage Community: Users reached out for support on diverse topics, including recovering old bookmarked threads (
@skyhunz
), file upload limits for Pro users (@lukas8a
), and applying Perplexity Pro credit codes (@odobostudio
). The community and Perplexity representatives like@icelavaman
and@danielagmz888
provided guidance, demonstrating the active engagement of the support team within the platform.
Links mentioned:
- What data does Perplexity collect about me?: Explore Perplexityās blog for articles, announcements, product updates, and tips to optimize your experience. Stay informed and make the most of Perplexity.
- What is Perplexity Pro?: Explore Perplexityās blog for articles, announcements, product updates, and tips to optimize your experience. Stay informed and make the most of Perplexity.
- Perplexity Blog: Explore Perplexityās blog for articles, announcements, product updates, and tips to optimize your experience. Stay informed and make the most of Perplexity.
- Perplexity - AI Companion): Ask anything while you browse
Perplexity AI ā· #sharing (4 messages):
- Perplexity Bridges Google Search and OpenAI:
jsudaniel
highlighted the CEO of Perplexityās unique position, merging the capabilities of Google Search and OpenAI. They shared a YouTube video titled āI use Perplexity MORE than Google and ChatGPTā, shedding light on why Perplexity is preferred for content creation over other tools. - Perplexity as a Learning Tool for Smartsheet:
nicknalbach
shared their experience of using Perplexity for learning Smartsheet, mentioning that despite being well-versed in Excel, transitioning to Smartsheet was challenging. They found Perplexity exceptionally helpful in providing answers to every problem encountered, aiding in building their Smartsheet. - Astronomy Teaching Aide via Perplexity:
coloradocomplex
noted using Perplexity to help explain concepts in their astronomy class, showcasing Perplexityās utility in the educational sector. They also shared a link to a specific concept explanation on Perplexity, although the specific concept discussed was not mentioned.
Links mentioned:
I use Perplexity MORE than Google and ChatGPT: Main Takaways From this Video: āI use Perplexity more than ChatGPT, BARD, and Microsoft Copilots for five main reasons, including its use in content creationā¦
Perplexity AI ā· #pplx-api (5 messages):
-
Website version outperforms API: User
@benhirap
mentioned that the website version of a certain tool codes significantly better compared to its API counterpart. -
Curiosity about labs vs. API differences:
@stijntratsaert_01927
is looking for information on the default parameters used by labs to achieve similar results when using the API. -
Double charge issue goes unresolved:
@aiagileguy
detailed an experience with [email protected] regarding a double charge. Despite reaching out more than 1-2 business days ago, they have yet to receive a resolution.
HuggingFace ā· #announcements (3 messages):
- Community Highlights #42 Echoes Interest: @lunarflu pointed out the increase in content creators and posts on HuggingFace, highlighting the platformās focus on Machine Learning and its quieter nature compared to Twitter or LinkedIn. They mentioned the possibility of adding more members to their organization, implying an invitation to those interested in AI and ML. Join the org and explore more.
- Lightweight Open LLM Leaderboard Unveiled: The leaderboard covers various aspects including weight types, precisions, licenses, parameters, architectures, and even specifies exclusions like merged, flagged, or MoE models. Dive into the cosmos-arena.
- LM Interpretability Survey Highlights: H. Luo and L. Speciaās survey on explainability for LLMs categorizes current approaches and discusses their applications, signaling a significant stride towards understanding and leveraging the interpretability of pre-trained Transformer-based LMs. Read the full post.
- Innovative Projects by Content Creators: From the CheXRay space for testing the
/StanfordAIMI/CheXagent-8b
model to generating image embeddings for datasets, the content showcases practical utilities and advancements in the field. Similarly, projects like ZeroGPU for the whisperspeech model, a Python module for adding steering vectors, and the Text2SQL model for DuckDB exemplify the diverse explorations within the community. - AI Alignment with TenyxChat-8x7B-v1: Introduced as a part of the TenyxChat series, this model represents a fusion of preference tuning and advanced fine-tuning techniques to serve as a useful assistant, sporting a score of 8.4 on MT-Bench. Explore TenyxChat.
Links mentioned:
- social-post-explorers (Social Post Explorers): no description found
- Cosmos Arena: no description found
- @gsarti on Hugging Face: "Today's pick in Interpretability & Analysis of LMs: From Understanding to…": no description found
- CheXRay - a Hugging Face Space by Tonic: no description found
- GitHub - TonyAssi/HF-Embed-Images: Generates image embeddings for 🤗 Datasets: Generates image embeddings for 🤗 Datasets. Contribute to TonyAssi/HF-Embed-Images development by creating an account on GitHub.
- @Tonic on Hugging Face: "hey there folks, work in progress, but basically celebrating the release of…": no description found
- not-lain/TunBERT · Hugging Face: no description found
- @mehd-io on Hugging Face: "We just released the first Text2SQL model for DuckDB. You can try it out…": no description found
- [@Tonic on Hugging Face: "Hi there folks, I launched my first competition! Goal: Use AI to…"](https://huggingface.co/posts/Tonic/783827682062088): no description found
- @gsarti on Hugging Face: "Today's pick in Interpretability & Analysis of LMs: Model Editing Can Hurt…": no description found
- ClovenDoug/small_128_all-MiniLM-L6-v2 · Hugging Face: no description found
- Deepfake Detection - a Hugging Face Space by not-lain: no description found
- [@vicgalle on Hugging Face: "Can you merge models of different sizes? Well, yes, if the models are…"](https://huggingface.co/posts/vicgalle/320544784279721): no description found
- tenyx/TenyxChat-8x7B-v1 Ā· Hugging Face: no description found
- AI Lineage Explorer: A Step Towards AI Integrity.: no description found
HuggingFace ā· #general (40 messagesš„):
-
GPU Choices and Model Pretraining Realities:
@sachinkannaujiya1998
seeks advice on which Hugging Face model to use with an RTX 3080 GPU, while@b1gb4ng
contemplates pretraining a 7b parameter model, only to reconsider due to the significant resources required as detailed by@asprtnl_50418
.@asprtnl_50418
suggests fine-tuning existing models as a cost-effective alternative, highlighting LoRA/QLoRA adapters and Unsloth for efficiency. -
Feature Extraction Techniques Explained: @vipitis clarifies that feature extraction often involves creating sequence embeddings with encoder-only models like BERT and its derivatives (a short embedding sketch follows this list). The MTEB leaderboard is suggested for exploring state-of-the-art models.
-
LLM Training and Eval Data Strategy:
@enka55
raises a practical question about splitting new, unique data between training and evaluation sets when teaching a language model about āsoldering,ā sparking advice from@the_aureo
on ensuring data diversity and considering models like RAG to supplement learning. -
Inquiry on HuggingFaceās Free Plan and UI Integrations:
@Ali_k
inquires about the capabilities of Hugging Faceās free plan concerning AI model training. Meanwhile,@Sebastian
seeks experiences withchat-ui
for integrating text-generation WebUI endpoints. -
Seeking Model Reviews Beyond Benchmarks:
@green_eye
voices frustration over navigating model choices based on benchmarks alone, expressing a desire for reviews that provide insight into model strengths, weaknesses, and contextual performance beyond mere numbers.
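To make the "feature extraction as sequence embeddings" point above concrete, here is a short sketch using sentence-transformers with an encoder-only model. The model choice is just an example; the MTEB leaderboard mentioned above is the place to compare alternatives.

```python
# Encode sentences into fixed-size embeddings with an encoder-only model.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
sentences = ["How do I fine-tune a 7B model?", "What GPU do I need for QLoRA?"]

embeddings = model.encode(sentences, normalize_embeddings=True)
print(embeddings.shape)  # (2, 384) for this particular model
```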
Links mentioned:
- Google Colaboratory: no description found
- Supported models and hardware: no description found
- MTEB Leaderboard - a Hugging Face Space by mteb: no description found
- meta-llama/Llama-2-7b Ā· Hugging Face): no description found
HuggingFace ā· #today-im-learning (1 messages):
- The Quest for a Data Set Evaluation Framework:
@rosebei3ngan3g
discussed the lack of comprehensive frameworks for evaluating datasets used in training large language models, questioning the best approach to take for such evaluations. This highlights a gap in current methodologies focusing mainly on model evaluation.
HuggingFace ā· #cool-finds (7 messages):
-
Diving Deep into HuggingFaceās Dataset Cards:
@andysingal
shared an intriguing GitHub project analyzing HuggingFace dataset cards. The project aims for a large-scale analysis of dataset documentations, contributing to the AI communityās understanding of dataset narratives and usage. -
Learning UD and Audio ML Adventures:
@pacificvoltage
embarked on a self-tutorial journey with an introductory book on universal design (UD) learning and found a fascinating Machine Learning Street Talk interview with Chomsky on YouTube, discussing the utilization of Deep Fake technology for interview repair. -
Novel LLM Text Detector Unveiled:
@tea3200
introduced a groundbreaking preprint titled āBinocularsā, which proposes a new large language model (LLM) detector that achieves impressive accuracy at identifying machine-generated text through contrast scoring. This technique showcases over 90% detection rate of generated samples at a negligible false positive rate. -
Flutter Pushes Into AI with ONNX:
@akindelemichael
highlighted a notable GitHub repository aiming to bridge ONNX runtime with Flutter, enabling the integration of ONNX models in Flutter apps across various platforms. This repo complements a growing trend of Flutterās use in AI applications, as noted by@osanseviero
in a follow-up. -
Flutter SDK for HuggingFace Inference APIs: Following the mention of Flutterās expanding role in AI,
@osanseviero
detailed a new project, the Flutter SDK for HuggingFace Inference APIs, which supports NLP APIs and underscores Flutterās potential for cross-platform AI application development. This development marks a significant step towards making AI more accessible and open source.
Links mentioned:
- Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text: Detecting text generated by modern large language models is thought to be hard, as both LLMs and humans can exhibit a wide range of complex behaviors. However, we find that a score based on contrastinā¦
- @shivance on Hugging Face: āHi Community! Iām ecstatic to announce Flutter SDK for HuggingFace Inferenceā¦ā: no description found
- GitHub - gtbluesky/onnxruntime_flutter: A flutter plugin for OnnxRuntime provides an easy, flexible, and fast Dart API to integrate Onnx models in flutter apps across mobile and desktop platforms.: A flutter plugin for OnnxRuntime provides an easy, flexible, and fast Dart API to integrate Onnx models in flutter apps across mobile and desktop platforms. - GitHub - gtbluesky/onnxruntime_flutterā¦
- GitHub - mbzuai-nlp/SemEval2024-task8: SemEval2024-task8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection: SemEval2024-task8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection - GitHub - mbzuai-nlp/SemEval2024-task8: SemEval2024-task8: Multidomain, Multimodel and Multilingual Macā¦
- GitHub - YoungXinyu1802/HuggingFace-Dataset-Card-Analysis: Navigating Dataset Documentations in AI: A Large-Scale Analysis of Dataset Cards on HuggingFace (ICLR 2024): Navigating Dataset Documentations in AI: A Large-Scale Analysis of Dataset Cards on HuggingFace (ICLR 2024) - GitHub - YoungXinyu1802/HuggingFace-Dataset-Card-Analysis: Navigating Dataset Documentaā¦
HuggingFace ā· #i-made-this (12 messagesš„):
- WhisperSpeech launches multilingual TTS on HuggingFace:
@tonic_1
shared a new HuggingFace Space for WhisperSpeech, a demo allowing multi-language text to speech and voice print creation with minimal audio input. Expect more examples to arrive soon in this draft version. - Nemo Model Project Blog Post in the Works: Following
@tonic_1
ās interest in launching a Nemo model project,@not_lain
committed to writing a detailed blog post ASAP. This will cover using containers, providing a much-needed example with details for the community. - CheXRay Demos Facing a Runtime Error:
@tonic_1
reported a runtime error in their HuggingFace Space CheXRay, indicating work in progress on analyzing Chest X-Rays. The error highlights ongoing development in medical imaging AI applications. - Call for Increased Outreach via Community Blog Posts:
@lunarflu
suggests that sharing work through HuggingFace community blog posts could help in increasing reach, pointing@mateomd_dev
towards a potential opportunity for detailed technical sharing. The conversation hints at the growing importance of detailed community-driven content on HuggingFace. - Anticipation Builds for wav2vec2-bert Model Release:
@yehors
announced the publication of a new model, wav2vec2-bert, based on Common Voice 10, set to release tomorrow. This model promises enhancements in voice-based AI technologies.
Links mentioned:
- CheXRay - a Hugging Face Space by Tonic: no description found
- WhisperSpeech - a Hugging Face Space by Tonic: no description found
- blog-explorers (Blog-explorers): no description found
- Hugging Face ā Community Blogs: no description found
HuggingFace ā· #reading-group (3 messages):
- Google's Lumiere Boasts Impressive T2V Capabilities: @fishie22 highlights Google's Lumiere for its groundbreaking approach in T2V (Text-to-Video) models. Using a Space-Time UNet that downsamples the signal in both space and time, Lumiere can generate 80 frames at 16fps, presenting potentially unparalleled temporal consistency in video generation. Here's the comprehensive study detailing their method and results: Read the study.
-
No Rush, Take Your Time!:
@lunarflu
conveys a message of patience and support towards Isamu, emphasizing a non-pressured, encouraging community vibe. -
Medium Article Earns Praise for Benchmarking:
@starsupernova
shared their enthusiasm for a Medium article related to benchmarking, describing it as āSuper greatā. Details on the contents of the article or its particular focus were not provided.
Links mentioned:
Lumiere: A Space-Time Diffusion Model for Video Generation: We introduce Lumiere ā a text-to-video diffusion model designed for synthesizing videos that portray realistic, diverse and coherent motion ā a pivotal challenge in video synthesis. To this end, we ā¦
HuggingFace ā· #diffusion-discussions (1 messages):
spikespiegel5112: How to load LoRA model in local?
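In case it helps, a minimal sketch of one common answer to that question: loading a LoRA from a local file onto a diffusers pipeline with `load_lora_weights()`. The base checkpoint, directory, and file name below are placeholders.

```python
# Load a locally stored LoRA onto a Stable Diffusion pipeline with diffusers.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Point at a local directory (or file) containing the LoRA weights.
pipe.load_lora_weights("./my_loras", weight_name="my_style_lora.safetensors")

image = pipe("a watercolor painting of an axolotl", num_inference_steps=30).images[0]
image.save("axolotl.png")
```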
HuggingFace ā· #computer-vision (5 messages):
-
Exploring Advanced LMM Techniques: User
besiktas
inquired whether there were specific reasons for choosing idefics/flamingo resampler/cross-attention over linear projection/pretrained vision encoder in the training of a new LMM. No direct response was provided in the chat. -
Swift API for Vision AI Introduced:
ahmed3ibrahim
shared his experience using the Swift API for vision AI, noting its ability to handle multiple images in one request. He provided a link to the API: Gemini Pro Vision AI and highlighted its features including 8.5/10 popularity, 5,846ms latency, 99% service level, and 100% health check. -
Inquiry About CVPR2024 Submissions: User
iloveh8
asked how to access all papers (both accepted and rejected) submitted for CVPR2024, and also inquired if anyone in the chat had submitted a paper. There were no responses provided in the chat regarding this question.
Links mentioned:
Gemini Pro Vision AI API Documentation (swift-api-swift-api-default) | RapidAPI: no description found
HuggingFace ā· #NLP (15 messagesš„):
- TorToiSe TTS Achieves New Heights in Quality:
@mr_nilq
highlighted TorToiSe TTS, available via Coqui, for its exceptional quality and consistency, despite being hindered by slow performance due to its composite architecture. A modified version achieving a 5x speed increase can be found here. - Choosing the Right Tools for Training AI on QA Binaries:
@ysk.dev
is exploring options for training an AI with approximately 10,000 question-and-answer pairs, contemplating between Amazon Lex and VDS. Concerns were raised about the adequacy of Colab Pro Plus for handling long answers and queries about suitable machine specifications for running the backend server. - Troubleshooting
TFTrainer
ImportError in Transformers:@srovnbh
faced an ImportError withTFTrainer
from thetransformers
package. Attempts to resolve the issue by switching between versions4.36.2
and4.37.1
oftransformers
proved futile, even with community assistance. - Upcoming Talk on Trusting āBlack Boxā Models:
@vipitis
shared a link to an upcoming talk about the trustworthiness of evaluating āblack boxā models, which are not openly accessible for inspection. The talk details are available here. - Compatibility Issues with Bits and Bytes on Windows:
@kingpoki
encountered issues using certain functionalities, ultimately discovering that the root cause was incompatibility with Windows. This serves as a reminder of the operating systemās impact on software applicability and functionality.
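One likely explanation for the ImportError above is that `TFTrainer` was deprecated (and later dropped) in favor of plain Keras training; if so, the Keras-native path below is the usual replacement. This is a generic sketch with placeholder model, data, and hyperparameters, not the exact setup from the discussion.

```python
# Keras-native fine-tuning with a TF transformers model, instead of TFTrainer.
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = TFAutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

texts = ["great product", "terrible experience"]  # toy data
labels = [1, 0]
enc = tokenizer(texts, padding=True, truncation=True, return_tensors="tf")

dataset = tf.data.Dataset.from_tensor_slices((dict(enc), labels)).batch(2)

model.compile(optimizer=tf.keras.optimizers.Adam(5e-5))  # model supplies its own loss
model.fit(dataset, epochs=1)
```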
Links mentioned:
talks.cam : Replicating and auditing black-box Language Models.: no description found
HuggingFace ā· #gradio-announcements (1 messages):
- Gradio 4.16 Released with Exciting Features: @abidlabs announced the release of gradio 4.16, highlighting major updates including native support for Polars DataFrames, the ability to use the Gallery component as an input, faster streaming for low-latency chatbots, and auto-generated docs for custom components (a streaming chatbot sketch follows below). This major update aims at enhancing the Gradio experience, and more details can be found in the changelog.
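As a small illustration of the low-latency streaming chatbots mentioned in the release notes, here is a minimal Gradio sketch where a generator function yields partial replies and `gr.ChatInterface` streams them; the echo "model" is a stand-in for a real LLM call.

```python
# Streaming chatbot sketch: each yield updates the chat window incrementally.
import time
import gradio as gr

def stream_reply(message, history):
    reply = f"You said: {message}"
    for i in range(1, len(reply) + 1):
        time.sleep(0.02)   # simulate token-by-token generation
        yield reply[:i]    # partial response streamed to the UI

gr.ChatInterface(stream_reply).launch()
```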
Links mentioned:
gradio/CHANGELOG.md at main Ā· gradio-app/gradio: Build and share delightful machine learning apps, all in Python. š Star to support our work! - gradio-app/gradio
LAION ā· #general (47 messagesš„):
- LAION2b-en Aesthetics Scores Unavailable:
@ppwwyyxx
asked where to download the LAION2b-en aesthetics scores, linking to a Hugging Face dataset that was disabled on the authorās request.@chad_in_the_house
confirmed the dataset is currently down and advised checking announcements for further updates. - The Quest for a Better Open Source Voice Chat Interface:
@jpcl_
shared news about releasing a demo for a complete voice chat interface utilizing Whisper and WhisperSpeech alongside an Open Source LLM, aiming for lower latency and a more natural conversation flow. They expressed a desire to improve the current LLM used (Dolphin 2.6 Phi-2) and invited collaboration to enhance the project, with details posted on Hacker News. - Looking for Teammates for VA's AI Tech Sprint:
@ninjaa2377
is seeking individuals or a small team to join for the VA's AI Tech Sprint, with the challenge focusing on Ambient Dictation for Clinical Encounter Notes, offering a $300K first prize. The competition aims to push AI capabilities in healthcare, and participants must be U.S. persons (AI Tech Sprint Challenge). - Debate Over Copyright Infringement and AI Models:
@pseudoterminalx
discussed operating in a copyright infringement haven, critiquing entities that underestimate local government autonomy and resilience to foreign influence. They shared insights into local practices, including the use of pirated U.S. channels by cable companies, without mentioning a specific location. - Discussion on Utilization of LLMs Beyond Entertainment:
@SegmentationFault
lamented the prevalent use of local LLMs for entertainment rather than productive means, sparking a discussion on the dominance of OpenAI and the desire for more practical applications. The conversation touched on historical human behavior and the future of LLM usage.
Links mentioned:
- no title found: no description found
- laion/laion2B-en-aesthetic at main: no description found
- Reddit - Dive into anything: no description found
- European Commission šŖšŗ on Instagram: āNice try Fluffy, but indeed you got the news right! Today we presented measures to allow European AI start-ups and SMEs to train their model using our High-Performance Computingās capacity.: 87K likes, 672 comments - europeancommission on January 24, 2024: āNice try Fluffy, but indeed you got the news right! Today we presented measures to allow Europeā¦ā
- Challenge.Gov: Challenge.Gov is the official GSA government website supporting prize challenges and prize competitions that are sponsored by the US federal government. Here federal agencies provide prize awards to ā¦
LAION ▷ #research (38 messages🔥):
- Byte-Level Transformers Garner Optimism: @marianbasti expressed cautious optimism about byte-level transformers after reviewing a research paper, potentially indicating a significant shift in transformer model capabilities.
- Innovations in Text-to-Image Diffusion and ID Preservation: @vrus0188 shared two projects: RPG-DiffusionMaster for mastering text-to-image diffusion and InstantID for tuning-free ID-preserving generation, showcasing rapid advances in AI image generation techniques.
- AI Image Generation Far From Its Limits: According to @vrus0188, the weekly emergence of 4-6 new papers or tools shows that AI image generation is nowhere near its theoretical limits, countering skepticism about the field's potential for growth.
- Challenges and Costs of Entering AI Fields: @chad_in_the_house and others discussed the evolving barriers to entry in fields like AI image generation, from the practicality of undertaking such ventures to the significant cost of training models such as Stable Diffusion, with estimates ranging from €500 to €180,000.
- Exploration of Biologically Inspired Simulated Language Acquisition: A recent paper discussed by @chad_in_the_house presents a biologically plausible "language organ" model that learns language without backpropagation, signaling potential shifts away from traditional machine learning approaches toward more efficient, biologically inspired methodologies.
Links mentioned:
- The Architecture of a Biologically Plausible Language Organ: We present a simulated biologically plausible language organ, made up of stylized but realistic neurons, synapses, brain areas, plasticity, and a simplified model of sensory perception. We show throug…
- PALP: Prompt Aligned Personalization of Text-to-Image Models: Content creators often aim to create personalized images using personal subjects that go beyond the capabilities of conventional text-to-image models. Additionally, they may want the resulting image t…
- GitHub - YangLing0818/RPG-DiffusionMaster: Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (PRG): Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (PRG) - GitHub - YangLing0818/RPG-DiffusionMaster: Mastering Text-to-Image Diffusion: Recaptioning, Pl…
- GitHub - InstantID/InstantID: InstantID : Zero-shot Identity-Preserving Generation in Seconds 🔥: InstantID : Zero-shot Identity-Preserving Generation in Seconds 🔥 - GitHub - InstantID/InstantID: InstantID : Zero-shot Identity-Preserving Generation in Seconds 🔥
LlamaIndex ▷ #announcements (1 message):
- Webinar Alert on LLMCompiler: @jerryjliu0 announced a last-minute webinar with authors Sehoon Kim and Amir Gholami on LLMCompiler, a new framework for parallel function calling in agents. The webinar covers the compiler's advantages over earlier sequential reasoning frameworks like ReAct, emphasizing long-term planning and parallelization capabilities. Links to the LLMCompiler paper and related resources were shared: LLMCompiler paper, LlamaPack, and a GitHub notebook.
Links mentioned:
LlamaIndex Webinar: Efficient Parallel Function Calling Agents with LLMCompiler · Zoom · Luma: LLMs are great at reasoning and taking actions. But previous frameworks for agentic reasoning (e.g. ReAct) were primarily focused on sequential reasoning, leading to higher…
LlamaIndex ▷ #blog (7 messages):
- Building a Slack Bot with @seldo's Guide: A new open-source repository was announced, featuring a step-by-step guide by @seldo on how to build a @SlackHQ bot that learns from conversations. The guide can be found in a tweet by LlamaIndex.
- LlamaIndex Partners with Zilliz Universe: LlamaIndex announced a partnership with @zilliz_universe to integrate the Zilliz Cloud Pipeline into LlamaIndex, offering a scalable retrieval service with multi-tenancy support. Read more in their guest post.
- Day 0 Support for @OpenAI's Embedding Models Announced: LlamaIndex released version v0.9.38 with day-0 support for @OpenAI's latest embedding models (see the sketch after this list). Details about the release were shared in a tweet.
- LlamaIndex Promises Effortless Prompting: LlamaIndex emphasizes that it ships effective default prompts so users don't have to wrestle with customization, although customization remains an option. Further details can be found in a recent tweet.
- TypeScript Support with LlamaIndex.TS: LlamaIndex.TS version 0.1.0 was announced with support for @OpenAI's latest embeddings, with thanks to @yi_ding for the quick implementation. @qdrant_engine support is also included in this update, as detailed in a follow-up tweet.
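The day-0 support mentioned above amounts to a model swap at the embedding layer. A minimal sketch, assuming llama-index >= 0.9.38 and an OPENAI_API_KEY in the environment; the model choice and the ./data directory are illustrative, not taken from the release notes:

```python
# Minimal sketch: point LlamaIndex at one of OpenAI's new embedding models.
# The model name and data path are illustrative placeholders.
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings import OpenAIEmbedding

embed_model = OpenAIEmbedding(model="text-embedding-3-small")  # newly supported model
service_context = ServiceContext.from_defaults(embed_model=embed_model)

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
print(index.as_query_engine().query("What changed in the latest release?"))
```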
Links mentioned:
- llama-index: Interface between LLMs and your data
- Building Scalable RAG Applications with LlamaIndex and Zilliz Cloud Pipelines: Introduction
LlamaIndex ▷ #general (38 messages🔥):
- No LLM for TextGenerationInference in LlamaIndex: @wizboar asked whether LlamaIndex has an LLM class for the TextGenerationInference server; @cheesyfishes confirmed that there isn't one currently and suggested using the langchain tool through the wrapper instead.
- Customizing the Chat Engine with similarity_top_k: @richard1861 asked whether a specific parameter (similarity_top_k=3) can be set for the Chat Engine in LlamaIndex, and @whitefang_jr shared a code snippet showing how to configure it directly on the engine (see the sketch after this list).
- Implementation Challenges in Insurance-Domain Queries: @lancerninja described a complex use case involving query rewriting in the insurance domain, where similar terms are used instead of exact matches; @cheesyfishes suggested using an LLM for query rewriting but noted the absence of an offline solution for this specific scenario.
- Excitement Over OpenAI's New Embedding Models: @ayfri shared a link to OpenAI's announcement of new embedding models and API updates, expressing anticipation for upcoming support; @cheesyfishes reassured that support would be released shortly, highlighting the community's enthusiasm for continuous improvements.
- Guidance on Custom Prompts for Extending Context in Queries: @shri_j sought advice on querying information not included in the provided context documents using LlamaIndex and OpenAI; @cheesyfishes recommended modifying the default prompts to instruct the model to consider or go beyond the given context, linking to the relevant documentation.
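For reference, here is a minimal sketch of how similarity_top_k is typically forwarded to the chat engine's retriever in llama-index 0.9.x. The chat mode, data path, and question are illustrative assumptions, and this is not necessarily the exact snippet shared in the channel:

```python
# Sketch only: build an index and pass similarity_top_k through to the chat engine
# so each turn is grounded on the top 3 most similar nodes.
from llama_index import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

chat_engine = index.as_chat_engine(chat_mode="condense_question", similarity_top_k=3)
response = chat_engine.chat("What does the policy say about water damage?")
print(response)
```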
Links mentioned:
- New embedding models and API updates: We are launching a new generation of embedding models, new GPT-4 Turbo and moderation models, new API usage management tools, and soon, lower pricing on GPT-3.5 Turbo.
- Usage Pattern - LlamaIndex 🦙 0.9.39: no description found
LlamaIndex ▷ #ai-discussion (5 messages):
- Zep Enhances Chatbots with Production-Grade Tools: @yoursaviorjesus highlighted Zep, showcasing its capabilities for production-grade chat history memory, vector search, data enrichment, and more. They specifically questioned the efficacy of its entity extraction.
- LlamaIndex Versus Amazon Kendra: @zeekg_46676 asked whether LlamaIndex is a vector store or functions more like Amazon Kendra, which performs natural-language search.
- LlamaIndex: A Flexible Data Orchestration Tool: In response to @zeekg_46676, @cheesyfishes clarified that LlamaIndex is closer to Amazon Kendra and can use any vector store, LLM, or embedding model to manage data ingestion, retrieval, and response synthesis.
- Self-Learning RAG with Automated Knowledge Graph Creation: @chiajy shared a Substack article (Harry Potter and the Self-Learning Knowledge Graph RAG Workflow) detailing a WhyHow.AI demo of recursive retrieval, automated knowledge graph creation, and memory/multi-hop reasoning using RAG over Harry Potter book chapters. The demo aims to improve RAG accuracy, reduce time to production, and show one of the first examples of a self-learning RAG.
Links mentioned:
- Harry Potter and the Self-Learning Knowledge Graph RAG: WhyHow.AI's self-learning RAG with knowledge graphs to bring accuracy and rules to Vertical AI - demonstrating recursive retrieval, memory, automated context-aware knowledge graph construction.
- Zep - Fast, scalable building blocks for LLM apps: no description found
Latent Space ▷ #ai-general-chat (36 messages🔥):
- LLM Paper Club Recording Policies: @kbal11 explained that LLM Paper Club sessions are not recorded so that participants can speak more freely about their work without fear of internet exposure; as a result, no replay is available for missed sessions.
- In Search of RAG Evaluation Tools: @joejoej0e is looking for tools for experiment tracking and hand-rating the outputs of RAG pipelines, aiming to improve information-retrieval products through human evaluation of relevance.
- Introduction of MORPHEUS-1 by Prophetic AI: @shivdinho shared a link announcing MORPHEUS-1, claimed to be the world's first multi-modal generative ultrasonic transformer for inducing and stabilizing lucid dreams, with a beta release slated for Spring 2024.
- Rapid Development at go-go-golems: @slono announced writing 5k lines of code in just 4 days as part of the yaml-custom-tags experiment, showcasing the project's pace and volume of output.
- Martian Launches LLM Evaluation Tool: @cute_hamster_07119 introduced a new tool by Martian, launching today, that evaluates live which LLM inference provider to use based on cost, throughput, and TTFT across providers such as Anyscale, Together, Lepton, and Fireworks.
Links mentioned:
- That's a lot of YAML: no description found
- Tweet from Prophetic (@PropheticAI): INTRODUCING MORPHEUS-1 The world's first multi-modal generative ultrasonic transformer designed to induce and stabilize lucid dreams. Available for beta users Spring 2024
- New embedding models and API updates: We are launching a new generation of embedding models, new GPT-4 Turbo and moderation models, new API usage management tools, and soon, lower pricing on GPT-3.5 Turbo.
- KREA is building the next frontier of human creativity ⚡️: Plus: Co-founder Diego on embracing curiosity and chaos…
- Reddit - Dive into anything: no description found
- LLM Inference Provider Leaderboard: A live, unbiased benchmark on LLM inference APIs made by Martian
- Evaluation Methodology - Provider Leaderboard: no description found
- Tweet from talrid23 (@talrid23): JSON format is becoming the de facto standard for output generation with LLM. However, is it the optimal format? 🤔 We claim that not - YAML outputs are shorter and simpler, leading to faster infere…
- go-go-labs/cmd/experiments/yaml-custom-tags at main · go-go-golems/go-go-labs: GO GO EXPERIMENTAL LAB. Contribute to go-go-golems/go-go-labs development by creating an account on GitHub.
- Reproducibility - Provider Leaderboard: no description found
- GitHub - withmartian/provider-dashboard: Open sourced backend for Martian's LLM Inference Provider Leaderboard: Open sourced backend for Martian's LLM Inference Provider Leaderboard - GitHub - withmartian/provider-dashboard: Open sourced backend for Martian's LLM Inference Provider Leaderboard
Latent Space ▷ #ai-event-announcements (1 message):
- LLM Paper Club Asia Launches: @ivanleomk announced the kick-off of the first LLM Paper Club session in Asia, focused on discussing the seminal paper "Attention Is All You Need". Interested participants can sign up for future events here and join today's session via Discord.
- Stay Updated with Latent.Space Events: Participants were encouraged to stay informed about new Latent.Space events by clicking the RSS logo just above the calendar on the right and using "Add iCal Subscription" (shown on hover) to add it to their calendar, so they don't miss future gatherings.
Links mentioned:
- LLM Paper Club (Asia Edition!) · Luma: UPDATE: Updated with a link to the discord stage that we'll be using. Asia-timezone friendly version of the Latent.Space x EugeneYan.com LLM Paper Club! This week we'll be covering the…
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
Latent Space ▷ #llm-paper-club (8 messages🔥):
- Asia Paper Club Schedules a New Session: @ivanleomk announced that next week's Asia paper club will tentatively cover the Self-Rewarding Language Models paper, and encouraged members to suggest other papers or volunteer to present.
- A Call for Feedback: @aimuggle thanked participants for joining the session and invited feedback to improve the still-beta program.
- Question About Self-Reward: @stealthgnome asked whether self-instruct is the input for self-reward in the context of the upcoming discussion.
- US Paper Club's Next Feature: In response to @ivanleomk's question about the US paper club's next agenda, @eugeneyan shared that they will discuss the Pythia paper, highlighting its comprehensive list of contributing authors.
Links mentioned:
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling: How do large language models (LLMs) develop and evolve over the course of training? How do these patterns change as models scale? To answer these questions, we introduce Pythia, a suite of 16…
DiscoResearch ▷ #mixtral_implementation (2 messages):
- Mergekit Options for Mixtral Discussed: @philipmay shared a GitHub issue in which mergekit's author clarifies the options for finetuning Mixtral models after a merge, questioning how useful the "hidden" and "random" options are for later finetuning.
- Auxiliary Loss in MoE Training Highlighted: @bjoernp responded positively to the shared information, emphasizing that getting the auxiliary loss right is a crucial aspect of MoE training.
Links mentioned:
Mixtral branch: What option should I choose when I want to do some finetuning after the merge? · Issue #116 · cg123/mergekit: The parameter description of "hidden" and "random" does not exactly explain what to do when I want to finetune later. Is it even useful (possible) to finetune after merging with …
DiscoResearch ▷ #general (23 messages🔥):
- Quality Data Selection Dilemma: @bjoernp shared an interesting paper suggesting that filtering pretraining data for "quality" does not always improve model performance; the paper proposes a dataset-selection framework that optimizes downstream model performance rather than following conventional notions of data quality.
- Exploring KTO Training: @bjoernp and @hammadkhan discussed Kahneman-Tversky Optimization (KTO), comparing it to Direct Preference Optimization: KTO only needs binary signals of desirable vs. undesirable completions. They covered its benefits and implementation details, including Hugging Face's documentation and Axolotl's support for KTO via trl (see the sketch after this list).
- Contextual AI Approaches with KTO: @hammadkhan introduced KTO as a method that trains on good and bad examples (e.g., thumbs-up or thumbs-down). The approach, detailed in a blog post, is positioned as a simpler and potentially more effective strategy for updating chat models in production environments.
- Debate on KTO's Applicability: @rasdani and @hammadkhan debated KTO's efficacy in scenarios requiring comparisons between instruction-led good and bad answers. The discussion centered on whether labels need to be directly paired or whether KTO's flexibility allows broader usage than Direct Preference Optimization (DPO).
- GPT-4 Turbo and GPT-3.5 Pricing Update: @rasdani highlighted the latest OpenAI updates, including the launch of the Embedding V3 models, an updated GPT-4 Turbo, and a significant price reduction for GPT-3.5 Turbo. Details were shared in an official tweet by @OfficialLoganK and on the OpenAI blog.
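For readers curious how KTO-style training is wired up with trl, here is a hedged sketch using DPOTrainer's loss_type="kto_pair", one way trl exposed the objective at the time; the base model, hyperparameters, and toy dataset are placeholders, not details from the discussion:

```python
# Hedged sketch: swap DPO's default sigmoid loss for the KTO-style paired loss.
# Model name, hyperparameters, and the one-row dataset are illustrative placeholders.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model_name = "mistralai/Mistral-7B-Instruct-v0.2"  # placeholder base model
model = AutoModelForCausalLM.from_pretrained(model_name)
ref_model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Each row pairs a prompt with a desirable and an undesirable completion.
train_dataset = Dataset.from_dict({
    "prompt": ["Summarize: the meeting moved to Friday."],
    "chosen": ["The meeting was rescheduled to Friday."],
    "rejected": ["The meeting is cancelled."],
})

trainer = DPOTrainer(
    model=model,
    ref_model=ref_model,
    args=TrainingArguments(
        output_dir="kto-test",
        per_device_train_batch_size=1,
        remove_unused_columns=False,  # keep the raw preference columns for the trainer
    ),
    beta=0.1,
    loss_type="kto_pair",  # KTO-style objective instead of the default DPO loss
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```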
Links mentioned:
- DsDm: Model-Aware Dataset Selection with Datamodels: When selecting data for training large-scale models, standard practice is to filter for examples that match human notions of data quality. Such filtering yields qualitatively clean datapoints that int…
- DPO Trainer: no description found
- Tweet from Logan.GPT (@OfficialLoganK): Great news for @OpenAIDevs, we are launching: - Embedding V3 models (small & large) - Updated GPT-4 Turbo preview - Updated GPT-3.5 Turbo (*next week + with 50% price cut on Input tokens / 25% price …
DiscoResearch ▷ #embedding_dev (12 messages🔥):
- German Jina Model Announcement: @sebastian.bodza highlighted the upcoming release of the "jinaai/jina-embeddings-v2-base-de" model on Hugging Face, which could benefit ranking tasks. No specific release date was given, but it is expected "tomorrow."
- Question Generation Examples Shared: @sebastian.bodza shared examples of question generation on GitHub, marking the start of work in this area.
- Mixtral and LLM Usage for Embeddings: In response to @philipmay's inquiry, @sebastian.bodza mentioned running Mixtral as a 4-bit GPTQ model with vLLM for their project, focusing on question generation and embedding development (see the sketch after this list).
- New OpenAI Embeddings Highlighted: @bjoernp drew attention to OpenAI's new embedding models and API updates, which could be advantageous for multilingual support and may influence future development strategies.
- Genie Method for High-Quality Data Generation: @bjoernp shared an arXiv paper proposing Genie, a novel method for generating high-quality, content-grounded data, suggesting its potential for improving question-answer pairs or summaries through automated quality-filtering mechanisms.
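A hedged sketch of serving a 4-bit GPTQ Mixtral with vLLM for bulk question generation, as described above; the checkpoint name, sampling settings, and prompt are illustrative assumptions, not details confirmed in the channel:

```python
# Sketch only: load a GPTQ-quantized Mixtral in vLLM and generate questions for a passage.
# The model repo and prompt are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/Mixtral-8x7B-Instruct-v0.1-GPTQ",  # placeholder 4-bit GPTQ checkpoint
    quantization="gptq",
)
params = SamplingParams(temperature=0.7, max_tokens=256)

passage = "Jina AI released a German embedding model for retrieval tasks."
prompts = [f"[INST] Write three search questions answered by this passage:\n{passage} [/INST]"]

for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```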
Links mentioned:
- Genie: Achieving Human Parity in Content-Grounded Datasets Generation: The lack of high-quality data for content-grounded generation tasks has been identified as a major obstacle to advancing these tasks. To address this gap, we propose Genie, a novel method for automati…
- GitHub: Let's build from here: GitHub is where over 100 million developers shape the future of software, together. Contribute to the open source community, manage your Git repositories, review code like a pro, track bugs and fea…
- Embedding_Training/README.md at main Ā· SebastianBodza/Embedding_Training: Contribute to SebastianBodza/Embedding_Training development by creating an account on GitHub.
- New embedding models and API updates: We are launching a new generation of embedding models, new GPT-4 Turbo and moderation models, new API usage management tools, and soon, lower pricing on GPT-3.5 Turbo.
DiscoResearch ▷ #discolm_german (5 messages):
- Effective Finetuning Achieved: @thomasrenkert reported successfully finetuning DiscoLM German 7B v1 with unsloth and is looking forward to a future DiscoLM German version based on Mixtral-Instruct, highlighting the significant impact of MoE.
- Unique Dataset for Translation: In response to @hammadkhan's query, @thomasrenkert shared that they finetuned the model on their own dataset for translating Middle High German into Modern German.
- Community Support and Interest: @bjoernp expressed support and appreciation for @thomasrenkert's update on their finetuning success and dataset, reflecting the community's interest in novel applications of DiscoLM.
LLM Perf Enthusiasts AI ▷ #embeddings (2 messages):
- OpenAI Unveils New Embedding Models and API Enhancements: @potrock shared a blog post from OpenAI announcing a new generation of embedding models, improved GPT-4 Turbo and moderation models, new API usage-management tools, and upcoming lower pricing for GPT-3.5 Turbo. The announcement highlighted two new embedding models, updated GPT-4 Turbo and GPT-3.5 Turbo models, a new text-moderation model, and a commitment that data sent to the OpenAI API will not be used to train their models.
- Link Redirection Suggestion: @shacrw suggested that the announcement about the new embedding models and API updates belonged in a different channel and shared a link to redirect the conversation. No further context on that discussion was provided.
Links mentioned:
New embedding models and API updates: We are launching a new generation of embedding models, new GPT-4 Turbo and moderation models, new API usage management tools, and soon, lower pricing on GPT-3.5 Turbo.
LLM Perf Enthusiasts AI ▷ #announcements (1 message):
mat_mto: Thanks Jeff! love all the work you're doing so far
LLM Perf Enthusiasts AI ▷ #openai (16 messages🔥):
- OpenAI Launches New Models and Tools: @potrock shared a blog post from OpenAI announcing new embedding models, updates to GPT-4 Turbo, new moderation models, tools for managing API usage, and upcoming lower pricing for GPT-3.5 Turbo. The updates aim to offer better performance and cost-efficiency for developers.
- Community Excitement Around the Embeddings Update: Following the announcement, @nosa_. expressed intrigue with a "well well well", and @potrock highlighted the appeal of the shortened-embeddings feature.
- Comparison of Embedding Models: @potrock noted that the new large embedding model slightly surpasses bge-large, indicating subtle but notable improvements in OpenAI's latest iteration.
- Advantages of Upgrading to OpenAI's New Offerings: @res6969 shared plans to upgrade their system to the newly announced embedding models, saying they no longer see a need to switch to open-source options given the simplicity and effectiveness of staying with OpenAI.
- Exploring the Cost-Efficiency of the New Embeddings: @shacrw and @michelcarroll discussed the potential cost benefits of using the newer, larger embedding model (v3-large) with dimension shortening, weighing its performance against storage and API costs (see the sketch after this list). They pondered the balance between embedding costs and savings on vector-database storage, with @michelcarroll leaning toward v3-large for better performance and reduced storage costs.
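The "dimension shortening" discussed above is exposed through the dimensions parameter of the embeddings endpoint. A minimal sketch with the OpenAI Python SDK (v1.x); the 256-dimension choice and input text are illustrative, not figures from the thread:

```python
# Sketch only: request a shortened embedding vector to cut vector-DB storage costs.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.embeddings.create(
    model="text-embedding-3-large",
    input="Refund policy for damaged goods",
    dimensions=256,  # shortened vector; trades some accuracy for smaller storage
)
vector = response.data[0].embedding
print(len(vector))  # 256
```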
Links mentioned:
New embedding models and API updates: We are launching a new generation of embedding models, new GPT-4 Turbo and moderation models, new API usage management tools, and soon, lower pricing on GPT-3.5 Turbo.
LangChain AI ▷ #general (12 messages🔥):
- Welcome Aboard, quarknova!: @quarknova, an ENS student interning at INRIA, is looking into using LangChain for their project and asked whether the GitHub version will suffice or whether a commercial version is necessary.
- Creating AI Personalities Explored: @jstansbe asked about creating "AI personalities" (such as an Elon Musk AI) without relying on external APIs. @ksolo explained that this process is known as finetuning and recommended a short course on finetuning large language models for practical guidance.
- Rapid Web-Search Chatbot Development with LangChain and Streamlit: @johnnucleus shared excitement over quickly building a chatbot capable of web searches using LangChain and Streamlit, highlighting the efficiency and ease of use.
- Synthetic Data Generation for ML Training with LLMs: @rajib2189 and @johnny2x2 discussed using large language models to generate synthetic data for traditional machine-learning training and for RAG generation, respectively.
- Parquet File Loading into LangChain: @benjaminbascary asked how to load Parquet files as documents in LangChain; @johnny2x2 suggested reading them with pandas and loading them through the DataFrameLoader from the LangChain community document loaders (see the sketch after this list).
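A minimal sketch of the pandas + DataFrameLoader approach suggested above; the file path and the "text" column name are illustrative placeholders:

```python
# Sketch only: read a Parquet file with pandas and turn rows into LangChain Documents.
import pandas as pd
from langchain_community.document_loaders import DataFrameLoader

df = pd.read_parquet("orders.parquet")  # placeholder path
loader = DataFrameLoader(df, page_content_column="text")  # this column becomes page_content
documents = loader.load()  # remaining columns are attached as metadata

print(len(documents), documents[0].metadata)
```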
Links mentioned:
Finetuning Large Language Models: no description found
LangChain AI ▷ #langserve (3 messages):
- Exploring LangServe Examples: @veryboldbagel directed users to the LangServe examples on GitHub, highlighting two agent examples in the readme and a third not listed but available at another link. Those interested can find more information and the examples here and here.
- Guidance on Constructing Custom Agents: @veryboldbagel advised users looking to define custom tools or build custom agents that an off-the-shelf OpenAI tools agent is sufficient for custom tools; for more complex needs, they recommended LCEL and LangGraph for greater expressive power, with further instructions available here and here.
- Issue with Agents and Streamed Responses: @hiranga.g hit an issue where an agent with history did not return a streamed response as expected, in contrast to the JSON object delivered in the playground. Following suggestions related to a known bug with agents and LangServe, they tried chain.streamLog(), but no resolution or outcome was reported.
Links mentioned:
- 🦜🕸️ LangGraph | 🦜️🔗 Langchain: ⚡ Building language agents as graphs ⚡
- GitHub - langchain-ai/langserve: LangServe 🦜️🏓: LangServe 🦜️🏓. Contribute to langchain-ai/langserve development by creating an account on GitHub.
- langserve/examples/configurable_agent_executor at main · langchain-ai/langserve: LangServe 🦜️🏓. Contribute to langchain-ai/langserve development by creating an account on GitHub.
LangChain AI ▷ #share-your-work (2 messages):
- Querying About Context Awareness: User dejoma asked whether LangChain AI uses the context of the currently visited webpage, probing the AI's context-awareness capabilities.
- Innovative SQL Use with LangChain for Manufacturing: johnny2x2 shared their experience implementing LLM automation in a manufacturing context to analyze late customer orders from ERP SQL data with Mixtral 7B v2 5Q 32K. They found success by creating curated views in the database so the SQL chain can handle large databases efficiently, and by using SQL queries as tools within a specialized task loop for more effective results (see the sketch after this list).
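A hedged sketch of the curated-views idea: expose only a handful of pre-built views to the SQL chain so the model never has to reason over the full ERP schema. The connection string, view names, and local model are illustrative placeholders, not the setup described in the message:

```python
# Sketch only: restrict the SQL chain to curated views instead of the whole ERP schema.
from langchain.chains import create_sql_query_chain
from langchain_community.llms import Ollama
from langchain_community.utilities import SQLDatabase

db = SQLDatabase.from_uri(
    "postgresql://user:pass@erp-host/erp",                      # placeholder connection
    include_tables=["late_orders_view", "open_orders_summary"],  # curated views only
)
llm = Ollama(model="mixtral")  # placeholder local model

chain = create_sql_query_chain(llm, db)
sql = chain.invoke({"question": "Which customer orders are more than 7 days late?"})
print(sql)
```

Keeping the exposed schema small both shrinks the prompt and makes the generated SQL easier to validate before execution.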
Datasette - LLM (@SimonW) ▷ #llm (3 messages):
- Major LLM Release Announced: @simonw previewed a forthcoming LLM update that significantly upgrades the underlying openai library. Enthusiasts are encouraged to test the new version, with instructions available on GitHub.
- Peek into the Future with the 0.13 Milestone: For those curious about what the update entails, @simonw shared a link to the 0.13 milestone on GitHub, giving insight into the upcoming enhancements.
- Call for Help on Readline Issues: @simonw is seeking help resolving a bug in LLM chat where the arrow keys print ANSI codes instead of moving the cursor. Contributors can find more details in the issue on GitHub.
Links mentioned:
- 0.13 Milestone · simonw/llm: Access large language models from the command-line - 0.13 Milestone · simonw/llm
- llm chat - readline problems still present · Issue #376 · simonw/llm: When I open llm chat, I expect that using the left and right arrow keys will navigate the cursor but instead I get nasty ANSI codes printed to the screen. $ llm chat Chatting with gpt-4 Type 'exit'…
- Upgrade for compatibility with OpenAI 1.0 library · Issue #325 · simonw/llm: Currently: Successfully installed openai-1.0.1 $ llm -m gpt-4-turbo 'hi' Error: module 'openai' has no attribute 'ChatCompletion'