One-time IRL callout: If you're in SF, join Dylan Patel (aka "that SemiAnalysis guy" who wrote the GPU Rich/Poor essay) for a special live Latent Space event tomorrow. Our first conversation was one of last year's most-referenced episodes.
As hinted last year, HuggingFace/BigCode has finally released StarCoder v2 and The Stack v2. Full technical report here.
StarCoder 2: SOTA for size (3B and 15B)
The StarCoder2-15B model is a 15B-parameter model trained on 600+ programming languages from The Stack v2, with opt-out requests excluded. The model uses Grouped Query Attention, a context window of 16,384 tokens with sliding window attention of 4,096 tokens, and was trained using the Fill-in-the-Middle objective on 4+ trillion tokens.
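The Fill-in-the-Middle objective conditions the model on the code before and after a gap and asks it to generate the gap. A minimal sketch of how such a prompt is typically assembled, assuming the StarCoder-family sentinel token names (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`); check the released tokenizer for the exact tokens:

```python
# Sketch: assembling a Fill-in-the-Middle (FIM) prompt in the
# prefix-suffix-middle (PSM) ordering used by StarCoder-family models.
# The code around the "cursor" is split into a prefix and a suffix;
# the model generates the missing middle after the final sentinel token.

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a PSM-ordered FIM prompt from the code around the gap."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

prompt = build_fim_prompt(
    prefix="def mean(xs):\n    return ",
    suffix=" / len(xs)\n",
)
print(prompt)
```

At inference time, the model's completion after `<fim_middle>` is the infilled code (here, ideally something like `sum(xs)`).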
Since it was only just released, best source on evals is BigCode for now:

The Stack v2: 10x bigger raw, and 4.5x bigger deduped (900B Tokens)

We are experimenting with removing the Table of Contents, as many people reported it wasn't as helpful as hoped. Let us know if you miss the TOCs, or they'll be gone permanently.
AI Twitter Summary
AI and Machine Learning Discussions
- François Chollet remarks on the nature of LLMs, emphasizing that output mirrors the training data, capturing human thought patterns.
- Sedielem shares extensive thoughts on diffusion distillation, inviting community feedback on the blog post.
- François Chollet differentiates between current AI capabilities and true intelligence, focusing on the efficiency of skill acquisition.
- Stas Bekman raises concerns about the ML community's dependency on a single hub for accessing weight copies, suggesting the need for a backup hub.
Executive Shifts and Leadership
- Saranormous highlights leadership change at $SNOW, welcoming @RamaswmySridhar as the new CEO and applauding his technical and leadership expertise.
Technology Industry Updates
- DeepLearningAI rounds up this week's AI stories, including Gemini 1.5 Pro's rough week, Groq chips' impact on AI processing speed, and a discussion on version management in AI development by @AndrewYNg.
- KevinAFischer celebrates his feature in Tech Crunch as an early user of the Shader app by @shaderapp and @daryasesitskaya.
Innovation and Technical Insights
- Andrew N Carr discusses the potential of fitting 120B parameter models on consumer GPUs as per the 1.58 Bit paper, emphasizing breakthroughs in VRAM efficiency.
- Erhartford highlights a real-time EMO lip sync model, suggesting its integration for innovative applications.
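The VRAM claim above is easy to sanity-check: ternary weights cost log2(3) ≈ 1.58 bits each, versus 16 bits for fp16. A rough back-of-the-envelope calculation (weights only; activations and KV cache are ignored):

```python
# Back-of-the-envelope check on the "120B parameters on a consumer GPU"
# claim from the 1.58-bit discussion. Ternary weights need log2(3) ≈ 1.58
# bits per parameter instead of 16 bits for fp16.

import math

def weight_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GiB (ignores activations and KV cache)."""
    return n_params * bits_per_param / 8 / 2**30

fp16 = weight_gib(120e9, 16)                # ~223.5 GiB: far beyond any consumer GPU
ternary = weight_gib(120e9, math.log2(3))   # ~22.1 GiB: within reach of a 24 GB card
print(f"fp16: {fp16:.1f} GiB, 1.58-bit: {ternary:.1f} GiB")
```

In practice, packing ternary values into aligned words adds some overhead, so the real footprint sits a bit above the information-theoretic floor.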
Memes/Humor
- C_valenzuelab draws a humorous analogy, stating that "airplanes didn't disrupt the bicycle market".
- KevinAFischer jokes about the economics of using LLMs, poking fun at the current state of AI development.
- KevinAFischer makes a light-hearted comment about ideas being ahead of their time.
Miscellaneous Observations
- Margaret Mitchell questions diversity in news coverage of the Gemini fiasco - 2806 impressions
- Kevin Fischer humorously touches on repeating himself - 732 impressions
- Zach talks about the need for fair tax rates for the wealthy - 492 impressions
AI Development and Infrastructure
- abacaj mentions the need for backing up weights following an HF outage - 1558 impressions
- Together Compute announces the launch of OLMo-7B-Instruct API from @allen_ai - 334 impressions
- A discussion on the ternary BitNet paper's potential to revolutionize model scalability - 42 impressions
AI Twitter Narrative
The technical and engineer-oriented Twitter ecosystem has been buzzing with significant discussions spanning AI research, infrastructure resilience, leadership transitions in tech, and some light-hearted humor.
Regarding AI and machine learning, François Chollet's reflection on LLMs as mirrors of their training data, alongside Sander Dieleman's (@sedielem) deep dive into diffusion distillation, underscores critical thinking about the essence and future of AI technologies. Reinforcing the importance of diversified safeguarding of machine learning models, Stas Bekman's proposal for a secondary hub for model weights has caught the community's attention, highlighting the community's resilience in facing practical challenges.
In the leadership and innovation arena, the leadership transition at $SNOW garnered significant engagement, reflecting the continuous evolution and admiration for leadership within tech organizations.
Humor and memes remain a vital part of the discourse, with tweets like Cristóbal Valenzuela's observation about the non-competition between airplanes and bicycles bringing a light-hearted perspective to innovation and disruption.
Among miscellaneous observations, Margaret Mitchell's call for more diverse perspectives in tech reporting highlights the importance of inclusivity and varied viewpoints in shaping our understanding of tech events.
Lastly, discussions around AI development and infrastructure saw practical considerations take the forefront, as noted by abacaj's preparation for possible future outages by backing up model weights. This operational resilience mirrors the broader strategic resilience seen across the technical and engineering community.
PART 0: Summary of Summaries of Summaries
LLM Evaluations and Data Integrity on TheBloke Discord
- Detailed LLM Comparisons: Members critically evaluated models including GPT-4, Mixtral, and Miqu, focusing on API reliability and comparative performance. Specific concerns were raised about training data contaminated by the outputs of other AI models, potentially degrading quality and reliability.
Technological Innovations and AI Deployment on Mistral Discord
- NVIDIA RAG Technical Limitations: NVIDIA's demo, showcasing retrieval-augmented generation (RAG), was critiqued for its 1024 token context limit and response coherence issues. The critique extended to NVIDIA's implementation choices, including the use of LangChain for RAG's reference architecture, hinting at broader discussions on optimizing AI model architectures for better performance.
Qualcomm's Open Source AI Models on LM Studio Discord
- Qualcomm's Contribution to AI Development: Qualcomm released 80 open source AI models on Hugging Face, targeting diverse applications in vision, audio, and speech technologies. The release gives researchers and developers ready-made tools for machine learning applications across these domains, representing Qualcomm's push to enrich the open-source AI development ecosystem.
These summaries highlight the depth of technical scrutiny applied to AI models, the identification of performance limitations and potential improvements in AI technologies, and Qualcomm's specific contributions to the open-source AI landscape, underlining the continuous, collaborative evolution of AI research and development.
PART 1: High level Discord summaries
TheBloke Discord Summary
- Spam Alert in General Chat: Users reported a spam incident involving @kquant, with Discord's spam detection system flagging his activity after he contacted over 100 people with identical messages.
- ChatGPT Variants Under Scrutiny: Diverse experiences with ChatGPT models were discussed, including GPT-4's API reliability and comparisons with Mixtral and Miqu models. Concerns were raised over training data contamination from other AI outputs, potentially compromising quality.
- Mixed Results in Model Mergers: Dialogue highlighted the uncertainty of model merging outcomes, emphasizing the role of luck and model compatibility. Merging tactics such as spherical linear interpolation (slerp) and concatenation were suggested in the specialized channels.
- Innovative Roleplay with LLMs: Techniques to enhance character consistency in role-play involve giving LLMs detailed backstories and traits. Models like Miqu and Mixtral were favored for these tasks, though longer context lengths could reduce coherence.
- Pacing AI Training and Fine-tuning: Users exchanged training tips, including using Perplexity AI and efficient methods like QLoRA to curb hardware demands. The importance of validation and deduplication was stressed, alongside managing model generalization and hallucination.
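For readers unfamiliar with the slerp tactic mentioned in the merging discussion, here is a minimal numpy sketch. Real merge tooling applies this per tensor, often with per-layer interpolation factors, but the core operation is just spherical interpolation between flattened weight tensors:

```python
# Minimal sketch of spherical linear interpolation (slerp) between two
# weight tensors. The interpolation follows the great-circle arc between
# the tensors (viewed as vectors), rather than the straight line that
# plain averaging would take.

import numpy as np

def slerp(t: float, a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Interpolate a fraction t of the way from tensor a to tensor b."""
    a_flat, b_flat = a.ravel(), b.ravel()
    a_n = a_flat / (np.linalg.norm(a_flat) + eps)
    b_n = b_flat / (np.linalg.norm(b_flat) + eps)
    omega = np.arccos(np.clip(a_n @ b_n, -1.0, 1.0))  # angle between tensors
    if omega < eps:  # nearly parallel: fall back to linear interpolation
        return (1 - t) * a + t * b
    so = np.sin(omega)
    out = (np.sin((1 - t) * omega) / so) * a_flat + (np.sin(t * omega) / so) * b_flat
    return out.reshape(a.shape)
```

At t=0 this returns the first model's tensor, at t=1 the second's; intermediate values preserve the overall weight magnitude better than a plain weighted average when the tensors point in different directions.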
Links to consider:
- For looking into detailed personalities and character backstories in AI role-play, one might explore the strategy explanations and datasets at Hugging Face.
- Searching for efficient training techniques could lead AI engineers to MAX's announcement about their platform aimed at democratizing AI development via an optimized infrastructure, detailed in their Developer Edition Preview blog post here.
Mistral Discord Summary
- NVIDIA's Demo Faces Criticism for RAG Implementation: The NVIDIA "Chat with RTX" demo showcasing retrieval-augmented generation (RAG) faced criticism for limiting context size to 1024 tokens and issues with coherent responses. Discussions hinted at concerns with NVIDIA's use of LangChain in RAG's reference architecture.
- Mistral AI Discussions Span Licensing to Open Weights and Hardware Requirements: Conversations touched on Mistral AI's use of Meta's LLaMa model, anticipation for future open-weight models following Mistral-7B, and hardware requirements for running larger models like Mixtral 8x7B, which may need at least 100GB of VRAM at full precision. Users considered services like Together.AI for deployment assistance.
- Model Quantization and Deployment Discussions Highlight Constraints: Technical discussions included constraining Mistral-7B to specific document responses, the stateless nature of language models, and the limitations of quantized models. That quantization shrinks the memory footprint of Mistral-7B, and that full-precision models need large amounts of VRAM, were underscored.
- Mistral Platform Intricacies and Function Calling Discussed: Users shared experiences and obstacles with Mistral function calls and reported on the necessity of specific message role orders. Some referred to tools like Mechanician for better integration with Mistral AI.
- Educational Tools and the Potential of Specialized Models: One user showcased an app for teaching economics using Mistral and GPT-4 models, while discussions touched on specialized training of models for tasks like JavaScript optimization. A need for improved hiring strategies within the AI industry also surfaced among the chats.
The conversations reveal technical discernment among the users, highlighting both enthusiasm for AIās advancements and practical discussions on AI model limitations and ideal deployment scenarios.
OpenAI Discord Summary
- Loader Showdown: LM Studio vs. Oobabooga and Jan.ai: LM Studio was criticized for requiring manual GUI interaction to start its API, making it a non-viable option for automated website applications and prompting engineers to suggest Oobabooga and Jan.ai as alternatives for more seamless automation.
- AI Moderation and OpenAI Feedback: A message removed from a discussion about Copilot AI due to automod censorship led to suggestions to report to Discord mods and submit feedback directly through OpenAI's Chat model feedback form, with community members discussing the extent of moderation rules.
- Mistral's Power and Regulation Query: The Mistral model, known for its powerful, uncensored outputs, was compared to GPT-4, leading to a conversation about the impact of European AI regulation on such models. A related YouTube video was shared illustrating how to run Mistral and its implications.
- Advancing Chatbot Performance: Enhancing GPT-3.5-Turbo for chatbot applications sparked a debate on achieving performance on par with GPT-4, with users discussing fine-tuning techniques and suggesting the use of real data and common use cases for improvement.
- AI Certification vs. Real-world Application: For those seeking AI specialization, the community emphasized hands-on projects over certifications, recommending learning resources such as courses by Andrew Ng and Andrej Karpathy, available on YouTube.
LM Studio Discord Summary
Model Compatibility Queries Spark GPU Discussions: Engineers engaged in detailed explorations of LLMs, such as Deepseek Coder 6.7B and StarCoder2-15B, and their compatibility with Nvidia RTX 40 series GPUs, discussing optimization strategies such as disabling certain features on Windows 11. A focus on finding the best-fitting models for hardware specifications was observed, underlined by the launch news of StarCoder2 and The Stack v2, with mentions of LM Studio's compatibility issues, especially on legacy hardware like the GTX 650.
Hugging Face Outage Disrupts Model Access: An outage at Hugging Face caused network errors for members trying to download models, affecting their ability to search for models within LM Studio.
Qualcomm Unveils 80 Open Source Models: Qualcomm released 80 open source AI models on Hugging Face, targeting vision, audio, and speech applications, potentially enriching the landscape for AI modeling and development.
LLM Functionality Expansions: Users exchanged insights on enhancing functionalities within LM Studio, such as implementing an accurate PDF chatbot with Llama2 70B Q4 LLM, seeking guidance on adding image recognition features with models like PsiPi/liuhaotian_llava-v1.5-13b-GGUF/, and expressing desires for simplified processes in downloading vision adapter models.
Hardware Hubris and Hopes: Discussions thrived around user experiences with hardware, from reminiscing about older GPUs to sharing frustrations over misrepresented specs in an e-commerce setting. One user advised optimizations for Windows 11, while TinyCorp announced a new hardware offering, TinyBox, found here. There was also speculation about the potential for Nvidia Nvlink / SLI in model training compared to inference tasks.
HuggingFace Discord Summary
- Cosmopedia's Grand Release: Cosmopedia was announced, a sizable synthetic dataset with over 25B tokens and 30M files, constructed with Mixtral. It is aimed at serving various AI research needs, with the release information accessible through this LinkedIn post.
- Hugging Face Updates Galore: The huggingface_hub library has a new release, 0.21.0, with several improvements, and YOLOv9 made its debut on the platform, now compatible with Transformers.js, per the discussions on Hugging Face Spaces and huggingface.co/models.
- DSPy Grows Closer to Production: Exploration of DSPy and Gorilla OpenFunctions v2 is underway to transition from Gradio prototypes to production versions. The tools promise enhanced client onboarding for foundation models without prompting; discussions and resources can be found in repositories like stanfordnlp/dspy on GitHub.
- BitNet Bares Its Teeth: A new 1-bit Large Language Model, BitNet b1.58, claimed to preserve performance with impressive efficiency metrics, is discussed, with its research available via this arXiv paper.
- Inference Challenges and Solutions: In the field of text inference, an AI professional ran into issues when trying to deploy the text-generation-inference repository on a GPU-less, non-CUDA system. This highlights typical environmental constraints encountered in AI model deployment.
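The BitNet b1.58 paper's headline trick is quantizing weights to the ternary set {-1, 0, +1} via "absmean" scaling. A minimal numpy sketch of that quantization step (illustration only; the paper applies this inside quantization-aware training, not as a post-hoc conversion):

```python
# Sketch of the "absmean" weight quantization described in the BitNet b1.58
# paper: weights are scaled by their mean absolute value, then rounded and
# clipped so that every entry lands in the ternary set {-1, 0, +1}.

import numpy as np

def absmean_ternary(w: np.ndarray, eps: float = 1e-8):
    """Return ternary weights and the scale needed to dequantize them."""
    gamma = np.mean(np.abs(w)) + eps          # per-tensor absmean scale
    q = np.clip(np.round(w / gamma), -1, 1)   # round, then clamp to {-1, 0, 1}
    return q, gamma

w = np.array([[0.9, -0.05, -1.3],
              [0.4,  0.0,  -0.6]])
q, gamma = absmean_ternary(w)
print(q)  # every entry is -1, 0, or +1; w is approximated by q * gamma
```

Because the quantized matrix contains only -1, 0, and +1, matrix multiplication reduces to additions and subtractions, which is where the claimed efficiency gains come from.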
LAION Discord Summary
- AI's Ideogram Stirs Interest: Engineers discussed the release of a new AI model from Ideogram, drawing comparisons with Stable Diffusion and shedding light on speculated quality matters pertaining to unseen Imagen samples. A user shared a prompt result that sparked a debate on its prompt adherence and aesthetics.
- Integration of T5 XXL and CLIP in SD3 Discussed: There have been discussions around the potential integration of T5 XXL and CLIP models into Stable Diffusion 3 (SD3), with participants expecting advancements in both the precision and the aesthetics of upcoming generative models.
- Concerns Over AI-Generated Art: A legal discussion unfolded concerning AI-generated art and copyright law, referencing a verdict from China and an article on copyright safety for generative AI, highlighting uncertainty in the space and varied industry responses to DMCA requests.
- Spiking Neural Networks Back in Vogue?: Some members considered a potential resurgence of spiking neural networks, with advanced techniques like time dithering to improve precision, reflecting on historical and current research approaches.
- State-of-the-Art Icon Generation Model Released: A new AI icon-generation model has been released on Hugging Face, developed with personal funding of $2,000 and touted to create low-noise icons at 256px, although its creator acknowledged scale limitations.
Nous Research AI Discord Summary
- Emoji Storytelling on GPT-5's No-show: Community members used a sequence of emojis to express sentiments about GPT-5's absence, oscillating between salutes, skulls, and tears, while revering GPT iterations up to the mythical GPT-9.
- Dell's Dual-Connection Monitors and Docks Intrigue Engineers: A YouTube review of Dell's new 5K monitor and the Dell Thunderbolt Dock WD22TB4 piqued interest for their ability to connect multiple machines, with eBay suggested as the source for purchases.
- 1-bit LLMs Unveiled with BitNet b1.58: The arXiv paper revealed BitNet b1.58 as a 1-bit LLM with performance on par with full-precision models, highlighting it as a cost-effective innovation, alongside a mention of Nicholas Carlini's LLM benchmark.
- Exploring Alternative Low-Cost LLMs and Fine-Tuning Practices: Users discussed alternatives to GPT-4, the effect of small training dataset sizes, and the potential use of Direct Preference Optimization (DPO) to improve model responses.
- Cutting-Edge Research and New Genomic Model Debut: Stanford's release of HyenaDNA, a genomic sequence model, alongside surprising MMLU scores from CausalLM and resources on interpretability in AI, such as Representation Engineering and tokenization strategies, were the hot topics of discussion.
Latent Space Discord Summary
- Noam Shazeer on Coding Style: @swyxio highlighted Noam Shazeer's first blog post on coding style and shape suffixes, which may interest developers who are keen on naming conventions.
- AI in Customer Service: Enthusiasm was expressed around data indicating that LLMs can match human performance in customer service, potentially handling two-thirds of customer service queries, suggesting a pivot in how customer interactions are managed.
- Learning with Matryoshka Embeddings: Members discussed the innovative "Matryoshka Representation Learning" paper and its applications to LLM embeddings with adaptive dimensions, with potential benefits for compute and storage efficiency.
- MRL Embeddings Event: An upcoming event hosted by <@206404469263433728>, which the authors of the MRL embeddings paper will attend, was announced, providing an opportunity for deep discussions on representation learning in the #1107320650961518663 channel.
- Representation Engineering Session: @ivanleomk signaled an educational session on Representation Engineering 101 with <@796917146000424970>, indicating a chance to learn about and discuss engineering effective data representations in the #1107320650961518663 channel.
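The practical appeal of the Matryoshka embeddings discussed above is that the front slice of a full embedding remains a usable lower-dimensional embedding after re-normalization. A minimal sketch, with the 768 and 64 dimensions chosen arbitrarily for illustration:

```python
# Sketch of how Matryoshka-style embeddings are used at inference time:
# keep only the first `dim` coordinates of the full embedding, then
# re-normalize. With MRL-trained models this truncated vector is itself a
# valid embedding, trading a little accuracy for compute and storage.

import numpy as np

def truncate_embedding(v: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` coordinates and re-normalize to unit length."""
    sliced = v[:dim]
    return sliced / np.linalg.norm(sliced)

full = np.random.default_rng(0).normal(size=768)   # stand-in for a model output
small = truncate_embedding(full, 64)               # 12x smaller index footprint
print(small.shape, float(np.linalg.norm(small)))
```

The win for vector stores is that one indexed model can serve several accuracy/cost tiers, since each prefix length is a valid embedding space of its own.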
Perplexity AI Discord Summary
- Rabbit R1 Activation Assistance: User @mithrilman encountered a non-clickable email link issue when trying to activate the Rabbit R1 promo. @icelavaman suggested using the email link and reaching out to support.
- Podcast Identity Confirmation: Confusion arose around podcasts using the name "Perplexity AI," leading @icelavaman to clarify with the official podcast link, while @ok.alex speculated that the name might be used without authorization for attention or financial gain.
- Comparing AI Model Capabilities: Users explored the strengths and weaknesses of various AI models like Experimental, GPT-4 Turbo, Claude, and Mistral. Opinion was notably divided on Mistral's effectiveness for code queries.
- Brainstorming Perplexity AI Improvements: Suggestions for Perplexity AI included exporting thread responses, a feature currently missing but considered for future updates. Other issues included the absence of file upload options and confusion over product name changes.
- Model Performance Nostalgia and API Errors: Discussions touched on glitches in text generation and fond memories of pplx-70b being superior to the sonar models. @jeffworthington faced challenges with OpenAPI definitions, suggesting the current documentation might be outdated.
Links shared:
- Official Perplexity AI podcasts: "Discover Daily by Perplexity" and "Perplexity AI".
- Getting started with Perplexity's API: pplx-api documentation.
Eleuther Discord Summary
- Foundation Model Development Cheatsheet Unveiled: A new resource titled The Foundation Model Development Cheatsheet has been released to aid open model developers, featuring contributions from EleutherAI, MIT, AI2, and Hugging Face, among others, and focusing on often-overlooked yet crucial aspects such as dataset documentation and licensing. The cheatsheet can be accessed as a PDF paper or an interactive website, with additional information in their blog post and Twitter thread.
- Scaling Laws and Model Training Discussions Heat Up: Discourse ranged from inquiries about cross-attention SSM models, stable video diffusion training, and the nuances of lm-evaluation-harness, to the status of EleutherAI's Pythia model and an abstract on a 1-bit Large Language Model (LLM). Notable references include a blog post on Multiple Choice Normalization in LM Evaluation and the research paper on the Era of 1-bit LLMs.
- From Open-Sourced Models to Maze-Solving Diffusion Models: The research channel showcased discussions on a variety of AI topics, from open-sourced models and pretraining token-to-model-size ratios to diffusion models trained to solve mazes, prompt-engineering transfer studies, and the practical challenges of sub-8-bit quantization. Key resources shared include the Stable LM 2 1.6B Technical Report and a tweet on training diffusion models to solve mazes by François Fleuret.
- NeoX Query for Slurm Compatibility: User @muwnd sought recommendations on running NeoX with Slurm and its compatibility with containers. It was highlighted that NeoX's infrastructure does not make assumptions about the user's setup, and a Slurm script may be needed for multinode execution.
- Interpretability Techniques and Norms Explored: Conversations in the interpretability channel delved into matrix norms and products, RMSNorm layer applications, decoding using tuned lenses, and the proper understanding of matrix norm terminology. For example, the Frobenius norm is the Euclidean norm of the flattened matrix, while the "2-norm" is the spectral norm, i.e., the top singular value.
- Tweaks for LM Eval Harness and Multilingual Upgrades: Enhancements to the LM Eval harness for chat templates were shared, along with news that higher-quality translations for the Multilingual Lambada have been contributed by @946388490579484732 and will be included in the evaluation harness. These datasets are made available on Hugging Face.
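The norm terminology from the interpretability discussion, made concrete with numpy: the Frobenius norm equals the Euclidean norm of the flattened matrix, while the matrix 2-norm is the top singular value. They coincide only when the matrix has a single nonzero singular value.

```python
# Frobenius norm vs. spectral ("2-") norm on a simple diagonal matrix,
# whose singular values are just 3 and 4.

import numpy as np

m = np.array([[3.0, 0.0],
              [0.0, 4.0]])

frob = np.linalg.norm(m, "fro")   # sqrt(3^2 + 4^2) = 5.0
spectral = np.linalg.norm(m, 2)   # top singular value = 4.0

# Frobenius norm is exactly the Euclidean norm of the flattened matrix.
assert np.isclose(frob, np.linalg.norm(m.ravel()))
print(frob, spectral)
```

Equivalently, the Frobenius norm is the root-sum-of-squares of all singular values, while the spectral norm keeps only the largest, which is why the two are often confused in interpretability write-ups.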
LangChain AI Discord Summary
- Confidence in LangChain.js: @ritanshoo raised a question regarding confidence score checks when utilizing LangChain.js for RAG. While an immediate answer was not provided, users were referred to the LangChain documentation for in-depth guidance.
- Integration Queries for LangChain: Technical discussions highlighted the possibilities of adding memory to LCEL and effective language integration with LangChain in an Azure-hosted environment. Users were advised to consult official documentation or seek community assistance for specific integration issues.
- ToolException Workarounds Explored: @abinandan sought advice on retrying a tool after a ToolException occurs with a custom tool. The community pointed to LangChain GitHub discussions and issues for potential solutions.
- LangServe Execution Quirks: @thatdc reported missing intermediate step details when using LangServe, as opposed to direct invocation from the agent class. They identified a potential glitch in the RemoteRunnable requiring a workaround.
- Summoning Python Template Alchemists: @tigermusk sought assistance creating a Python template similar to the one available on the Smith LangChain Chat JSON Hub, sparking discussions on template generation.
- "LangChain in your Pocket" Celebrated: @mehulgupta7991 announced their book "LangChain in your Pocket," recently featured in Google's Best Books on LangChain, highlighting resources for LangChain enthusiasts.
- Beta Testing for AI Voice Chat App: Pablo, an AI voice chat app that integrates multiple LLMs and provides voice support without typing, called for beta testers. Engineers were invited to join the team behind this app, which leverages LangChain technology, with an offer of free AI credits.
- AI Stock Analysis Chatbot Creation Explained: A video tutorial was shared by @tarikkaoutar demonstrating the construction of an AI stock-analysis chatbot using LangGraph, function calling, and YahooFinance, catering to engineers interested in multi-agent systems.
- Groq's Hardware Reveal Generates Buzz: An introduction to Groq's breakthrough Language Processing Unit (LPU), suited to LLMs, captivated tech enthusiasts, conveyed through a YouTube showcase shared by @datasciencebasics.
(Note: The above summary integrates topics and resources from various channels within the Discord guild, focusing on points of interest most relevant to an engineer audience looking for technical documentation, coding integration, and advancement in AI hardware and applications.)
OpenAccess AI Collective (axolotl) Discord Summary
- Jupyter Configuration Chaos: Users reported issues with Jupyter notebooks, highlighting error messages concerning extension links and a "Bad config encountered during initialization," without a conclusive solution in the discussion.
- BitNet b1.58 Breakthroughs: An arXiv paper introduced BitNet b1.58, a 1-bit LLM that matches the performance of full-precision models, heralding significant cost-efficiency with an innovative architecture.
- Sophia Speeds Past Adam: The Sophia optimizer, claimed to be twice as fast as Adam, was shared alongside its implementation code, sparking interest in its efficiency as an optimization method for AI models.
- DropBP Drops Layers for Efficiency: A study presented Dropping Backward Propagation (DropBP), a method that can potentially reduce the computational cost of neural network training by skipping layers during backward propagation without significantly affecting accuracy.
- Scandinavian Showdown: Mistral vs. ChatGPT 3.5: A user, le_mess, reported that their 7B Mistral model rivaled ChatGPT 3.5 in performance on Danish language tasks, using an iterative synthetic-data approach with progressive training through 30 iterations and initial human curation.
LlamaIndex Discord Summary
- Groq's Integration Powers Up LlamaIndex: The Groq LPU now supports LlamaIndex, including llama2 and Mixtral models, aimed at improving Large Language Model (LLM) generation, with a comprehensive cookbook guide provided for streamlining application workflows.
- LlamaIndex Services Expand and Optimize: LlamaParse reported significant usage, leading to a usage-cap increase and updates towards uncapped self-serve usage, while a new strategy using LLMs for alpha-parameter adjustment in hybrid search has been shared in this insight. Plus, a RAG architecture combining structured and unstructured data by @ClickHouseDB has been highlighted, which can be read about here.
- Technical Insights and Clarifications Heat Up LlamaIndex Discussions: Indexing the latest LlamaIndex docs is under consideration, with Mendable mentioned as a tool for docs, while @cheesyfishes commented on an anticipated refactor of CallbackHandler in Golang. Combining FlagEmbeddingReranker with CohereReranker was identified as a tactic despite the absence of comparison metrics, and @cheesyfishes explained that while LlamaIndex serves data to LLMs, Langchain is a more encompassing library.
- Model Behaviors Questioned Within the AI Community: There's a discussion about model decay, with @.sysfor noting degrading outputs from their models and @cheesyfishes reinforcing that models do not decay, but that input issues can affect performance. The concern extends to fine-tuned models underperforming compared to baseline models.
OpenRouter (Alex Atallah) Discord Summary
- Claude Encounters a Conversational Hiccup: Claude models from Anthropic were reported to error on chats with more than 8 alternating messages. The problem was acknowledged by @louisgv with a promise of an upcoming fix.
- Turn-Taking Tweaks for OpenRouter: @alexatallah suggested a workaround for Claude's prompt errors involving changing the initial assistant message to a system message. Development is ongoing to better handle conversations initiated by the assistant.
- OpenRouter's Rate-Limit Relay: When asked about rate limits for article generation, @alexatallah clarified that individually assigned API keys for OpenRouter users would have separate limits, presumably allowing adequate collective throughput.
- Mistral's Suspected Caching Unearthed: Users noticed repeated prompt responses from Mistral models, suggesting caching might be at play. @alexatallah confirmed the possibility of query caching in Mistral's API.
- Prepaid Payment Puzzles for OpenRouter: @fakeleiikun raised a question about the acceptance of prepaid cards through OpenRouter, and @louisgv responded with possible issues tied to Stripe's fraud-prevention mechanisms, indicating mixed support.
CUDA MODE Discord Summary
- Benchmarking Bounties: @hdcharles_74684 improved a benchmark script for Triton kernels, which may outperform cuBLAS in specific scenarios such as batch sizes greater than 1, pertinent to applications like sdxl-fast. In light of potential Triton optimizations, focusing on technologies such as torch.compile could address bottlenecks when handling a batch size of 2.
- Triton Turmoil and Triumphs: Users encountered debugging issues with Triton versions 3.0.0 and 2.2.0; a workaround involved setting the TRITON_INTERPRET environment variable. Moreover, stability concerns were voiced regarding Triton's unpredictable segfaults compared to CUDA, prompting a request for comparative examples to understand the inconsistencies.
- FP8 Intrinsics Intact: In response to a query based on a tweet, @zippika clarified that FP8 intrinsics are still documented in the CUDA math API docs, noting that FP8 is primarily a data format and not universally applied to compute operations.
- Compiler Conundrums: In the realm of deep learning, skepticism was expressed about the usefulness of polyhedral compilation for optimizing sharding. This ties into the broader discussion about defining cost functions, the complexity of mapping DL programs to hardware, and whether top AI institutions are tackling these optimization challenges.
- Ring Attention Riddles: A comparison was proposed for validating the correctness and performance of Ring Attention implementations, as potential bugs were noted in the backward pass and GPU compatibility issues surfaced. User @iron_bound suggested there may be breakage in the implementation per commit-history analysis, stressing the need for careful code review and debugging.
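The TRITON_INTERPRET workaround mentioned in the Triton debugging discussion amounts to setting the environment variable before Triton is imported, which runs kernels in interpreter mode where ordinary print/pdb debugging works. A minimal sketch (triton itself is deliberately not imported here):

```python
# Sketch of the Triton debugging workaround: TRITON_INTERPRET must be set
# in the environment *before* `import triton`, so kernels run under the
# interpreter instead of being compiled, making them debuggable with
# ordinary Python tooling.

import os

os.environ["TRITON_INTERPRET"] = "1"  # must come before `import triton`
# import triton  # <- only after the variable is set
print(os.environ["TRITON_INTERPRET"])
```

Equivalently, the variable can be set in the shell (`TRITON_INTERPRET=1 python my_kernel.py`) so the Python source stays unchanged.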
Interconnects (Nathan Lambert) Discord Summary
- European Independence and Open-Weight Ambitions: Arthur Mensch emphasized the commitment to open-weight models, specifically mentioning 1.5k H100s, and highlighted a reselling deal with Microsoft. Le Chat and Mistral Large are attracting attention on La Plateforme and Azure, showing growth and a quick development approach. Here are the details.
- StarCoder2 Breaks New Ground: The Stack v2, featuring over 900B+ tokens, is the powerhouse behind StarCoder2, which flaunts a 16k-token context and was trained on more than 4T+ tokens. It represents a robust addition to the coding-AI community, with fully open code, data, and models. Explore StarCoder2.
- Meta's Upcoming Llama 3: A report from Reuters indicates that Meta is gearing up to launch Llama 3 in July, signaling a potential shake-up in the AI language model landscape. The Information provided additional details on this forthcoming release. Further information available here.
- DeepMind CEO's Insights Captivate Nathan: Nathan Lambert tuned into a podcast featuring Demis Hassabis of Google DeepMind, covering topics such as superhuman AI scaling, combining AlphaZero with LLMs, and the intricacies of AI governance. These insights are accessible on various platforms including YouTube and Spotify.
- Open AI and Personal Perspectives: The conversation between Nathan and Mike Lambert touched on the nature and importance of open AI and differing thought models compared to platforms like Twitter. Additionally, Mike Lambert, associated with Anthropic, expressed a preference for engaging in dialogues personally rather than as a company representative.
LLM Perf Enthusiasts AI Discord Summary
- A Buzz for Benchmarking Automation: Engineers @ampdot and @dare.ai are keen on exploring automated benchmark scripts, with the latter tagging another user for a possible update on such a tool.
- Springtime Hopes for Llama 3: @res6969 predicts a spring release for Llama 3, yet hints that the timeline could stretch, while @potrock is hopeful for last-minute updates, particularly intrigued by the potential integration of Gemini ring attention.
- The Testing Time Dilemma: @jeffreyw128 voices the challenge of time investment needed for comprehensive testing of new LLMs, aiming for an adequate "vibe check" on each model.
- ChatGPT Search Speculation Surfaces: Rumors of an impending OpenAI update to ChatGPT's web search features were mentioned by @jeffreyw128, with @res6969 seeking more reliable OpenAI intel and curious about resources for deploying code interpreter in production.
DiscoResearch Discord Summary
- DiscoLM Template Usage Critical: @bjoernp underscored the significance of utilizing the DiscoLM template for proper chat context tokenization, pointing to the chat templating documentation on Hugging Face as a crucial resource.
- Chunking Code Struggles with llamaindex: @sebastian.bodza encountered severe issues with the llamaindex chunker for code, which is outputting one-liners despite the chunk_lines setting, suggesting a bug or a need for tool adjustments.
- Pushing the Boundaries of German AI: @johannhartmann is working on a German RAG model using Deutsche Telekom's data, seeking advice on enhancing the German-speaking Mistral 7b model reliability, while @philipmay delved into generating negative samples for RAG datasets by instructing models to fabricate incorrect answers.
- German Language Models Battleground: A debate emerged over whether Goliath or DiscoLM-120b is more adept at German language tasks, with @philipmay and @johannhartmann weighing in; @philipmay posted the Goliath model card on Hugging Face for further inspection.
- Benchmarking German Prompts and Models: @crispstrobe revealed that EQ-Bench now includes German prompts, with the GPT-4-1106-preview model leading in performance, and provided a GitHub pull request link; they mentioned translation scripts being part of the benchmarks, effectively translated by ChatGPT-4-turbo.
Datasette - LLM (@SimonW) Discord Summary
- JSON Judo Techniques Remain Hazy: @dbreunig verbalized the common challenge of dealing with noisy JSON responses, though specifics on the cleanup techniques or functions were not disclosed.
- Silencing Claude's Small Talk: @justinpinkney recommended using initial characters like <rewrite> based on Anthropic's documentation to circumvent Claude's default lead-in phrases such as "Sure, here's a…".
- Brevity Battle with Claude: @derekpwillis experimented with several strategies for attaining shorter outputs from Claude, including forcing the AI to begin responses with {, but admitted that Claude still tends to include prefatory explanations.
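One common cleanup tactic for noisy JSON responses (a sketch of the general idea, not necessarily the approach @dbreunig used) is to slice out the outermost `{...}` span before parsing, which discards prefatory chatter like "Sure, here is the JSON…":

```python
import json

def extract_json(text: str):
    """Best-effort cleanup for noisy LLM output: grab the outermost
    {...} span and parse it, ignoring any surrounding chatter."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in response")
    return json.loads(text[start : end + 1])

noisy = 'Sure, here is the JSON you asked for:\n{"name": "Ada", "id": 1}\nHope that helps!'
print(extract_json(noisy))  # → {'name': 'Ada', 'id': 1}
```

This pairs naturally with the prefill trick mentioned for Claude: if the model is forced to start with {, the slice almost always begins at index 0.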
Skunkworks AI Discord Summary
An Unexpected Recruitment Approach: User .papahh directly messaged @1117586410774470818, indicating a job opportunity and showing enthusiasm for their potential involvement.
Alignment Lab AI Discord Summary
- Value Hunting Across Species: @taodoggy is inviting collaboration on a project to probe into the biological and evolutionary origins of shared values among species, refine value definitions, and explore their manifestation in various cultures. The project overview is accessible via a Google Docs link.
PART 2: Detailed by-Channel summaries and links
TheBloke ▷ #general (1070 messages🔥🔥🔥):
- Discord Detects Spammer: Users noticed messages flagged for likely spam in the chat, particularly from @kquant, who was reported for messaging over 100 people with the same message, triggering Discord's spam detection system.
- Exploring ChatGPT Performance: Users like @itsme9316 and @notreimu discussed their varying experiences with ChatGPT models. Some noted that GPT-4's API was unreliable for them compared to alternatives like Mixtral or Miqu models.
- Model Merging Conversations: Various users, including @itsme9316 and @al_lansley, discussed model merging and how it doesn't always result in smarter models. There was consensus that merging often depends on luck and the models' compatibility.
- Concerns Over Contaminated Training Data: Users such as @itsme9316 expressed concerns about modern LLMs potentially being contaminated with outputs from other models like OpenAI's, which could affect quality and reliability.
- Quantization and Model Performance: There was discussion led by @notreimu and @aiwaldoh about the performance differences between high-parameter models with low bit-per-weight (bpw) quantization and smaller models with higher bpw. Users shared varying experiences with different quantized models.
Links mentioned:
- Database Search: Search our database of leaked information. All information is in the public domain and has been compiled into one search engine.
- A look at Apple's new Transformer-powered predictive text model: I found some details about Apple's new predictive text model, coming soon in iOS 17 and macOS Sonoma.
- Microsoft-backed OpenAI valued at $80bn after company completes deal: Company to sell existing shares in "tender offer" led by venture firm Thrive Capital, in similar deal as early last year
- Sad GIF - Sad - Discover & Share GIFs: Click to view the GIF
- writing-clear.png Ā· ibm/labradorite-13b at main: no description found
- And death shall have no dominion: And death shall have no dominion. / Dead men naked they shall be one
- NousResearch/Nous-Hermes-2-Mistral-7B-DPO · Hugging Face: no description found
- Uncensored Models: I am publishing this because many people are asking me how I did it, so I will explain. https://huggingface.co/ehartford/WizardLM-30B-Uncensored https://huggingface.co/ehartford/WizardLM-13B-Uncensore…
- BioMistral/BioMistral-7B · Hugging Face: no description found
- NousResearch/Nous-Hermes-2-SOLAR-10.7B · Hugging Face: no description found
- adamo1139 (Adam): no description found
- p1atdev/dart-v1-sft · Hugging Face: no description found
- google/gemma-7b-it · Buggy GGUF Output: no description found
- Attack of the stobe hobo.: Full movie. Please enjoy. Rip Jim Stobe.
- Fred again..: Tiny Desk Concert: Teresa Xie | April 10, 2023. When Fred again.. first proposed a Tiny Desk concert, it wasn't immediately clear how he was going to make it work - not because h…
- My Fingerprint- Am I Unique ?: no description found
- GitHub - MooreThreads/Moore-AnimateAnyone: Contribute to MooreThreads/Moore-AnimateAnyone development by creating an account on GitHub.
- adamo1139/rawrr_v2 · Datasets at Hugging Face: no description found
TheBloke ▷ #characters-roleplay-stories (511 messages🔥🔥🔥):
- LLM Roleplay Discussion: Users discussed the effectiveness of using Large Language Models (LLMs) for role-playing characters, including techniques for crafting character identities, such as telling the LLM "you are a journalist" to improve performance. @nathaniel__ suggested successful strategies involve assigning roles and detailed personalities, and @maldevide shared a prompt structuring approach using #define syntax.
- Character Consistency: Several users, including @shanman6991 and @superking__, explored whether character consistency can be improved by giving LLMs detailed backstories and personality traits. There was particular interest in techniques to allow characters to lie or scheme convincingly within role-play scenarios.
- Prompt Engineering Tactics: @maldevide discussed the use of proper names and declarative statements in prompts to guide LLMs into desired patterns of conversation, while @superking__ provided examples of instruct vs. pure chat mode setups for better model guidance.
- Model Selection for Roleplay: Users like @superking__ indicated a preference for specific models such as miqu and mixtral for role-play purposes, often eschewing the use of system prompts. There was also mention of the potential for models to become less coherent with longer context lengths, and strategies to offset this were discussed.
- Naming Conventions in LLMs: @gryphepadar and @maldevide observed that certain names, like "Lyra" and "Lily", seem to be particularly common in responses when LLMs are prompted to generate character names, leading to some speculation about the training data's influence on these naming trends.
Links mentioned:
- Let Me In Eric Andre GIF - Let Me In Eric Andre Wanna Come In - Discover & Share GIFs: Click to view the GIF
- Sad Smoke GIF - Sad Smoke Pinkguy - Discover & Share GIFs: Click to view the GIF
- Why Have You Forsaken Me? GIF - Forsaken Why Have You Forsaken Me Sad - Discover & Share GIFs: Click to view the GIF
- maldv/conversation-cixot · Datasets at Hugging Face: no description found
- Hawk Eye Dont Give Me Hope GIF - Hawk Eye Dont Give Me Hope Clint Barton - Discover & Share GIFs: Click to view the GIF
- GitHub - UltiRTS/PrometheSys.vue: Contribute to UltiRTS/PrometheSys.vue development by creating an account on GitHub.
- GitHub - predibase/lorax: Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs: Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs - predibase/lorax
TheBloke ▷ #training-and-fine-tuning (86 messages🔥🔥):
- Perplexity AI as a New Tool: User @icecream102 suggested trying out Perplexity AI as a resource.
- Budget Training with QLoRA: @dirtytigerx advised that training large language models like GPT can be expensive and suggested using techniques like QLoRA to limit hardware requirements, though noting it would still take many hours of compute.
- Training and Inference Cost Estimates: In a discussion on estimating GPU hours for training and inference, @dirtytigerx recommended conducting a tiny test run and looking at published papers for benchmarks.
- Model Training Dynamics Discussed: @cogbuji questioned training a model with a static low validation loss, prompting @dirtytigerx to suggest altering the validation split and taking deduplication steps to address discrepancies.
- Model Generalization and Hallucination Concerns: @dirtytigerx and @cogbuji discussed training model generalization and the inevitable problem of hallucination during inference, suggesting the use of retrieval mechanisms and further evaluation strategies.
Links mentioned:
cogbuji/Mr-Grammatology-clinical-problems-Mistral-7B-0.5 · Hugging Face: no description found
TheBloke ▷ #model-merging (6 messages):
- Tensor Dimension Misalignment Issue: @falconsfly pointed out that an issue arose due to a single bit being misplaced or misaligned, resulting in incorrect tensor dimensions.
- Appreciation Expressed for Information: @222gate thanked @falconsfly for sharing the information about the tensor dimension problem.
- Query about Slerp or Linear Techniques: @222gate asked if the discussed merging techniques involved spherical linear interpolation (slerp) or just linear ties.
- Reflection on Diffusion Test Techniques: In response, @alphaatlas1 mentioned not being certain about @222gate's specific query but shared that their diffusion test used dare ties and speculated that a HuggingFace test may have involved dare task arithmetic.
- Recommendation for Concatenation in Merging: @alphaatlas1 suggested trying concatenation for anyone doing the peft merging, stating it works well and noting there's no full-weight merging analogue for it.
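For readers unfamiliar with the slerp option raised above, here is a minimal pure-Python sketch of spherical linear interpolation between two flattened weight vectors. Merging toolkits apply this per-tensor; this is an illustration of the math, not any particular tool's implementation:

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors:
    interpolates along the great-circle arc rather than the straight line."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))          # guard acos against rounding
    theta = math.acos(dot)
    if theta < eps:                          # nearly parallel: fall back to lerp
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# Endpoints are recovered at t=0 and t=1; t=0.5 stays on the unit arc:
print(slerp(0.0, [1.0, 0.0], [0.0, 1.0]))  # → [1.0, 0.0]
print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

Plain linear interpolation of the same orthogonal unit vectors would shrink their norm (to about 0.707 at t=0.5), which is one reason slerp is often preferred for merging.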
TheBloke ▷ #coding (8 messages🔥):
- Eager for Collaboration: @wolfsauge expresses enthusiasm to learn from @falconsfly, anticipating a discussion on fresh ideas for enhancement after dinner.
- No GPU, No Speed?: @dirtytigerx states that without a GPU, speeding up processes is challenging, offering no alternative solutions for performance improvement.
- APIs for Acceleration: @tom_lrd suggests using APIs as an alternative to speed up processes, listing multiple services like huggingface, together.ai, and mistral.ai.
- Looking Beyond Colab for Hosted Notebooks: Despite @dirtytigerx mentioning the lack of hosted notebooks on platforms provided by cloud providers, @falconsfly points out that Groq.com offers fast inference.
- Modular MAX Enters the Game: @dirtytigerx shares news about the general availability of the modular MAX platform, announcing the developer edition preview and its vision to democratize AI through a unified, optimized infrastructure.
Links mentioned:
Modular: Announcing MAX Developer Edition Preview: We are building a next-generation AI developer platform for the world. Check out our latest post: Announcing MAX Developer Edition Preview
Mistral ▷ #general (992 messages🔥🔥🔥):
- NVIDIA's Chat with RTX Demo Criticized: Users like @netrve expressed disappointment with NVIDIA's "Chat with RTX" demo, which was meant to showcase retrieval-augmented generation (RAG) capabilities. The demo, which limited context size to 1024 tokens, faced issues with retrieving correct information and delivering coherent answers. NVIDIA's use of LangChain in the reference architecture for RAG was also questioned.
- OpenAI and Meta Licensing Discussions: There was a heated discussion spearheaded by @i_am_dom and @netrve regarding Mistral AI's usage of Meta's LLaMa model, potential licensing issues, and implications of commercial use. The consensus suggested that an undisclosed agreement between Mistral and Meta was possible, given the seeming compliance with Meta's licensing terms.
- Conversations about Mistral AI's Open Weight Models: @mrdragonfox, @tarruda, and others discussed Mistral AI's commitment to open weight models and speculated about future releases following the Mistral-7B model. The community expressed trust and expectations towards Mistral for providing more open weight models.
- RAG Implementation Challenges Highlighted: Several users, including @mrdragonfox and @shanman6991, discussed the complexities of implementing RAG systems effectively. They mentioned the significant impact of the embedding model on RAG performance and the difficulty in achieving perfection with RAG, often taking months of refinement.
- Mistral AI and Microsoft Deal Scrutinized: An investment by Microsoft in Mistral AI raised discussions about the size of the investment and its implications for competition in the AI space. @ethux shared information hinting that the investment was minimal, while @i_am_dom raised concerns about Microsoft's cautious approach due to potential complexities with open-source models like Miqu.
Links mentioned:
- What Is Retrieval-Augmented Generation aka RAG?: Retrieval-augmented generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources.
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits: Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single param…
- LMSys Chatbot Arena Leaderboard - a Hugging Face Space by lmsys: no description found
- Klopp Retro GIF - Klopp Retro Dancing - Discover & Share GIFs: Click to view the GIF
- Basic RAG | Mistral AI Large Language Models: Retrieval-augmented generation (RAG) is an AI framework that synergizes the capabilities of LLMs and information retrieval systems. It's useful to answer questions or generate content leveraging …
- mlabonne/NeuralHermes-2.5-Mistral-7B · Hugging Face: no description found
- Legal terms and conditions: Terms and conditions for using Mistral products and services.
- Microsoft made a $16M investment in Mistral AI | TechCrunch: Microsoft is investing €15 million in Mistral AI, a Paris-based AI startup working on foundational models.
- Client code | Mistral AI Large Language Models: We provide client codes in both Python and Javascript.
- NVIDIA Chat With RTX: Customize and deploy your AI chatbot.
- Mistral Large vs GPT4 - Practical Benchmarking!: ⚡️ One-click Fine-tuning & Inference Templates: https://github.com/TrelisResearch/one-click-llms/ ⚡️ Trelis Function-calling Models (incl. OpenChat 3.5): http…
- Short Courses: Take your generative AI skills to the next level with short courses from DeepLearning.AI. Enroll today to learn directly from industry leaders, and practice generative AI concepts via hands-on exercis…
Mistral ▷ #models (12 messages🔥):
- More Meaningful Error Messages on Mistral: @lerela addressed an issue regarding system limitations, stating that a certain operation is not permitted with the large model, but users will now receive a more meaningful error message.
- Discussion on System/Assistant/User Sequence: @skisquaw remarked on having to change the sequence from system/assistant/user to user/assistant/user due to the model treating the first user input as a system one, despite a functionality need where assistant prompts follow system commands.
- Quantization Packs Mistral-7B Parameters: @chrismccormick_ inquired about the parameter count of Mistral-7B, originally tallying only around 3.5B. They later deduced that 4-bit quantization likely halves the tensor elements.
- Large Code Segments Questioned for Mistral: @frigjord contemplated whether querying long code segments, especially more than 16K tokens, might pose a problem for Mistral models.
- Complex SQL Queries with Mistral-7B: @sanipanwala asked about generating complex SQL queries with Mistral-7B, and @tom_lrd responded affirmatively, providing advice on formulating the queries and even giving an example for creating a sophisticated SQL query.
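The parameter-count puzzle above has a simple mechanical explanation: 4-bit quantization stores two weights per byte, so the stored tensors report half as many elements as there are logical parameters. A toy sketch of the packing (illustrative only, not the actual bitsandbytes/GPTQ storage layout):

```python
def pack_int4(values):
    """Pack pairs of 4-bit integers (0..15) into single bytes:
    the even-indexed value goes in the high nibble, the odd in the low."""
    assert len(values) % 2 == 0 and all(0 <= v <= 15 for v in values)
    return bytes((hi << 4) | lo for hi, lo in zip(values[::2], values[1::2]))

def unpack_int4(packed):
    """Recover the original 4-bit values from the packed bytes."""
    out = []
    for b in packed:
        out.extend(((b >> 4) & 0xF, b & 0xF))
    return out

weights = [3, 7, 15, 0, 1, 12]      # six logical 4-bit "parameters"
packed = pack_int4(weights)         # stored in only three bytes
assert len(packed) == len(weights) // 2
assert unpack_int4(packed) == weights
```

So a 7B-parameter model stored this way shows roughly 3.5B storage elements, exactly what @chrismccormick_ observed.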
Mistral ▷ #deployment (174 messages🔥🔥):
- Mistral Deployment Conundrum: @arthur8643 inquired about hardware requirements for running Mistral 8x7B locally, contemplating a system upgrade. Users @_._pandora_._ and @mrdragonfox advised that his current setup wouldn't suffice, recommending at least 100GB of VRAM for full precision deployment, and suggesting the use of services like together.ai for assistance.
- Debates on Optimal Server Specs: @latoile0221 sought advice on server specifications for token generation, considering a dual CPU setup and RTX 4090 GPU. The user received mixed responses regarding the importance of CPU versus GPU; @ethux stressed the GPU's significance for inference tasks while discussions circled around the necessity of substantial VRAM for full precision models.
- Quantization Qualms and GPU Capabilities: Various participants expressed that quantized models underperform, with @frigjord and @ethux noting that quantized versions aren't worthwhile for coding tasks. The consensus emerged that substantial VRAM (near 100GB) is needed to run non-quantized, full-precision models effectively.
- Self-Hosting, Model Types, and AI Limitations: Dialogue ensued about the practicalities of self-hosting AI models like Mixtral, with mentions of utilizing quant versions and alternatives like GGUF format. Users including @ethux and @sublimatorniq shared experiences, with a focus on the limitations of quantized models and better performance of full models on high-spec hardware.
- On the Topic of Specialized AI Models: The discussion touched on the potential advantages and challenges of training a specialized JS-only AI model. @frigjord and @mrdragonfox debated the effectiveness and handling of such focused models, with general agreement on the extensive work required to clean and prep datasets for any specialized AI training.
Links mentioned:
- Jurassic Park GIF - Jurassic Park World - Discover & Share GIFs: Click to view the GIF
- starling-lm: Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.
- Tags Ā· mixtral: A high-quality Mixture of Experts (MoE) model with open weights by Mistral AI.
Mistral ▷ #ref-implem (76 messages🔥🔥):
- Typo Alert in Notebook: @foxalabs_32486 identified a typo in the prompting_capabilities.ipynb notebook, where an extra "or" was present. The correct text should read "Few-shot learning or in-context learning is when we give a few examples in the prompt…"
- Fix Confirmation: In response to @foxalabs_32486's notice, @sophiamyang acknowledged the error and confirmed the fix.
- Typos Add Human Touch: @foxalabs_32486 mused about using occasional typos to make AI-generated content appear more human, sparking a discussion on the ethics of making AI seem human with @mrdragonfox.
- Ethics over Earnings: @mrdragonfox declined projects aimed at humanizing AI beyond ethical comfort, underscoring a preference to choose integrity over financial gain.
- AI Industry Hiring Challenges: @foxalabs_32486 discussed the difficulties in hiring within the AI industry due to a shortage of skilled professionals and the rapid expansion of knowledge required.
Mistral ▷ #finetuning (15 messages🔥):
- Limiting Model Answers to Specific Documents: @aaronbarreiro inquired about constraining a chatbot to only provide information from a specific document, such as one about wines, and not respond about unrelated topics like pizza.
- The Challenge of Controlling LLMs: @mrdragonfox explained that LLMs will likely hallucinate answers, because they are designed fundamentally as next token predictors, thus a robust system prompt is vital to direct responses.
- Language Models as Stateless Entities: @mrdragonfox highlighted the stateless nature of language models, meaning they don't retain memory like a human would, and if pushed beyond their token limit (specifically the 32k context) they will forget earlier information.
- Strategies to Maintain Context Beyond Limits: @mrdragonfox discussed strategies to circumvent the context limitation, such as using function calling or retrieval-augmented generation (RAG), but acknowledged these methods are more complex and don't work directly out-of-the-box.
- Fine-Tuning Time Depends on Dataset Size: When @atip asked about the time required to fine-tune a 7B parameter model on H100 hardware, @mrdragonfox stated it varies based on dataset size, implying the duration can't be estimated without that information.
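The "robust system prompt plus retrieval" approach discussed above can be sketched roughly like this. The prompt wording, function names, and message shape are illustrative assumptions, not a Mistral API:

```python
# Restrict a chatbot to a single document: only pass the model chunks
# retrieved from that document, and instruct it to refuse everything else.
SYSTEM = (
    "Answer ONLY using the provided context about our wines. "
    "If the question is not covered by the context, reply exactly: "
    "'I can only answer questions about our wines.'"
)

def build_messages(question, retrieved_chunks):
    """Assemble a chat payload: system guardrail + retrieved context + question."""
    context = "\n".join(retrieved_chunks)
    return [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]

msgs = build_messages(
    "Which wine pairs with fish?",
    ["Our Riesling pairs well with fish and light dishes."],
)
```

As noted in the discussion, this reduces but does not eliminate hallucination: the system prompt steers the next-token predictor, and the retrieval step keeps off-topic material (pizza) out of the context entirely.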
Mistral ▷ #showcase (7 messages):
- Teaching Economics with AI: @patagonia50 shared about creating an app for an intermediate microeconomics course that provides instant personalized feedback by making API calls to gpt-4-vision-preview and Mistral models. The app, which adapts to different questions and rubrics via a JSON file, has been deployed on Heroku and is still being refined, with future plans to expand its capabilities with Mistral AI models.
- Interest Expressed in Educational App: @akshay_1 showed interest in @patagonia50's educational app, asking if there was a GitHub repository available for it.
- Open Source Plans: In response to @akshay_1, @patagonia50 indicated that there isn't a GitHub repository yet but plans to create one for the educational app.
- Request for a Closer Look: @akshay_1 expressed a desire for a sneak peek at @patagonia50's educational app, demonstrating enthusiasm for the project.
Links mentioned:
- cogbuji/Mr-Grammatology-clinical-problems-Mistral-7B-0.5 · Hugging Face: no description found
- Use Mistral AI Large Model Like This: Beginner Friendly: We learn the features of High Performing Mistral Large and do live coding on Chat Completions with Streaming and JSON Mode. The landscape of artificial intel…
Mistral ▷ #random (2 messages):
- Seeking the Google Million Context AI: User @j673912 inquired about how to access the elusive Google 1M Context AI.
- Insider Connection Required: @dawn.dusk recommended having direct contact with someone from Deepmind to gain access.
Mistral ▷ #la-plateforme (41 messages🔥):
- Mistral Function Calls Require Adjustments: @michaelhunger discussed challenges with the Mistral function calling mechanism, noting the need for patches and system messages. Specifically, Mistral's behavior contrasts with expectations, often preferring additional tool calls over answering the user's query directly.
- Clarifying tool_choice Behavior: @liebke expressed confusion over the behavior of tool_choice="auto" in the context of Mistral's function calling, as the setting does not seem to trigger tool calls as anticipated. @sophiamyang suggested that "auto" should work as intended, prompting a request for Liebke's implementation details for further troubleshooting.
- Inconsistencies in Mistral Function Calling: @alexclubs provided feedback on integrating Mistral Function Calling into Profound Logic, noticing differences from OpenAI's tool behavior and a lack of consistency in when functions are triggered.
- Reproducibility of Outputs on Mistral's Platform Uncertain: @alexli3146 inquired about seedable outputs for reproducibility, while @foxalabs_32486 and @sublimatorniq discussed potential issues and existing settings in the API that may affect it.
- Mistral Message Roles Must Follow Specific Order: After discussing error messages encountered with "mistral-large-latest", @not__cool discovered that wrapping a user message with two system messages is not supported, as confirmed by @lerela. However, @skisquaw successfully used the user/assistant format with the system role message in the first user role statement.
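The ordering constraint @not__cool and @skisquaw ran into can be summarized as: at most one system message, only in first position, followed by alternating user/assistant turns starting with user. A small validator sketch of that rule (inferred from the reported errors, not Mistral's documented contract):

```python
def valid_roles(messages):
    """Check the inferred role-ordering rule: optional leading system
    message, then strictly alternating user/assistant, user first."""
    roles = [m["role"] for m in messages]
    body = roles[1:] if roles and roles[0] == "system" else roles
    if "system" in body:          # system allowed only in first position
        return False
    expected = "user"
    for r in body:                # user/assistant must alternate
        if r != expected:
            return False
        expected = "assistant" if expected == "user" else "user"
    return True

# A user message wrapped in two system messages is rejected:
assert not valid_roles([
    {"role": "system", "content": "a"},
    {"role": "user", "content": "hi"},
    {"role": "system", "content": "b"},
])
```

Running such a check client-side turns the opaque API error into an immediate, local failure.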
Links mentioned:
- Technology: Frontier AI in your hands
- AI Assistants are the Future | Profound Logic: With Profound AI, you can enhance your legacy applications with natural language AI assistants in just 3 steps.
- GitHub - liebke/mechanician: Daring Mechanician is a Python library for building tools that use AI by building tools that AIs use.: Daring Mechanician is a Python library for building tools that use AI by building tools that AIs use. - liebke/mechanician
- mechanician/packages/mechanician_mistral/src/mechanician_mistral/mistral_ai_connector.py at main Ā· liebke/mechanician: Daring Mechanician is a Python library for building tools that use AI by building tools that AIs use. - liebke/mechanician
- mechanician/examples/notepad/src/notepad/main.py at main Ā· liebke/mechanician: Daring Mechanician is a Python library for building tools that use AI by building tools that AIs use. - liebke/mechanician
Mistral ▷ #office-hour (1 message):
- Mark Your Calendars for Evaluation Talk: @sophiamyang invites everyone to the next office hour on Mar. 5 at 5pm CET with a focus on evaluation and benchmarking. They express interest in learning about different evaluation strategies and benchmarks used by participants.
Mistral ▷ #le-chat (423 messages🔥🔥🔥):
- Le Chat Model Limit Discussions: User @alexeyzaytsev inquired about the limits for Le Chat on a free account. Although currently undefined, @ethux and @_._pandora_._ speculated that future restrictions might mimic OpenAI's model, with advanced features potentially becoming paid services.
- Mistral on Groq Hardware: @foxalabs_32486 asked about plans to run Large on Groq hardware, while @ethux noted Groq's memory limitations. @foxalabs_32486 provided a product brief from Groq, highlighting potential misconceptions about their hardware's capabilities.
- Mistral's Market Position and Microsoft Influence: In an extensive discussion, users @foxalabs_32486 and @mrdragonfox shared their perceptions of Mistral's market positioning and the influence of Microsoft's investment. They touched on topics like strategic hedging, the potential impact on OpenAI, and the speed of Mistral's achievements.
- Feedback for Le Chat Improvement: Several users, including @sophiamyang, engaged in discussing ways to improve Le Chat. Suggestions included a "thumbs down" button for inaccurate responses (@jmlb3290), ease of switching between models during conversations (@sublimatorniq), features to manage token counts and conversation context (@_._pandora_._), preserving messages on error (@tom_lrd), and support for image inputs (@foxalabs_32486).
- Debating Efficiency of Low-Bitwidth Transformers: Users, especially @foxalabs_32486 and @mrdragonfox, debated the implications of a low-bitwidth transformer research paper, discussing potential boosts in efficiency and the viability of quickly implementing these findings. They mentioned the work involved in adapting existing models and the speculative nature of immediate hardware advancements.
Links mentioned:
- Technology: Frontier AI in your hands
- Why 2024 Will Be Not Like 2024: In the ever-evolving landscape of technology and education, a revolutionary force is poised to reshape the way we learn, think, and…
- Unsloth update: Mistral support + more: We're excited to release QLoRA support for Mistral 7B, CodeLlama 34B, and all other models based on the Llama architecture! We added sliding window attention, preliminary Windows and DPO support, and …
- GitHub - unslothai/unsloth: 5X faster 60% less memory QLoRA finetuning: 5X faster 60% less memory QLoRA finetuning. Contribute to unslothai/unsloth development by creating an account on GitHub.
Mistral ▷ #failed-prompts (6 messages):
- Instructions for Reporting Failed Prompts: @sophiamyang provided a template requesting details for reporting failed prompts, specifying information like model, prompt, model output, and expected output.
- Witty Math Mistake Report: @blueaquilae humorously flagged an issue regarding mathematics with the Mistral Large model with their comment, "math, halfway there (pun intended) on large chat".
- Tongue-in-Cheek Query Confirmation: In a playful exchange, @notan_ai queries whether a specific example counts as a failed prompt, to which @blueaquilae responds, "Synthetic data all the way?"
- General Failures on le chat: @blacksummer99 reports that all versions of Mistral, including Mistral next, fail on a prompt given on le chat, without providing specifics.
- Incomplete Issue Indication: @aiwaldoh mentions "Fondée en 2016?!" ("Founded in 2016?!"), possibly pointing out an issue or confusion with the Mistral model's output, but no further details are provided.
Mistral ▷ #prompts-gallery (5 messages):
- Invitation to Share Prompt Mastery: User @sophiamyang welcomed everyone to share their most effective prompts, emphasizing prompt crafting as an art form and looking forward to seeing users' creations.
- Confusion About Channel Purpose: After user @akshay_1 simply mentioned "DSPy", @notan_ai responded with curiosity about "SudoLang" but expressed confusion regarding the purpose of the channel.
- Possible Model Mention with Ambiguity: The model name "Mistral next le chat" was mentioned twice by @blacksummer99, however, no further context or details were provided.
OpenAI ▷ #ai-discussions (58 messages 🔥🔥):
- Loader Choices for AI Models: @drinkoblog.weebly.com pointed out that LM Studio requires manual GUI interaction to start the API, which is impractical for websites. They recommend alternative loaders such as oobabooga or Jan.ai for automation on boot.
- Automod Censorship on AI Discussions: @chonkyman777 reported their message was removed for showcasing problematic behavior by Copilot AI, and @eskcanta suggested reaching out to Discord mods via Modmail and reporting AI issues directly to OpenAI through their feedback form. Users debated the nuances of moderation and the scope of the rules in place.
- Concerns Over Mistral and Uncensored Content: @dezuzel shared a YouTube video discussing Mistral, an AI model considered powerful and uncensored. @tariqali raised questions about the implications of European AI regulation on Mistral, despite its promoted lack of censorship. @chief_executive compared Mistral Large to GPT-4 and found the latter superior for coding tasks.
- Fine-Tuning GPT-3.5 for Chatbot Use Case: @david_zoe sought advice on fine-tuning GPT-3.5-Turbo to perform better than the baseline and maintain conversation flow, but faced challenges matching the performance of GPT-4. @elektronisade recommended examining common use cases and consulting ChatGPT with actual data for further guidance on fine-tuning.
- Exploring Certifications for AI Specialization: @navs02, a young developer, inquired about certifications for specializing in AI. @dezuzel and @.dooz advised focusing on real-world projects over certifications and mentioned learning resources including courses by Andrew Ng and Andrej Karpathy on YouTube.
Links mentioned:
- Chat model feedback: no description found
- This new AI is powerful and uncensored… Let's run it: Learn how to run Mistral's 8x7B model and its uncensored varieties using open-source tools. Let's find out if Mixtral is a good alternative to GPT-4, and lea…
OpenAI ▷ #gpt-4-discussions (21 messages 🔥):
- Confusion Over API and File Uploads: @ray_themad_nomad expressed frustration with the chatbot's inconsistent responses after uploading files and creating custom APIs, noting that methods that worked months ago seem to fail now.
- Clarifying Document Size Limitations: @darthgustav. pointed out that the chatbot can only read documents within context size and will summarize larger files, which spurred a debate with @fawesum, who suggested that knowledge files can be accessed efficiently even if they are huge.
- Seed Parameters Causing Inconsistent Outputs: @alexli3146 asked if anyone had success getting reproducible output using the seed parameter, but shared that they haven't.
- Security Measures with Web Browsing and Code Interpreter: @darthgustav. explained that using Python to search knowledge files with the Code Interpreter can disable web browsing in the instance, which is a security decision.
- Proper Channel for Sharing The Memory Game: @takk8is shared a link to "The Memory" but was redirected by @solbus to share it in the dedicated channel to avoid it getting lost in the chat.
OpenAI ▷ #prompt-engineering (391 messages 🔥🔥):
- Prompt Engineering with MetaPrompting: @madame_architect shared their work on annotating "MetaPrompting" research, bringing their compiled list of prompt architecture papers to 42 total. The article details a method integrating meta-learning with prompts, aimed at improving initializations for soft prompts in NLP models. MetaPrompting Discussion
- LaTeX and KaTeX in ChatGPT: Several users, including @yami1010 and @eskcanta, discussed the capabilities of ChatGPT in handling LaTeX and KaTeX for creating visual data representations, with a focus on math and flowchart diagrams.
- Curly Brackets Saga in DALL-E 3: Users such as @darthgustav. and @beanz_and_rice encountered an issue where DALL-E 3 payloads were not accepting standard curly brackets in JSON strings. They found a workaround by using escape-coded curly brackets, which appeared to bypass the parser error.
- Enhancing ChatGPT Creativity for Artistic Prompts: When asked about improving creativity in artistic prompts, @bambooshoots and @darthgustav. suggested a multi-step iterative process and the use of semantically open variables to encourage less deterministic and more imaginative outputs from the AI.
- Challenges with Custom ChatGPT File Reading: @codenamecookie and @darthgustav. discussed issues with Custom ChatGPT's inconsistent ability to read ".py" files from its knowledge. They explored potential solutions such as converting files to plain text and avoiding unnecessary zipping for better AI parsing and responsiveness.
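The escape-coded workaround described in the Curly Brackets Saga can be sketched in a few lines. The function name and payload below are hypothetical illustrations, not the exact code used in the channel:

```python
# Hypothetical sketch of the workaround discussed above: replace literal
# braces in a JSON payload string with their Unicode escape sequences so
# a downstream parser no longer trips on them.
def escape_braces(payload: str) -> str:
    return payload.replace("{", "\\u007B").replace("}", "\\u007D")

escaped = escape_braces('{"style": "watercolor"}')
print(escaped)  # \u007B"style": "watercolor"\u007D
```

The receiving side can recover the original braces because `\u007B` and `\u007D` are the standard JSON Unicode escapes for `{` and `}`.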
Links mentioned:
Disrupting malicious uses of AI by state-affiliated threat actors: We terminated accounts associated with state-affiliated threat actors. Our findings show our models offer only limited, incremental capabilities for malicious cybersecurity tasks.
OpenAI ▷ #api-discussions (391 messages 🔥🔥):
- Prompt Engineering Secrets: @yami1010 and @eskcanta shared insights on using Markdown, LaTeX, and KaTeX in prompts with ChatGPT for creating diagrams and flowcharts. They discussed the effectiveness of different diagram-as-code tools, with mentions of Mermaid and matplotlib, and the peculiarities of dealing with curly brackets in the DALL-E 3 parser.
- MetaPrompting Annotated: @madame_architect added MetaPrompting to their list of 42 annotated prompt architecture papers. The list, which can be found on the AI-Empower GitHub, is maintained to keep high-quality standards and is useful for researching prompt engineering.
- The Curly Brackets Saga: A long discussion revolved around the DALL-E 3 payload's formatting issues with curly brackets ({, }) in JSON strings, with multiple users like @darthgustav. and @yami1010 noting failures during image generation. A solution involving Unicode escape codes was found, bypassing the parser error.
- Custom ChatGPT File Reading: In a conversation about Custom ChatGPT, @codenamecookie expressed confusion about the model's inconsistent ability to read Python files from its "knowledge". @darthgustav. recommended not zipping the files and converting them to plain text while maintaining Python interpretation, which might help the AI process the files better.
- Boosting AI Creativity: For enhancing AI-created artistic prompts, users like @bambooshoots and @darthgustav. suggested using a multi-step process to develop the scene and elicit more creative responses from GPT-3.5 and GPT-4. The inclusion of semantically open variables and iterative prompting helps provoke less deterministic and more unique outputs.
Links mentioned:
Disrupting malicious uses of AI by state-affiliated threat actors: We terminated accounts associated with state-affiliated threat actors. Our findings show our models offer only limited, incremental capabilities for malicious cybersecurity tasks.
LM Studio ▷ #general (484 messages 🔥🔥🔥):
- Exploring Model Options: Users are discussing various LLMs and their compatibility with specific GPUs, with a focus on coding-assistance models such as Deepseek Coder 6.7B and StarCoder2-15B. For example, @solusan. is looking for the best model to fit an Nvidia RTX 40-series card with 12 GB, currently considering Dolphin 2.6 Mistral 7B.
- LM Studio GPU Compatibility Issues: Several users like @jans_85817 and @kerberos5703 are facing issues running LM Studio with certain GPUs. Discussions revolve around LM Studio's compatibility mainly with newer GPUs; older GPUs are presenting problems for which users are seeking solutions or alternatives.
- Hugging Face Outage Impact: A common issue reported by multiple members like @barnley and @heyitsyorkie is a network error when downloading models, due to a Hugging Face outage affecting LM Studio's ability to search for models.
- Image Recognition and Generation Queries: Questions regarding image-related tasks surfaced, and @heyitsyorkie clarified that while LM Studio cannot perform image generation tasks, it is possible to work with image recognition through Llava models.
- Hardware Discussions and Anticipations: Users like @pierrunoyt and @nink1 are discussing future hardware expectations for AI and LLMs, noting that current high-end AI-specific hardware may become more accessible with time.
Links mentioned:
- GroqChat: no description found
- no title found: no description found
- LM Studio - Discover and run local LLMs: Find, download, and experiment with local LLMs
- Stop Shouting Arnold Schwarzenegger GIF - Stop Shouting Arnold Schwarzenegger Jack Slater - Discover & Share GIFs: Click to view the GIF
- BLOOM: Our 176B parameter language model is here.
- Continue: no description found
- no title found: no description found
- GeForce GTX 650 Ti | Specifications | GeForce: no description found
- MaziyarPanahi/dolphin-2.6-mistral-7b-Mistral-7B-Instruct-v0.2-slerp-GGUF · Hugging Face: no description found
- Specifications | GeForce: no description found
- 02 ā Default and Notebook Tabs: A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models. - oobabooga/text-generation-webui
- Add support for StarCoder2 by pacman100 · Pull Request #5795 · ggerganov/llama.cpp: What does this PR do? Adds support for StarCoder 2 models that were released recently.
- bigcode/starcoder2-15b · Hugging Face: no description found
- Reddit - Dive into anything: no description found
- Anima/air_llm at main · lyogavin/Anima: 33B Chinese LLM, DPO QLORA, 100K context, AirLLM 70B inference with single 4GB GPU - lyogavin/Anima
- GitHub - MDK8888/GPTFast: Accelerate your Hugging Face Transformers 6-7x. Native to Hugging Face and PyTorch.: Accelerate your Hugging Face Transformers 6-7x. Native to Hugging Face and PyTorch. - MDK8888/GPTFast
- itsdotscience/Magicoder-S-DS-6.7B-GGUF at main: no description found
LM Studio ▷ #models-discussion-chat (61 messages 🔥🔥):
- Seeking PDF Chatbot Guidance: @solenya7755 is looking to implement an accurate PDF chatbot with LM Studio and a Llama 2 70B Q4 LLM, but experiences inaccuracies with hallucinated commands. @nink1 suggests extensive prompt work and joining the AnythingLLM Discord for further assistance.
- StarCoder2 and The Stack v2 Launch: @snoopbill_91704 shares news about the launch of StarCoder2 and The Stack v2 by ServiceNow, Hugging Face, and NVIDIA, noting a partnership with Software Heritage aligned with responsible AI principles.
- Qualcomm Releases 80 Open-Source Models: @misangenius brings attention to Qualcomm's release of 80 open-source AI models for vision, audio, and speech applications, available on Hugging Face.
- Querying Models That Prompt You with Questions: @ozimandis inquires about local LLMs that ask questions and has mixed results with different models, while @nink1 shares success in getting models like Dolphin Mistral 7B Q5 to ask provocative questions.
- Best Setup for Business Document Analysis and Writing: @redcloud9999 seeks advice on the best LLM setup for analyzing and writing business documents with a high-spec machine. @heyitsyorkie advises searching for GGUF quants by "TheBloke" on Hugging Face, and @coachdennis. suggests testing trending models.
Links mentioned:
- qualcomm (Qualcomm): no description found
- bigcode/starcoder2-15b · Hugging Face: no description found
- bigcode/the-stack-v2-train-full-ids · Datasets at Hugging Face: no description found
- Pioneering the Future of Code Preservation and AI with StarCoder2: Software Heritage's mission is to collect, preserve, and make the entire body of software source code easily available, especially emphasizing Free and Open Source Software (FOSS) as a digital c…
LM Studio ▷ #hardware-discussion (42 messages 🔥):
- Optimization Tips for Windows 11: @.bambalejo advised users to disable certain features like Microsoft's Core Isolation and the Virtual Machine Platform on Windows 11 for better performance, and to ensure VirtualizationBasedSecurityStatus is set to 0.
- TinyBox Announcement: @senecalouck shared a link with details on the TinyBox from TinyCorp, a new hardware offering found at [tinygrad.org](https://tinygrad.org).
- E-commerce GPU Frustrations and Specs: @goldensun3ds recounted a negative experience purchasing a falsely advertised GPU on eBay, opting for Amazon for their next purchase, and listed their robust PC specs including dual RTX 4060 Ti 16GB cards.
- Old Hardware Nostalgia: A string of messages from users like @jans_85817, @nullt3r, @heyitsyorkie, and @666siegfried666 reminisced about older GPUs; the conversation included insights like the GTX 650 being unfit for modern models, and personal stories of past rigs and upgrades.
- Discussion on Nvidia NVLink / SLI: Users @dub_ex and @nullt3r discussed the effectiveness of Nvidia NVLink / SLI, concluding it is beneficial for model training but not necessarily for inference.
LM Studio ▷ #beta-releases-chat (7 messages):
- Inquiring About Image Insertion in LM Studio: @heoheo5839 was unsure how to add an image into LM Studio as the "Assets" bar wasn't visible. @heyitsyorkie explained that to add an image, one must use a model like PsiPi/liuhaotian_llava-v1.5-13b-GGUF, ensuring both the vision adapter (mmproj) and the GGUF of the model are downloaded, after which the image can be inserted in the input box for the model to describe.
- Questions About Llava Model Downloads: @hypocritipus queried about the possibility of downloading Llava-supported models directly within LM Studio, alluding to easier accessibility and functionality.
- Clarifying Llava Model Functionality in LM Studio: @wolfspyre questioned whether downloading Llava models is a current functionality, suggesting that it might already be supported within LM Studio.
- Confirming Vision Adapter Model Use: In response to @wolfspyre, @hypocritipus clarified they hadn't tried the functionality themselves and were seeking confirmation on whether it was feasible to download both the vision adapter and the primary model simultaneously within LM Studio.
- Exploring One-Click Downloads for Vision-Enabled Models: @hypocritipus shared an excerpt from the release notes indicating that users need to download a vision adapter and a primary model separately, and expressed curiosity about whether there is a one-click solution within LM Studio to download both necessary files with a single action.
Links mentioned:
- Vision Models (GGUF) - a lmstudio-ai Collection: no description found
- Tweet from LM Studio (@LMStudioAI): Counting penguins can be challenging 🐧🐧 New in LM Studio 0.2.9: Local & Offline Vision Models! In this demo: the small and impressive Obsidian Vision 3B by @NousResearch.
LM Studio ▷ #autogen (7 messages):
- Gemini vs. ChatGPT in Translation Tasks: @hypocritipus shared their experience using Gemini and ChatGPT for translating psychological evaluation reports from Turkish to English, noting that Gemini generally provided better translations.
- Struggle with Gemini's Overzealous Formatting: @hypocritipus expressed frustration with Gemini's tendency to add unnecessary bullet points and its habit of hallucinating content beyond the requested translation.
- ChatGPT to the Rescue, Sort Of: For the final report, @hypocritipus had to switch to ChatGPT due to Gemini not delivering as expected, though they mentioned that ChatGPT's translation was inferior.
- Accidental Message in Autogen: @hypocritipus humorously noted they posted their experience in the Autogen channel by mistake, highlighted by a "LMFAO wrong place for me to post this…" comment.
- Confusion Cleared Up: @johnnyslanteyes asked for clarification on what @hypocritipus meant by "translation" of the reports, which led to the explanation that it was a language translation from Turkish to English, not a conversion of medical jargon.
LM Studio ▷ #langchain (3 messages):
- Dimensionality Details Disclosed: User @npcomp_22591 mentioned having positive outcomes using 768-dimensional vectors.
- Vectors 101: In response to an inquiry from @bigsuh.eth on how to check vector dimensions, @npcomp_22591 briefly explained the process: you can check the dimensionality of a vector by examining its length, providing an example output followed by .length.
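The length check described above is a one-liner. A minimal sketch with a placeholder vector (a real one would come from an embedding model):

```python
# Stand-in for an embedding vector; only its length matters for this check.
embedding = [0.0] * 768  # placeholder 768-dimensional vector

# In Python the dimensionality is simply the list length
# (the ".length" mentioned above is the JavaScript equivalent).
print(len(embedding))  # 768
```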
LM Studio ▷ #memgpt (1 message):
jans_85817: I am waiting for that LM Studio version for Linux.
HuggingFace ▷ #announcements (1 message):
- Cosmopedia Unleashed: @lunarflu announced the release of Cosmopedia, touting it as the largest open synthetic dataset of textbooks, blog posts, and stories created by Mixtral, with over 25B tokens and 30M files. Available resources are linked through the LinkedIn post.
- huggingface_hub Library Updates: The new huggingface_hub library version 0.21.0 release was highlighted, featuring dataclasses, PyTorchHubMixin support, and audio-to-audio inference among other updates. Developers can view the full release notes at the huggingface space.
- New Methods and Models on the Horizon: The posts shared exciting developments, including training a DoRA using a diffusers script, pushing Figma frames to a dataset, and the debut of YOLOv9 on the Hub with compatibility confirmed for Transformers.js. Additional updates covered sentence-transformers v2.4.0, the LGM Mini project, and the possibility of running AWQ models on AMD GPUs.
- Innovations in Product: Google's open LLM Gemma 7B is now available on HuggingChat, transformers released a new task guide for mask generation, and a new image-feature-extraction tag was introduced, highlighting a model like google/vit-base-patch16-224-in21k.
- Community Collaboration and Contributions: Community efforts led to the release of datasets such as #data-is-better-together's 10k_prompts_ranked and OpenHermesPreferences. Furthermore, TTS Arena was launched for testing and rating text-to-speech models, and a Fine-Tuning Gemma Models guide was made available on Hugging Face's blog.
Links mentioned:
- @Wauplin on Hugging Face: "Just released version 0.21.0 of the huggingface_hub Python library!…": no description found
- Tweet from Victor M (@victormustar): This @figma plugin lets you push your Figma frames directly into a @huggingface dataset!
- Tweet from merve (@mervenoyann): YOLOv9 arrived on @huggingface Hub! The model checkpoints: https://huggingface.co/merve/yolov9 Try the demo (@kadirnar_ai): https://huggingface.co/spaces/kadirnar/Yolov9 Find demo for YOLOv9 por…
- Tweet from Xenova (@xenovacom): YOLOv9 just released, and now it's compatible with 🤗 Transformers.js! That's right… near real-time object detection running locally in your browser: no server required! Try it out yours…
- Tweet from Omar Sanseviero (@osanseviero): Matryoshka Embeddings are here! 🔥 The Sentence Transformers library allows training and running embedding models with embedding sizes that can be shrunk while keeping high quality! Learn about them…
- Tweet from dylan (@dylan_ebert_): LGM Mini: Image to Interactive 3D in 5 seconds https://huggingface.co/spaces/dylanebert/LGM-mini
- Tweet from Julien Chaumond (@julien_c): BREAKING: Quoting Victor M (@victormustar): Google's new open LLM Gemma 7B is now available on HuggingChat.
- Tweet from merve (@mervenoyann): 🤗 transformers has a new task guide for mask generation (also known as zero-shot image segmentation): learn how to use the powerful segment-anything models in this guide https://huggingface.co/docs/t…
- Models - Hugging Face: no description found
- DIBT/10k_prompts_ranked · Datasets at Hugging Face: no description found
- @davanstrien on Hugging Face: "The open-source AI community can build impactful datasets collectively!…": no description found
- Tweet from Lewis Tunstall (@_lewtun): Introducing OpenHermesPreferences - the largest dataset of ~1 million AI preferences generated by Mixtral and Nous-Hermes-2-Yi-34B https://huggingface.co/datasets/argilla/OpenHermesPreferences …
- Tweet from Vaibhav (VB) Srivastav (@reach_vb): Announcing TTS Arena! Sound on. One place to test, rate and find the champion of current open models. A continually updated space with the greatest and the best of the current TTS landscape! …
- Introducing the Red-Teaming Resistance Leaderboard: no description found
- AI Watermarking 101: Tools and Techniques: no description found
- Fine-Tuning Gemma Models in Hugging Face: no description found
- Tweet from Bassem Asseh (@asseh): .@huggingface worked together with @FetchRewards to take their document #AI solutions to production on @AWS. And guess what? "With Yifeng's guidance, Fetch was able to cut its development t…"
HuggingFace ▷ #general (491 messages 🔥🔥🔥):
- GPU Pricing Queries: @zorian_93363 discussed the cost comparison between certain GPUs and a specific 3090 model, mentioning the possibility of acquiring 100 units for the price of a single 3090 in their location.
- Increasing Model Performance through Custom Frameworks: @ahmad3794 suggested that writing custom frameworks could unleash the potential of 4 teraflops on an 8-bit integrated circuit, offering considerable computing power.
- Electronics DIY Enthusiasm: @zorian_93363 expressed a desire to play with electronics and build computers but lamented the lack of time due to an economic crisis, while appreciating others' skills and abilities to innovate despite challenges.
- Iran's Resourcefulness Amidst Sanctions: @ahmad3794 elaborated on building affordable clusters as a workaround for obtaining high-power technology, which is hard to get in Iran due to sanctions.
- Accessing GPT Models and UI Challenges: @welltoobado and @caleb_sol discussed the possibility and methods of using quantized versions of models for CPU inference without extensive RAM usage, with mentions of llama.cpp as a beneficial tool.
Links mentioned:
- GroqChat: no description found
- Morph Studio: no description found
- Unbelievable! Run 70B LLM Inference on a Single 4GB GPU with This NEW Technique: no description found
- Hugging Face: Here at Hugging Face, weāre on a journey to advance and democratize ML for everyone. Along the way, we contribute to the development of technology for the better.
- 2869993 Hail GIF - 2869993 Hail - Discover & Share GIFs: Click to view the GIF
- Tweet from blob (@moanaris): no description found
- kopyl/ui-icons-256 Ā· Hugging Face: no description found
- Hugging Face ā The AI community building the future.: no description found
- Kermit Worried GIF - Kermit Worried Oh No - Discover & Share GIFs: Click to view the GIF
- Boom Explode GIF - Boom Explode Explosions - Discover & Share GIFs: Click to view the GIF
- Matrix Multiplication Background User's Guide - NVIDIA Docs: no description found
- Hugging Face ā The AI community building the future.: no description found
- Gradio: no description found
- Tweet from Jason (@mytechceoo): ChatGPT wrappers when OpenAI is down..
- cahya/gpt2-small-indonesian-522M · Hugging Face: no description found
- dpaste/15nGx (Python): no description found
- NCIS ridiculous hacking scene: one keyboard, two typists HD: no description found
- TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF at main: no description found
- The System Is Down - Strongbad: Wow, really didn't think this video would be this popular. Apparently people come here when a server to a game is down. Ha! Epic. Anyway, enjoy! Yes it's j…
- Hugging Face Outage Impact: Created with Gemini Advanced.
- The Website is Down #1: Sales Guy vs. Web Dude: The Website is Down: Sales Guy Vs. Web Dude High QualityThe original video in high resolution.This video won a Webby award!
- "HumanEval" object has no attribute "dataset" · Issue #131 · bigcode-project/bigcode-evaluation-harness: When I evaluate HumanEval with llama 7b, I met this problem: my script accelerate launch /cpfs01/shared/Group-m6/dongguanting.dgt/bigcode-evaluation-harness/main.py --model "/path to my llama7b/…
- Issues · huggingface/api-inference-community: Contribute to huggingface/api-inference-community development by creating an account on GitHub.
- Workflow runs · huggingface/text-embeddings-inference: A blazing fast inference solution for text embeddings models - Workflow runs · huggingface/text-embeddings-inference
- GitHub - comfyanonymous/ComfyUI: The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.: The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface. - comfyanonymous/ComfyUI
- Issue with offline mode · Issue #4760 · huggingface/datasets: Describe the bug I can't retrieve a cached dataset with offline mode enabled Steps to reproduce the bug To reproduce my issue, first, you'll need to run a script that will cache the dataset im…
- Issues · huggingface/huggingface_hub: The official Python client for the Huggingface Hub. - Issues · huggingface/huggingface_hub
- Build software better, together: GitHub is where people build software. More than 100 million people use GitHub to discover, fork, and contribute to over 420 million projects.
- Add PatchModelAddDownscale (Kohya Deep Shrink) node. · comfyanonymous/ComfyUI@bd07ad1: By adding a downscale to the unet in the first timesteps this node lets you generate images at higher resolutions with less consistency issues.
- Hugging Face status : no description found
HuggingFace ▷ #today-im-learning (8 messages 🔥):
- Exploring DSPy and OpenFunctions v2: @n278jm is investigating DSPy, a framework for programming foundation models without prompting, and Gorilla OpenFunctions v2, an advanced open-source function-calling system for LLMs. They aim to use these tools to improve their client on-boarding process, moving from Gradio prototypes to production-ready versions.
- Harness the Power of OpenAI and Hugging Face: @davidre95 encourages users to utilize the tools from the OpenAI Chat and Hugging Face chat rooms as resources.
- Project Collaboration on Invoice Processing: @pampkinparty000 invites users dealing with PDF or picture invoices to DM them for a potential collaboration on a project with similar goals.
- Invoice Storage Advice for Greater Efficiency: @pampkinparty000 recommends storing invoices in a vectorized database with metadata for more efficient use of LLMs, suggesting the use of libraries like llama-index.
- Seeking a Research Community in AI: @raghadn3 is in search of a community dedicated to writing research papers on Artificial Intelligence.
Links mentioned:
- GitHub - stanfordnlp/dspy: DSPy: The framework for programming, not prompting, foundation models - stanfordnlp/dspy
- Introduction to Gorilla LLM: no description found
HuggingFace ▷ #cool-finds (9 messages 🔥):
- BitNet b1.58: Efficient LLMs: @jessjess84 highlighted the potential of BitNet b1.58, a new 1-bit Large Language Model that promises efficiency without sacrificing performance, detailed in an arXiv paper. Matching the results of full-precision models, it offers cost-effective latency, memory, throughput, and energy consumption.
- Stable Diffusion Deluxe Debuts: @skquark invited users to try Stable Diffusion Deluxe, an extensive multimedia AI toolkit supporting various AI art generators, boasting features for creating images, videos, sound effects, and more. The platform, detailed at diffusiondeluxe.com, integrates numerous pipelines and is designed for ease of use and creative experimentation.
- Looking for Self-Hosting Details: In response to @skquark's all-in-one multimedia AI app, @wolfspyre inquired about self-hosting options, complimenting the project as "super cool" and expressing interest in diving deeper.
- Appreciating "The Hug": @evergreenking shared a link to thehug.xyz, a site described as "just link art", with @wolfspyre following up to ask if it was @evergreenking's creation.
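The "1.58 bits" in BitNet b1.58 comes from constraining each weight to one of three values, -1, 0, or +1, which takes log2(3) ≈ 1.58 bits to encode. A rough, simplified sketch of absmean-style ternary rounding (the paper applies this to full weight tensors during training; this toy function is only illustrative):

```python
def ternarize(weights):
    """Round each weight to {-1, 0, +1} after scaling by the mean
    absolute value (a simplified take on BitNet's absmean quantization)."""
    scale = sum(abs(w) for w in weights) / len(weights)
    return [max(-1, min(1, round(w / scale))) for w in weights]

print(ternarize([0.9, -0.05, -1.2, 0.4]))  # [1, 0, -1, 1]
```

Because every weight is ternary, matrix multiplications reduce to additions and subtractions, which is where the latency and energy savings come from.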
Links mentioned:
- HUG | A Home for Your Art: Join our global creative community to showcase & sell your art, connect with others, and access creator-friendly grants and education.
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits: Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single param…
- Uncovering the Origins of Values: A Biology and Cognition-Based Approach for AI Alignment: no description found
- Diffusion Deluxe Home - Stable Diffusion Deluxe: no description found
HuggingFace ▷ #i-made-this (14 messages 🔥):
- DIY Local LLM Assistant Unveiled: @rivridis developed a locally running LLM Assistant with an assistant mode and a real-time editing mode for content editing and creation. The code and details are available on GitHub.
- Deploy to Google Cloud Vertex AI Simplified: @alvarobartt wrote a blog post detailing how to deploy models from the Hugging Face Hub to Google Cloud Vertex AI. You can check out the technical post and its step-by-step guide here.
- Cursor Hero Demo v0.3.0: @teamy is developing a UI tool titled Cursor Hero, with integrations of ollama and whisper. A demo of the tool can be found in this YouTube video.
- Gantrithor: A Data Annotation Leap: @stroggoz announced an open beta for Gantrithor, a rapid, bulk data annotation tool, with a free version limiting datasets to 1000 documents. Learn more and try it out at Gantrithor.
- StarCoder2: Code & Learn: @tonic_1 fixed errors in the example code and announced StarCoder2, available for learning and enjoyment, with a call to collaborate on fine-tuning models. Find the project on HuggingFace Spaces.
Links mentioned:
- MetaMath Mistral Pro - a Hugging Face Space by Tonic: no description found
- Deploying 🤗 Hub models in Vertex AI: no description found
- StarCoder2 - a Hugging Face Space by Tonic: no description found
- Qbeast's Adventure in AI-Driven Meme Creation - Qbeast: Learn about AI model selection, fine-tuning, and the role of Qbeast in enhancing meme creativity. Perfect for AI enthusiasts and data engineers seeking insights and innovation.
- Gantrithor: no description found
- Cursor Hero demo v0.3.0: https://github.com/TeamDman/Cursor-Hero.githttps://discord.gg/psHtde64FJ#rust #bevy #windows #win32
- this one rly slaps - episode 16 #music #producer: gonna be hard to beat this one
- GitHub - Rivridis/LLM-Assistant: Locally running LLM with internet access: Locally running LLM with internet access. Contribute to Rivridis/LLM-Assistant development by creating an account on GitHub.
- SDXL-Lightning: quick look and comparison: With SDXL-Lightning you can generate extremely high quality images using a single step.
HuggingFace ▷ #diffusion-discussions (5 messages):
- Gradio Queue Function Clarification: @akin8941 inquired about the return type of the queue() function in the Gradio interface, and @iakhil clarified that it does not have a return type of its own.
- Too Fast for Comfort: @HuggingMod cautioned @1122120801903194114 about posting too quickly in the HuggingFace Discord, asking them to slow down a bit with a friendly reminder emoji.
- Scheduler Name Puzzle: @luihis expressed difficulty in retrieving the string name of a scheduler due to deprecation warnings. Despite attempts using different properties, the correct string, "DPMSolverSinglestepScheduler", remained elusive.
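Retrieving a class's name as a string, as attempted above, needs no deprecated config attribute; in Python, `type(obj).__name__` suffices. A minimal sketch with a stand-in class (a real diffusers pipeline would expose the actual scheduler object):

```python
# Stand-in for a diffusers scheduler; only the class name matters here.
class DPMSolverSinglestepScheduler:
    pass

scheduler = DPMSolverSinglestepScheduler()

# type(obj).__name__ yields the class name as a plain string,
# without touching any deprecated properties.
print(type(scheduler).__name__)  # DPMSolverSinglestepScheduler
```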
HuggingFace ▷ #computer-vision (4 messages):
- Parseq Praise: User @whoami02 recommended the use of Parseq for its effective symbol-recognition capabilities.
- Personalized Fine-Tuning Success: They also mentioned successfully fine-tuning the model on their specific dataset, which contained images similar to the equations they needed to detect.
- ResNet Still Rocks: As for the task of detection, @whoami02 asserted that ResNet stands strong and is good enough for their needs.
- Slow Your Roll: @HuggingMod advised @whoami02 to slow down their message posting to adhere to the community guidelines.
HuggingFace ▷ #NLP (14 messages 🔥):
-
Inference Troubles in the Hugging Face Repo:
@alfred6549sought assistance for running the text generation inference repository on a machine without a CPU or CUDA, sharing an error they encountered. Despite attempts to disable GPU usage, the local setup still failed. -
Petals Resonate with Users: User
@ai_noobsimply stated āpetalsā, which received a positive acknowledgment from@nrs9044, indicating a shared sentiment or understanding about the termās context. -
Benchmark Necessities Discussed:
@vipitisstressed the importance of testing on larger benchmarks for validity, while@djpanda1acknowledged the advice but noted that preliminary tests on several prompts appeared successful. -
Financial Document Insight Quest:
@hiteshwarsingh1is exploring ways to extract information from financial documents, considering MapReduce techniques and seeking recommendations for open-source models or approaches suitable for summarization rather than specific information retrieval. -
Improving Information Extraction with LLMs:
@.sgpis utilizing mistral 7b with llamacpp for JSON data extraction and expressed interest in incorporating in-context learning to enhance accuracy, requesting resources on the topic.
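The in-context-learning approach mentioned for JSON extraction boils down to prepending a few worked examples so the model imitates the output schema. A minimal sketch of the prompt construction only; the example invoices and field names are made up, and the call to a local Mistral 7B via llama.cpp is out of scope here:

```python
import json

# Few-shot prompt builder for JSON extraction: worked examples teach the
# model the schema, then the new document is appended for completion.
# The examples and schema are illustrative, not from a real dataset.
EXAMPLES = [
    ("Invoice #123 from Acme Corp, total $450.00, due 2024-03-15.",
     {"invoice_id": "123", "vendor": "Acme Corp", "total": 450.00, "due": "2024-03-15"}),
    ("Invoice #88 from Globex, total $1200.50, due 2024-04-01.",
     {"invoice_id": "88", "vendor": "Globex", "total": 1200.50, "due": "2024-04-01"}),
]

def build_prompt(document: str) -> str:
    parts = ["Extract the fields as JSON.\n"]
    for text, record in EXAMPLES:
        parts.append(f"Text: {text}\nJSON: {json.dumps(record)}\n")
    parts.append(f"Text: {document}\nJSON:")
    return "\n".join(parts)

prompt = build_prompt("Invoice #7 from Initech, total $99.99, due 2024-05-20.")
print(prompt)
```

The completed prompt would be sent to the model, and the generated continuation parsed with `json.loads`.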
Links mentioned:
- deepseek-ai/deepseek-coder-6.7b-instruct · Hugging Face: no description found
- Hugging Face: The AI community building the future. Hugging Face has 196 repositories available. Follow their code on GitHub.
LAION ▷ #general (314 messages 🔥🔥):
- Ideogram Launch Causes Stir: @pseudoterminalx shared a prompt result from the new AI model by Ideogram, triggering discussions on its prompt adherence and aesthetics. There were comparisons to Stable Diffusion and speculation about the potential poor quality of unseen Imagen samples.
- T5 XXL, CLIP L, and CLIP G in SD3?: @thejonasbrothers and @devilismyfriend discussed the integration of T5 XXL and CLIP models in SD3, hinting at the potential for both accuracy and appealing aesthetics in future models.
- Cascade's Fidelity Questioned: @pseudoterminalx and others critically evaluated Cascade's ability to generate images based on prompts, noting frequent issues with prompt adherence and specificity.
- AI-Generated Art and Copyright Battles: Users @progamergov, @itali4no, and others discussed the looming legal challenges around AI-generated art, referencing recent cases and Hugging Face's ambivalent approach to DMCA requests.
- Stability AI's Many Silent Projects: @.undeleted expressed confusion over the multiplicity of projects with similar goals at Stability AI, each announced similarly but with unclear differences.
Links mentioned:
- Release v0.9.1 - DoRA the explorah · bghira/SimpleTuner: This release has some breaking changes for users who: Use RESOLUTION_TYPE=area (resolution_type=area for multidatabackend config) Use crop=false Use crop=true and crop_aspect=preserve as the prec…
- panopstor/nvflickritw-cogvlm-captions · Datasets at Hugging Face: no description found
- Willys Chocolate Experience Glasgow. Get your Tickets!: INDULGE IN A CHOCOLATE FANTASY LIKE NEVER BEFORE - CAPTURE THE ENCHANTMENT! Tickets to Willys Chocolate Experience are on sale now! at the willys chocolate experience in Glasgow! Tickets to Willys Ch…
- China issues world's 1st legally binding verdict on copyright infringement of AI-generated images - Global Times: no description found
- Copyright Safety for Generative AI | Published in Houston Law Review: By Matthew Sag. 61 Hous. L. Rev. 295 (2023)
LAION ▷ #research (48 messages 🔥):
- Spiking Neural Network Speculations: @max_voltage wonders if advancements might lead to a reintroduction of spiking neural networks, proposing time dithering as a technique to enhance precision. @spirit_from_germany agrees, reminded of spiking networks by the concept.
- Contemplating Low Information Density in Models: @max_voltage expresses surprise that weights can be lowered to 1-2 bits in models, indicating a low information density in current networks. @thejonasbrothers explained this is possible due to the innate sparsity of existing networks, where some weights could even be 1-bit or 0-bit.
- New AI Image Generator Buzz: @vrus0188 shares a Reddit post about a new AI image generator that is reportedly 8 times faster than OpenAI's best tool and can run on modest computers. @spirit_from_germany provides a link to the KOALA image generator site for quality testing without cherry-picking.
- EMO: Creating Expressive Portrait Videos: The EMO project is highlighted by @helium__, presenting a new audio-driven portrait-video generation method. @itali4no remarks that it has the same authors as the Animate Anyone paper, indicating a likely absence of released code.
- AI Icon Generation Model Release: @kopyl announces the release of a state-of-the-art AI model for icon generation, trained with a personal investment of $2000 and available via Hugging Face. @chad_in_the_house praises the model's low noise, although @kopyl advises that it only generates images at 256px resolution.
- Language Model Distillation Learning Inquiry: @jh0482 seeks information on distillation learning specifically for embedding language models, discussing concerns related to continuous-space targets. @itali4no suggests standard distillation methods might apply, but @jh0482 considers regression towards the target and contrastive learning as potential methods.
Links mentioned:
- KOALA: Self-Attention Matters in Knowledge Distillation of Latent Diffusion Models for Memory-Efficient and Fast Image Synthesis: SOCIAL MEDIA DESCRIPTION TAG TAG
- Elucidating the Design Space of Diffusion-Based Generative Models: We argue that the theory and practice of diffusion-based generative models are currently unnecessarily convoluted and seek to remedy the situation by presenting a design space that clearly separates t…
- EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
- Samsung Develops Industry-First 36GB HBM3E 12H DRAM: Samsung's HBM3E 12H achieves industry's largest capacity HBM with groundbreaking 12-layer stack, raising both performance and capacity by more than 50%. Advanced TC NCF technology enhances vertical de…
- Reddit - Dive into anything: no description found
- GitHub - collabora/WhisperSpeech: An Open Source text-to-speech system built by inverting Whisper. - collabora/WhisperSpeech
- kopyl/ui-icons-256 · Hugging Face: no description found
- UI icons - v1.0 | Stable Diffusion Checkpoint | Civitai: SOTA model for generating icons. Motivation: I spent $2000 of my own money to train this model. I was unable to monetize it, so I'm sharing it with…
Nous Research AI ▷ #off-topic (21 messages 🔥):
- Emoji Reacts Tell a Story: @leontello and @0xevil employed emotive emojis, the former a salute emoji and the latter a skull emoji, reflecting a sense of conclusion, followed by a crying face in response to the absence of GPT-5.
- Anticipating Future GPT Iterations: Conversation by @0xevil highlighted the community's anticipation for future GPT versions, mentioning the non-existent GPT-6 and responding humorously to @error.pdf's mention of GPT-9 with a surprised emoji.
- Monitor and Dock Recommendations: @denovich shared a YouTube video reviewing Dell's new 5K monitor and noted that Dell offers monitors that can connect to multiple machines simultaneously; their docking stations, including the Dell Thunderbolt Dock WD22TB4, are worth considering and can be found on eBay.
- Anticipations on Y Combinator's Batch Focus: @0xevil pondered whether Y Combinator's latest batch predominantly featured companies offering GPT-wrapper services, observing similarities with existing products and innovations in areas like transcription and code generation from design.
- Speculations and Shared Resources Surrounding GPT Patents and Applications: @0xevil mulled over the GPT-6 patent possibly discussed in broader circles and noted the integration of AI agents with music generation, while @pradeep1148 shared a YouTube video demonstrating how to fine-tune the Gemma model using Unsloth.
Links mentioned:
- Oppenheimer Movie GIF - Oppenheimer explosions - Discover & Share GIFs: Click to view the GIF
- Finetune Gemma 7B with Unsloth: We will take a look at how to finetune the Gemma model using Unsloth. https://colab.research.google.com/drive/10NbwlsRChbma1v55m8LAPYG15uQv6HLo?usp=sharing#scrollT…
- One Month with the Best Monitor in the World: The New Dell 40" 5K120 HDR U4025QW: Dave spends a month with the brand new Dell 5K120 HDR monitor. For my book on life on the Spectrum: https://amzn.to/49sCbbJ Follow me on Facebook at http://f…
Nous Research AI ▷ #interesting-links (6 messages):
- 1-bit Revolution in LLMs: @deki04 shared an arXiv paper introducing BitNet b1.58, a 1-bit Large Language Model that achieves performance comparable to full-precision models while being more cost-effective. The paper presents a "new scaling law" for designing high-performance yet cost-efficient LLMs.
- Curiosity Piqued by BitNet: @deki04 expressed surprise at the existence of 1-bit LLMs, not having encountered the concept before.
- Scaling Laws Under the Microscope: @sherlockzoozoo commented that multiplicative scaling laws are interesting, presumably in the context of the 1-bit LLM, and noted that additive scaling doesn't hold up well with increasing model size.
- New LLM Benchmark Released: @tarruda shared a link to Nicholas Carlini's benchmark for Large Language Models, highlighting its unique tests, which include a range of complex tasks and a dataflow domain-specific language for easy test additions.
- Benchmark Results on Mistral vs GPT-4: Following the benchmark share, @tarruda mentioned a YouTube video where someone ran the benchmark on various models, including some 7B models like Mistral, as well as GPT-4.
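The core trick in BitNet b1.58 ("1.58 bits" because each weight takes one of three values, and log2(3) ≈ 1.58) is absmean quantization: scale a weight tensor by its mean absolute value, then round each entry into {-1, 0, +1}. A minimal pure-Python sketch of that quantizer, assuming the paper's RoundClip formulation; real implementations apply it per tensor inside training:

```python
# Absmean ternary quantization as described for BitNet b1.58:
# W_q = RoundClip(W / (mean|W| + eps), -1, 1), giving weights in {-1, 0, +1}.
def ternary_quantize(weights):
    eps = 1e-8
    gamma = sum(abs(w) for w in weights) / len(weights)  # mean absolute value
    scaled = [w / (gamma + eps) for w in weights]
    # round to the nearest integer, then clip into the ternary set {-1, 0, +1}
    return [max(-1, min(1, round(s))) for s in scaled], gamma

w = [0.9, -0.02, 0.4, -1.3, 0.05]
q, scale = ternary_quantize(w)
print(q)  # [1, 0, 1, -1, 0]
```

With weights restricted to {-1, 0, +1}, matrix multiplication reduces to additions and subtractions, which is where the cost savings come from.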
Links mentioned:
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits: Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single param…
- My benchmark for large language models: no description found
- Mistral Large vs GPT4 - Practical Benchmarking!: ➡️ One-click Fine-tuning & Inference Templates: https://github.com/TrelisResearch/one-click-llms/ ➡️ Trelis Function-calling Models (incl. OpenChat 3.5): http…
Nous Research AI ▷ #general (205 messages 🔥🔥):
- Ragtag Ruminations on RAG: @natefyi_30842 discussed using an LLM to create Q&A pairs that are then used for fine-tuning and combined with RAG for better context understanding.
- Issues with Service Providers and Fine-Tuning: @teknium commented that fine-tuning providers are facing issues due to conflicts between fine-tune mixing and scaled inference code, making local GGUF setups the only reliable option currently.
- Troubles with Gemma 2B Fine-Tuning: @lmmint asked the community if anyone had succeeded in fine-tuning Gemma 2B, mentioning high-quality data as a requirement.
- CausalLM's Impressive MMLU Score: @nonameusr expressed surprise at CausalLM's high MMLU benchmark and shared a link provided by @giftedgummybee to the Hugging Face model CausalLM/34B-preview.
- Excitement Around the Release of HyenaDNA: Discussions surrounding Stanford's introduction of HyenaDNA, a long-range genomic model with a 1-million-token context, generated buzz, with @euclaise suggesting that "fill in the middle" (FIM) might suit DNA sequences better than autoregressive modeling.
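The "fill in the middle" objective mentioned above can be sketched in a few lines: a sequence is split into prefix, middle, and suffix, and the training example asks the model to emit the middle after seeing both sides behind sentinel tokens, which suits data like DNA where useful context flows in both directions. A minimal sketch; the `<PRE>`/`<SUF>`/`<MID>` sentinel strings are illustrative stand-ins for the dedicated special tokens real models use:

```python
import random

# Fill-in-the-middle (FIM) transform: <PRE> prefix <SUF> suffix <MID> middle.
# The model is trained to generate the middle span conditioned on both sides.
def fim_transform(seq: str, rng: random.Random) -> str:
    i, j = sorted(rng.sample(range(1, len(seq)), 2))  # two split points
    prefix, middle, suffix = seq[:i], seq[i:j], seq[j:]
    return f"<PRE>{prefix}<SUF>{suffix}<MID>{middle}"

rng = random.Random(0)
dna = "ACGTACGTTGCA"
sample = fim_transform(dna, rng)
print(sample)
```

Reassembling prefix + middle + suffix always recovers the original sequence, which is an easy invariant to test.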
Links mentioned:
- Tweet from undefined: no description found
- HyenaDNA: learning from DNA with 1 Million token context: HyenaDNA is a long genomic sequence model trained on the Human Reference Genome with context length of up to 1 million tokens.
- CausalLM/34B-preview · Hugging Face: no description found
- qualcomm (Qualcomm): no description found
- Embedding - GPT4All Documentation: no description found
- OpenAI Five defeats Dota 2 world champions: OpenAI Five is the first AI to beat the world champions in an esports game, having won two back-to-back games versus the world champion Dota 2 team, OG, at Finals this weekend. Both OpenAI Five and De…
- Tweet from TechCrunch (@TechCrunch): Tim Cook says Apple will "break new ground" in GenAI this year https://tcrn.ch/3Ig8TAX
- UniProt: no description found
- sordonia (Alessandro Sordoni): no description found
- supertrainer2000/supertrainer2k/optim/adalite.py at master · euclaise/supertrainer2000: Contribute to euclaise/supertrainer2000 development by creating an account on GitHub.
- GitHub - nestordemeure/question_extractor: Generate question/answer training pairs out of raw text. - nestordemeure/question_extractor
- BAAI/bge-base-en-v1.5 · Hugging Face: no description found
- Models: Remove system prompt of Nous-Hermes-2-Mistral-7b-DPO by ThiloteE · Pull Request #2054 · nomic-ai/gpt4all: Describe your changes Adds "accepts various system prompts" Removes system prompt fix whitespace Checklist before requesting a review I have performed a self-review of my code. If it is…
- CausalLM/34b-beta · Hugging Face: no description found
Nous Research AI ▷ #ask-about-llms (45 messages 🔥):
- Seeking GPT-4 Level on a Budget: @natefyi_30842 sought a cheaper alternative to GPT-4 that can avoid including provided subsequent book chunks in its responses, finding Mixtral Instruct to work fairly well despite its limitations. The conversation suggests that only GPT-4 behaves as desired in this context.
- Fine-Tuning: A Question of Quantity: Discussing the significance of training dataset size, @natefyi_30842 wondered if a hundred entries would suffice as opposed to millions, and @teknium succinctly replied with "5k".
- DPO Tactics in Model Training Discussed: In pursuit of improving model answers, @natefyi_30842 considered generating wrong examples for Direct Preference Optimization (DPO), while users discussed when DPO might be more effective.
- Choosing Separators for Text Manipulation: @natefyi_30842 pondered the efficacy of using standard or unique tokens as separators, such as emojis vs. %XYZ%, for adding elements to text in model inputs, and shared a link to a tokenizer for context.
- Interpretability and Engineering Representations: @max_paperclips discussed the exciting field of representation engineering, citing a favorite post and referring to work such as Representation Engineering: A Top-Down Approach to AI Transparency and the corresponding GitHub code for the paper.
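The representation-engineering work cited above steers a model by adding a "control vector" to its hidden states at inference time, where the vector is typically derived from the difference between mean activations on contrastive prompt sets. A minimal pure-Python sketch of just that arithmetic; the toy 3-dimensional vectors stand in for real residual-stream activations:

```python
# Representation-engineering sketch: derive a control vector from two sets of
# activations (e.g. on "positive" vs "negative" prompts), then steer a hidden
# state by adding the scaled vector. Toy 3-d vectors replace real activations.
def mean(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def control_vector(positive_acts, negative_acts):
    pos, neg = mean(positive_acts), mean(negative_acts)
    return [p - q for p, q in zip(pos, neg)]  # difference of means

def steer(hidden, vec, alpha=1.0):
    # alpha scales how strongly the behavior is pushed (negative reverses it)
    return [h + alpha * v for h, v in zip(hidden, vec)]

pos = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]   # activations on "positive" prompts
neg = [[0.0, 0.0, 1.0], [2.0, 0.0, 1.0]]   # activations on "negative" prompts
vec = control_vector(pos, neg)
steered = steer([0.5, 0.5, 0.5], vec, alpha=2.0)
print(vec, steered)
```

In a real setup the addition is applied inside a forward hook at one or more transformer layers rather than to a standalone list.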
Links mentioned:
- Bowing Thank You GIF - Bowing Thank You Tom And Jerry - Discover & Share GIFs: Click to view the GIF
- Representation Engineering Mistral-7B an Acid Trip (https://vgel.me/posts/representation-engineering/): no description found
- Meta's Llama 3 is set to release in July and could be twice the size: Meta's next open-source language model, Llama 3, is scheduled for release in July and is intended to be on par with GPT-4.
Nous Research AI ▷ #project-obsidian (3 messages):
- QT Node-X Twitter Updates: QT Node-X's Twitter shared a series of posts (QT Node-X Tweet 1, QT Node-X Tweet 2, and QT Node-X Tweet 3), though the content of the tweets was not provided in the messages.
Latent Space ▷ #ai-general-chat (57 messages 🔥🔥):
- Noam Shazeer's Blog Debut: @swyxio shared the first blog post by Noam Shazeer, discussing coding style, titled Shape Suffixes: Good Coding Style.
- Customer Satisfaction and LLMs: @eugeneyan appreciated a data point indicating that LLMs are on par with humans in customer-service satisfaction and can handle two-thirds of customer-service queries.
- Skepticism on AI News: @swyxio flagged an overhyped news piece, suggesting skepticism when something seems too good, referencing the Klarna AI assistant story on Fast Company.
- Discussion on LLM Paper Club: @swyxio alerted users to a special Matryoshka Embeddings presentation, while @osanseviero and @swyxio referenced additional materials on the topic, including a blog post on Hugging Face and a YouTube channel with simplified explanations of LLM techniques.
- Insights on Lakehouses and Data Engineering: In response to @quicknick123 seeking resources on lakehouses, @swyxio recommended an in-depth guide on table formats, query engines, and the utility of Spark published by Airbyte.
Links mentioned:
- no title found: no description found
- Tweet from Noam Shazeer (@NoamShazeer): https://medium.com/@NoamShazeer/shape-suffixes-good-coding-style-f836e72e24fd Check out my first blog post.
- Matryoshka Representation Learning: Learned representations are a central component in modern ML systems, serving a multitude of downstream tasks. When training such representations, it is often the case that computational and statistic…
- Tweet from murat 🔥 (@mayfer): wow, highly recommend checking out all the samples: https://humanaigc.github.io/emote-portrait-alive/ Quoting AK (@_akhaliq): Alibaba presents EMO: Emote Portrait Alive - Generating Expressive Por…
- Conviction: no description found
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits: Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single param…
- Efficient NLP: Efficient NLP Consulting. My name is Bai Li, I'm a machine learning engineer and PhD in natural language processing. I can help you build cost-effective and efficient NLP systems. Reach me at: Em…
- Data Lake / Lakehouse Guide: Powered by Data Lake Table Formats (Delta Lake, Iceberg, Hudi) | Airbyte: Explains open-source data lakes and their power with data lake table formats. What's the difference between a lakehouse and when you need one.
- Tweet from Hamel Husain (@HamelHusain): Something smells really wrong about the Klarna news, it's a bit too much made for TV? https://www.fastcompany.com/91039401/klarna-ai-virtual-assistant-does-the-work-of-700-humans-after-layoffs
- Tweet from Rowan Cheung (@rowancheung): It's been a huge day for AI with announcements from Alibaba, Lightricks, Ideogram, Apple, Adobe, OpenAI, and more. The 7 most important developments that happened: 1. Alibaba researchers unveile…
- 🪆 Introduction to Matryoshka Embedding Models: no description found
- Jonathan Ross at Web Summit Qatar: Groq CEO & Founder, Jonathan Ross, on Center Stage at #WebSummitQatar2024, discussing how to make AI real. X (fka Twitter): @WebSummitQatar Instagram: @WebSumm…
Latent Space ▷ #ai-announcements (3 messages):
- Replicate CEO in the Podcast Spotlight: @swyxio announced the release of a new podcast episode featuring the CEO of Replicate. The tweet with the link to the episode can be found here.
- MRL Embeddings Paper Club Meeting: @swyxio gave a heads-up about an upcoming event led by <@206404469263433728> in the #1107320650961518663 channel, where the authors of the MRL embeddings paper will be present. The event cover can be viewed here.
- Deep Dive into Representation Engineering: @ivanleomk flagged an upcoming session with <@796917146000424970> on Representation Engineering 101 in the #1107320650961518663 channel, inviting members to participate and engage with questions.
Links mentioned:
LLM Paper Club (West Edition!) · Luma: This week we'll be covering the paper - Matryoshka Representation Learning (https://arxiv.org/abs/2205.13147) with two of the co-authors Gantavya Bhatt and Aniket Rege. We have moved…
Latent Space ▷ #llm-paper-club-west (165 messages 🔥🔥):
- Matryoshka Dolls Embrace AI: @akusupati shared the paper titled "Matryoshka Representation Learning" and discussed its potential for creating LLM embeddings with adaptive dimensions. It's a technique that could offer varying levels of abstraction, potentially saving on compute and storage.
- Making Sense of MRL: @swyxio and others tried to grasp the quirks of Matryoshka Representation Learning (MRL), including comparisons to PCA on embeddings and how the technique sums the losses of models at varying dimensions for optimized learning.
- Deployment Insights and Applications: Participants like @ivanleomk and @gulo0001 offered practical information and demonstrations of embedding models incorporating MRL. They discussed adaptations and provided resources, like a Supabase blog and a Hugging Face blog, that help explain real-world use of these models.
- Curiosity Reigns in Matryoshka Exploration: @punnicat, presumably one of the authors, was present to field questions and clarify concepts around Matryoshka Embeddings, especially concerning dimensionality and the granularity of embeddings during training, and their implications for models.
- Engagement with Authors and Resources: The session drew curious minds asking about Matryoshka Embeddings and the broader implications for transformer models, with users like @swyxio and @cakecrusher discussing potential applications and improvements. The authors were open to sharing slides and further details; @punnicat can be contacted on Twitter.
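The adaptive-retrieval deployment pattern discussed in the session (and in the linked Supabase post) exploits the Matryoshka property that a prefix of the embedding is itself a usable embedding: score all documents cheaply on a low-dimensional prefix, then rerank only a shortlist with the full vectors. A minimal pure-Python sketch; the toy 4-d embeddings are made up, and real systems would use an ANN index for the first pass:

```python
import math

# Adaptive retrieval with Matryoshka embeddings: pass 1 scores documents on a
# low-dimensional prefix, pass 2 reranks only the shortlist with full vectors.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def adaptive_retrieve(query, docs, prefix_dim=2, shortlist=2):
    # pass 1: cheap scoring on the first prefix_dim dimensions only
    coarse = sorted(docs, key=lambda d: -cosine(query[:prefix_dim], d[:prefix_dim]))
    # pass 2: exact rerank of the shortlist with full-dimensional vectors
    return max(coarse[:shortlist], key=lambda d: cosine(query, d))

query = [1.0, 0.0, 1.0, 0.0]
docs = [
    [1.0, 0.1, 1.0, 0.0],   # close in both prefix and full space
    [1.0, 0.0, -1.0, 0.0],  # close in prefix space only
    [0.0, 1.0, 0.0, 1.0],   # far in both
]
best = adaptive_retrieve(query, docs)
print(best)
```

The second document wins the coarse pass but loses the full-dimensional rerank, which is exactly the error the rerank step is there to catch.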
Links mentioned:
- Matryoshka Representation Learning (MRL) from the Ground Up | Aniket Rege: no description found
- Nextra: the next docs builder: Nextra: the next docs builder
- MatFormer: Nested Transformer for Elastic Inference: Transformer models are deployed in a wide range of settings, from multi-accelerator clusters to standalone mobile phones. The diverse inference constraints in these scenarios necessitate practitioners…
- Representation Engineering Mistral-7B an Acid Trip (https://vgel.me/posts/representation-engineering/#How_do_we_make_one?_Is_it_hard?): no description found
- Matryoshka embeddings: faster OpenAI vector search using Adaptive Retrieval: Use Adaptive Retrieval to improve query performance with OpenAI's new embedding models
- Matrioska Loop GIF - Matrioska Loop Bored - Discover & Share GIFs: Click to view the GIF
- AdANNS: A Framework for Adaptive Semantic Search: Web-scale search systems learn an encoder to embed a given query which is then hooked into an approximate nearest neighbor search (ANNS) pipeline to retrieve similar data points. To accurately capture…
- 🪆 Introduction to Matryoshka Embedding Models: no description found
- NeuML/pubmedbert-base-embeddings-matryoshka · Hugging Face: no description found
- Representation Engineering 101: no description found
Perplexity AI ▷ #general (157 messages 🔥🔥):
- Activation Woes for Rabbit R1 Promo: @mithrilman required assistance activating the Rabbit R1 promo. @icelavaman provided step-by-step instructions, emphasizing the need to use the email link, and suggested contacting support for further help, especially since the email button appeared bugged and non-clickable.
- Podcast Curiosities and Clarity: @_paradroid raised a question about podcasts posting under the name "Perplexity AI", prompting @icelavaman to clarify the official podcast link, while @ok.alex stated that unauthorized use of the Perplexity AI name is likely for attention or money.
- Understanding AI Model Preferences: New user @outrerim asked about the strengths and weaknesses of different AI models, and @jaicraft outlined core use-cases for the Experimental, GPT-4 Turbo, Claude, and Mistral models, though opinions differed, with users like @.claidler and @naivecoder786 favoring Mistral for code queries.
- Discussing Perplexity's Capabilities and Limitations: @brknclock1215 described Perplexity's AI as excellent for internet-based information handling and answering questions rapidly, but highlighted limitations such as parsing large files and image generation, noting it is less optimized for such tasks.
- Concerns and Solutions for Perplexity Service Issues: @stevvie and @dv8s were confused by the absence of file-upload options and the name change from "Copilot" to "Pro", while @moyaoasis suggested adding a feature for exporting Perplexity thread responses, a function not yet available but considered for future implementation.
Links mentioned:
- Tweet from Perplexity (@perplexity_ai): More on Mistral Large 👇 https://www.perplexity.ai/search/Mistral-Large-Overview-Fw.QrWxvR9e9NRuDxB1wzQ
- Discover Daily by Perplexity on Apple Podcasts: News · 2024
- Perplexity AI on Apple Podcasts: News · 2024
- Stuff You Should Know About AI on Apple Podcasts: Business · 2024
Perplexity AI ▷ #sharing (13 messages 🔥):
- Librem5 Explores BurpSuite Community Edition: @librem5 shared a Perplexity link examining the differences between BurpSuite Community Edition and an unspecified alternative.
- Muscle-Building Plan Crafted by AI: @commuting5048 requested a muscle-building plan optimized to protect the arms from over-fatigue, and shared the resulting Perplexity search. They expressed satisfaction with GPT-4's detailed workout, including sets and reps.
- Ourdigital Investigates Digital Analytics with Perplexity: @ourdigital utilized Perplexity to gather and organize information for digital analytics and performance marketing, sharing the findings in a Perplexity link.
- Exploring Mistral's Capabilities: Several users, including @manbearpig86, @rhysd21, and @dailyfocus_daily, looked into comparisons between Mistral and other models like ChatGPT, as reflected in their shared Perplexity search links, another comparison, and a StarCoder announcement.
- Podcast Prompt Crafting and AI Future Discussions: @_paradroid shared a Perplexity link for crafting a podcast prompt for "48 Hours of AI", and another link discussing Russia's preparation for future challenges, likely with AI, using a ResearchGPT prompt.
Perplexity AI ▷ #pplx-api (28 messages 🔥):
- Glitch Hunt in Text Generation: @thedigitalcat pointed out that glitches often occur when the system attempts to generate source information during text production. Other users, like @brknclock1215 and @clay_ferguson, contributed to the discussion, suggesting the issue could relate to the implementation of sources and the inference layer's approach.
- Sonar Medium's Weather Query Passion: @brknclock1215 humorously continued to test sonar-medium-online with weather-related queries, reporting inconsistent behavior related to the retrieval system and making observations about the presence of "responsive" elements in system messages.
- The Nostalgia for pplx-70b: Amidst discussions of model performance, @thedigitalcat humorously suggested that everyone will eventually agree that pplx-70b was superior to the sonar models, with @lazysucker expressing agreement.
- The API Conundrum: @jeffworthington encountered an error when using an OpenAPI definition from the provided documentation and asked whether a newer version should be referenced, indicating potential issues with the existing API definitions.
- Seeking Perplexity's API for Voice Chat: @tom_primozic inquired about using Perplexity AI's functionality through an API for a voice-chat application, noting discrepancies in response quality between the website and the sonar-medium-online model.
Links mentioned:
Getting Started with pplx-api: You can access pplx-api using HTTPS requests. Authenticating involves the following steps: Start by visiting the Perplexity API Settings page. Register your credit card to get started. This step will n…
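A pplx-api call is an OpenAI-style chat-completions request over HTTPS with a bearer token. A minimal sketch of building such a request with only the standard library; the endpoint URL follows Perplexity's OpenAI-compatible docs but should be verified against the current API reference, the `sonar-medium-online` model name comes from the discussion above, and `PPLX_API_KEY` is a placeholder:

```python
import json
import urllib.request

# Sketch of a pplx-api chat-completions request (not sent here, since a real
# API key is required). Endpoint and model name are assumptions to verify
# against the current Perplexity API reference.
PPLX_API_KEY = "pplx-..."  # placeholder; issued on the API Settings page

payload = {
    "model": "sonar-medium-online",
    "messages": [
        {"role": "system", "content": "Be precise and concise."},
        {"role": "user", "content": "What is the weather in Boston today?"},
    ],
}

req = urllib.request.Request(
    "https://api.perplexity.ai/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {PPLX_API_KEY}",
        "Content-Type": "application/json",
    },
)
# urllib.request.urlopen(req) would send the request and return JSON choices.
print(req.full_url)
```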
Eleuther ▷ #announcements (1 message):
- Launch of Foundation Model Development Cheatsheet: @hailey_schoelkopf announced the release of The Foundation Model Development Cheatsheet, a resource to assist new open model developers. The cheatsheet was a collaborative effort featuring contributors from EleutherAI, MIT, AI2, Hugging Face, and other institutions, aiming to provide an overview of resources for responsible open model development.
- The Cheatsheet Champions Open Model Pioneers: Highlighting the importance of open model development, @hailey_schoelkopf pointed to the release of fully transparent models such as the Pythia model suite by EleutherAI, Amber by the LLM360 project, and AI2's OLMo, emphasizing the growth of openly available models since April 2023.
- Focus on Dataset Documentation and Licensing: The new resource focuses on important and under-discussed areas of model development, like dataset documentation and licensing practices, which are crucial for creating open models.
- Where to Find the Cheatsheet: The Foundation Model Development Cheatsheet can be accessed as a PDF paper or viewed as an interactive website. Updates and additional context are available in their blog post and Twitter thread.
Eleuther ▷ #general (34 messages 🔥):
- Seeking Cross-Attention SSM Model: @_michaelsh inquired about models with cross-attention similar to BERT for sequence classification; @stellaathena suggested models could be trained as encoders and later mentioned StripedHyena, which alternates attention and SSM layers. @frazermcf favored adaLN0 with mamba, and although no pretrained mamba for sequence classification was readily available, it was suggested that one could train a classification head on an existing checkpoint.
- Stable Video Diffusion Inquiry: @clashluke was looking for guidance on how to train or fine-tune the Stable Video Diffusion model, aiming to retain its v-prediction while noting it uses EulerDiscrete without a get_velocity function for training.
- Understanding lm-evaluation-harness: Several users, including @slowturtle_p, @hailey_schoelkopf, and @maya_liv, discussed nuances of the lm-evaluation-harness evaluation tool, including score normalization, model substitution with custom code, and potential TensorRT support. @stellaathena provided a link to a blog post for further clarification on multiple-choice normalization.
- EleutherAI Pythia Model Status: @mistobaan asked about the status of the EleutherAI/pythia-13m model, to which @catboy_slim_ clarified that it is still available if they meant the 14m variant.
- Various Discussions and Announcements: Users like @canadagoose1 shared logistical challenges and announcements about talks, @gaindrew highlighted the abstract of a research paper introducing a 1-bit Large Language Model, @tastybucketofrice and @hailey_schoelkopf celebrated user engagement with specific datasets, and @ilovescience noted automated downloads likely from use of lm-eval-harness.
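The multiple-choice normalization question above comes down to how each candidate answer's log-likelihood is aggregated: a raw sum of token log-probs favors short answers, so harnesses also report a length-normalized variant (e.g. dividing by the answer's byte length, the basis of acc_norm in lm-evaluation-harness). A minimal sketch with made-up token log-probabilities:

```python
# Multiple-choice scoring: raw summed log-likelihood vs byte-length-normalized
# log-likelihood. The per-token log-probs below are invented for illustration.
candidates = {
    "yes": [-0.9],                             # short answer, 1 token
    "absolutely certain": [-0.5, -0.4, -0.6],  # longer answer, 3 tokens
}

def raw_score(logprobs):
    return sum(logprobs)

def byte_norm_score(answer, logprobs):
    # normalize by UTF-8 byte length so longer answers aren't penalized
    return sum(logprobs) / len(answer.encode("utf-8"))

raw_best = max(candidates, key=lambda a: raw_score(candidates[a]))
norm_best = max(candidates, key=lambda a: byte_norm_score(a, candidates[a]))
print(raw_best, norm_best)
```

The two rules pick different winners here, which is exactly why reported accuracy can differ between acc and acc_norm.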
Links mentioned:
- Multiple Choice Normalization in LM Evaluation: There are multiple ways of evaluating multiple choice tasks on autoregressive LMs like GPT-3/Neo/J. This post lays out the current prevalent normalization methods.
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits: Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single param…
- Oogway Master Oogway GIF - Oogway Master Oogway Kung Fu Panda - Discover & Share GIFs: Click to view the GIF
- Meet: Real-time meetings by Google. Using your browser, share your video, desktop, and presentations with teammates and customers.
- Issues Ā· EleutherAI/lm-evaluation-harness): A framework for few-shot evaluation of language models. - Issues Ā· EleutherAI/lm-evaluation-harness
Eleuther ▷ #research (63 messages🔥🔥):
- Open Source Models Galore: @maxmatical shared a Twitter link to some open-sourced models with accompanying data, posting a tweet from BigCodeProject.
- Pretraining Token Queries: In a discussion initiated by @leegao_ about the pretraining token-to-model-size ratio, @stellaathena clarified, "There are no rules," regarding the expected number of tokens for pretraining models. @maxmatical provided a link to a paper on arXiv discussing pretraining with constrained data.
- Navigating Mazes with Diffusion Models: @.the_alt_man highlighted a diffusion model trained to solve mazes, sharing tweets from @francoisfleuret and @ArnaudPannatier. @uwu1468548483828484 also chimed in, relating it to prior work on solving mazes with variable-depth neural networks.
- Prompt Engineering Transferability Discourse: @thatspysaspy asked whether prompt engineering transfer from small to big models has been studied; @catboy_slim_ replied with personal experience, noting that while generic prompt engineering transfers reasonably well, complex instructions tend to be tightly coupled to specific models. A systematic study with statistical measures appears to be an untapped area.
- The Challenges of Sub-8-Bit Quantization: A series of messages from @kd90138 and @clock.work_ expressed skepticism about the practicality and scaling potential of 1-bit Large Language Models, given current hardware trends and geopolitical concerns impacting chip manufacturing.
Links mentioned:
- Stable LM 2 1.6B Technical Report: We introduce StableLM 2 1.6B, the first in a new generation of our language model series. In this technical report, we present in detail the data and training procedure leading to the base and instruc…
- Language Modeling by Estimating the Ratios of the Data Distribution | Aaron Lou: no description found
- Scaling Data-Constrained Language Models: The current trend of scaling language models involves increasing both parameter count and training dataset size. Extrapolating this trend suggests that training dataset size may soon be limited by the…
- LeoLM: Igniting German-Language LLM Research | LAION: We proudly introduce LeoLM (Linguistically Enhanced Open Language…
- Tweet from François Fleuret (@francoisfleuret): We train a discrete diffusion denoising model to find paths in a maze. The visualization of the evolution of x_0|x_t (last message in the thread) is very cool IMO. ↘️ Quoting Arnaud Pannatier (@Arnau…
Eleuther ▷ #scaling-laws (3 messages):
- Inquiring About Animation Creation: @.the_alt_man asked how a certain animation was made, expressing curiosity about the method or tool used.
- imageio for GIFs: In response, @kyo_takano mentioned that imageio was used to create the GIF animation. @.the_alt_man followed up to confirm that the animation was indeed created with imageio.
Eleuther ▷ #interpretability-general (15 messages🔥):
- Matrix Norms and Products Simplified: @wendlerc explained that matrix-vector and matrix-matrix products, as well as matrix norms, are shorthand for computing and summing up important cosines. The matrix 2-norm is specifically the matrix norm associated with the vector 2-norm.
- Decoding Details in RMSNorm Implementation: @wendlerc clarified a subtle detail that their paper does not explicitly mention: the final decoding step applies an RMSNorm layer to h before the matrix multiplication. They described a computational split of this process for ease in cosine calculations between resulting expressions.
- Unpacking the Tuned Lens Decoding Process: @wendlerc and @mrgonao discussed the mechanism of decoding using a tuned lens in neural networks. They considered whether logits = U RMSNormlayer(tunedlens(h)) accurately represents the tuned lens's activity.
- Implementation Nuances of Tuned Lens and Notation: Throughout the conversation, @wendlerc addressed the practical aspects of porting their implementation to account for the tuned lens's effect, highlighting the necessity of substituting h with tunedlens(h).
- Understanding Matrix Norm Terminology: @norabelrose clarified the terminology around matrix norms, stating that the Frobenius norm is the Euclidean norm of the matrix when flattened, whereas the "2-norm" of a matrix refers to its spectral norm, i.e. its top singular value.
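The decoding step under discussion (apply RMSNorm to h, then multiply by the unembedding matrix U, substituting tunedlens(h) for h when a tuned lens is used) can be sketched in plain Python. The identity lens and toy dimensions below are purely illustrative:

```python
def rmsnorm(h, eps=1e-6):
    # RMSNorm: divide by the root-mean-square of h (no mean subtraction,
    # unlike LayerNorm); the learned gain is folded into U here.
    rms = (sum(x * x for x in h) / len(h) + eps) ** 0.5
    return [x / rms for x in h]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def tuned_lens(h, A, b):
    # A tuned lens is an affine probe mapping an intermediate hidden
    # state into the final layer's representation space.
    return [y + bi for y, bi in zip(matvec(A, h), b)]

# Toy sizes: d_model=3, vocab=4. U is the unembedding matrix.
h = [1.0, -2.0, 0.5]
U = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
A = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # identity lens, for illustration
b = [0.0, 0.0, 0.0]

logits_direct = matvec(U, rmsnorm(h))                  # U · RMSNorm(h)
logits_lens = matvec(U, rmsnorm(tuned_lens(h, A, b)))  # U · RMSNorm(lens(h))
# With the identity lens, the two decodings coincide.
assert all(abs(a - c) < 1e-9 for a, c in zip(logits_direct, logits_lens))
```

In practice A and b are trained per layer, so logits_lens differs from logits_direct at intermediate layers; the point of the substitution h → tunedlens(h) is exactly that difference.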
Eleuther ▷ #lm-thunderdome (19 messages🔥):
- Tinkering with LM Eval Harness: @paganpegasus inquired about integrating instruction/chat formatting into the LM Eval harness, or alternatively finetuning on examples with the existing eval harness formatting.
- Custom Model Modification for Hallucination Leaderboard: @pminervini shared a snippet of code from their approach to incorporating chat templates into the LM Eval harness for the hallucinations leaderboard, by extending the HFLM class.
- Awaiting Progress on Proposed Modifications: @asuglia updated @981242445696221224 on the status of modifications being identified for a project, noting other tasks had taken precedence.
- Improving Multilingual Lambada Translations: @hailey_schoelkopf mentioned that @946388490579484732 contributed new, higher-quality translations to replace the poor-quality ones, and the changes will be integrated into the eval harness. The updated dataset includes additional languages and is available on Hugging Face.
- Implementing EQ-Bench: @pbevan1 sought advice on implementing EQ-Bench, a benchmark for emotional intelligence in language models, especially for tasks that handle multiple answers for a single prompt. @hailey_schoelkopf pointed to the Truthfulqa_mc2 task as an example.
Links mentioned:
- src/backend/huggingface_generate_until.py · hallucinations-leaderboard/leaderboard at main: no description found
- GitHub - EQ-bench/EQ-Bench: A benchmark for emotional intelligence in large language models
- marcob/lambada_multilingual · Datasets at Hugging Face: no description found
Eleuther ▷ #multimodal-general (2 messages):
- Choosing Between Encoder-Decoder and Decoder-Only Models: User @jerry0478 inquired about when to use cross-attention conditioning, as seen in encoder-decoder models, compared to embedding tokens in the input for decoder-only models.
- Flamingo vs. LLaMA Architecture Decisions: @jerry0478 contrasted "llama-style" architectures with "flamingo-style" ones, probing the community for intuition on the optimal application scenarios of each.
Eleuther ▷ #gpt-neox-dev (2 messages):
- Inquiring about NeoX and Slurm: @muwnd asked for the recommended method to run NeoX with Slurm and containers, suspecting that --launcher_args might be the way but noting it seems unavailable in NeoX.
- Tip on NeoX Infrastructure: @triggerhappygandhi clarified that NeoX does not assume any specifics about the infrastructure, and containers need to be set up in advance. A Slurm script exists for using Slurm to run NeoX on multinode.
LangChain AI ▷ #general (89 messages🔥🔥):
- Seeking Confidence Score Insight: User @ritanshoo inquired about checking the confidence score when using LangChain.js for RAG. Kapa.ai did not have an immediate answer but referred to the LangChain documentation (https://js.langchain.com/docs/get_started) for further exploration.
- Contemplating Memory Integration with LCEL: Both @marknicholas and @pcube__ discussed different aspects of LangChain usage. @marknicholas wanted to add memory to LCEL, and @pcube__ asked which language integrates best with LangChain for a server using an Azure-hosted LLM as an API endpoint. Kapa.ai suggested consulting the official documentation or reaching out to the community for specific guidance.
- Handling Tool Exceptions in Custom Applications: @abinandan requested a way to retry a tool if ToolException is thrown when using a custom tool. Kapa.ai highlighted workarounds from LangChain's GitHub discussions and encouraged checking LangChain's GitHub issues for more streamlined solutions (https://github.com/langchain-ai/langchain/issues/10714).
- Using Shopify as an Automated Agent/Tool: User @erikk4 sought automation solutions for customer support tasks related to Shopify, such as checking order statuses or canceling orders. They considered "front desk" agents routing issues to specific tools and queried the community about tools beyond LangChain that might facilitate this process.
- Deployment Issues and Adding Functionality with LangChain: Users conveyed challenges with LangChain's deployment and functionality. @hanumantgarad_25732 experienced an AttributeError when using SQLDatabase.from_databricks outside a Databricks notebook. @kamakshi08 asked about using the JSON parser with LLaMA from Ollama, wondering how it integrates with multimodal models.
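A generic retry wrapper along the lines @abinandan asked about might look like the following; ToolException here is a local stand-in for LangChain's exception class, not an import of it, and the wrapper itself is a sketch rather than LangChain's built-in mechanism:

```python
class ToolException(Exception):
    """Local stand-in for LangChain's ToolException."""

def with_retries(tool_fn, max_attempts=3):
    # Re-invoke the tool when ToolException is raised, up to
    # max_attempts; on the final attempt, let the exception propagate.
    def wrapped(*args, **kwargs):
        for attempt in range(1, max_attempts + 1):
            try:
                return tool_fn(*args, **kwargs)
            except ToolException:
                if attempt == max_attempts:
                    raise
    return wrapped

# A tool that fails twice, then succeeds.
calls = {"n": 0}
def flaky_tool(query):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ToolException("transient failure")
    return f"result for {query}"

assert with_retries(flaky_tool)("ping") == "result for ping"
assert calls["n"] == 3  # two failures, one success
```

For exponential backoff or retrying only specific error messages, the except branch is the natural place to add that logic.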
Links mentioned:
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- Revolutionizing AI Interactions: Integrating Function Calling with Mistral: Introduction
- Querying a SQL DB | 🦜️🔗 Langchain: We can replicate our SQLDatabaseChain with Runnables.
- JSON parser | 🦜️🔗 Langchain: This output parser allows users to specify an arbitrary JSON schema and
- Docusaurus | 🦜️🔗 Langchain: Docusaurus is a static-site generator which
- Custom Agent Class fails with object has no attribute 'is_single_input' · Issue #18292 · langchain-ai/langchain: Checked other resources I added a very descriptive title to this issue. I searched the LangChain documentation with the integrated search. I used the GitHub search to find a similar question and di…
- Groq: Insanely Fast Inference 🚀 | World's First Language Processing Unit (LPU): In this video, I will explain about Groq who introduced World's first Language Processing Unit (LPU) designed for AI applications (LLMs). I will show you how…
- Deployment | 🦜️🔗 Langchain: In today's fast-paced technological landscape, the use of Large Language Models (LLMs) is rapidly expanding. As a result, it is crucial for developers to understand how to effectively deploy thes…
- langchainjs/langchain/src/retrievers/score_threshold.ts at e24d2dedbe7ff93db33a5809e604143d60113028 · langchain-ai/langchainjs: 🦜🔗 Build context-aware reasoning applications 🦜🔗. Contribute to langchain-ai/langchainjs development by creating an account on GitHub.
- Issues · langchain-ai/langchain: 🦜🔗 Build context-aware reasoning applications. Contribute to langchain-ai/langchain development by creating an account on GitHub.
- GenAI Summit San Francisco 2024: This summit is an extraordinary convergence of the brightest minds in Generative AI, encapsulating the spirit of the future. #AI_ARE_ALL
LangChain AI ▷ #langserve (3 messages):
- LangServe Agent Troubles: @thatdc reported an issue where their agent does not return the intermediate steps of execution when served via LangServe, although it works fine when invoked directly from the agent class. They deduced the problem might be with the API server set up by LangServe.
- Deep Dive into the Tech Snag: @thatdc believes they found the problem in the RemoteRunnable object, where the _decode_response method seems to lose the intermediate steps when executing serializer.loadd(obj["output"]). They're in search of a workaround for this issue.
LangChain AI ▷ #langchain-templates (2 messages):
- Invitation to Join the Discord Party: @davisson0429 posted a Discord invite link for users to join, accompanied by a lengthy series of separator characters.
- Seeking Python Template Wisdom: @tigermusk inquired about generating a template in Python code that resembles the one found at the Smith LangChain Chat JSON Hub.
Links mentioned:
- LangSmith: no description found
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
LangChain AI ▷ #share-your-work (4 messages):
- "LangChain in your Pocket" Hits the Shelves: User @mehulgupta7991 celebrated the listing of their debut book "LangChain in your Pocket" under Google's Best books on LangChain.
- Flood of Discord Invites: @davisson0429 shared an invite link to a Discord server with a string of obscured characters following the URL and an @everyone tag, possibly indicating a call to join.
- Calling All Learners: User @silvermango9927 shared a Google Form link soliciting feedback on interest in various topics such as Machine Learning, Data Science, and Web Development, as part of a validation process for a project they are considering.
- Voices of the Future: @beaudjango introduced "Pablo," an AI Voice Chat app that supports multiple LLMs and voices without the need for typing, inviting beta testers to join with an offer of free AI credits. They mentioned looking for engineers willing to join their team using LangChain.
Links mentioned:
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- Join the Pablo - AI Voice Chat beta: Available on iOS
- Product Idea Validation Form: Hi, thank you so much for filling in this form and giving a response. The idea: Creating a lab (course) that teaches in a project-based manner compared to all of the conventional longer video-heavy…
LangChain AI ▷ #tutorials (4 messages):
- Question on LangGraph Capabilities: User @tigermusk inquired whether workflow.compile() is a runnable object in LangGraph.
- Spam Alert: @davisson0429 posted an unrelated, spammy invite link to an external Discord server, filled with severe text repetition.
- Groq's LPU Breakthrough Showcased: @datasciencebasics shared a YouTube video titled "Groq: Insanely Fast Inference 🚀 | World's First Language Processing Unit (LPU)" highlighting the introduction of the world's first Language Processing Unit designed for AI applications, showcasing its potential for LLMs.
- LangGraph + YahooFinance Tutorial: @tarikkaoutar provided a video guide explaining how to create an AI stock analysis chatbot using LangGraph, function calling, and YahooFinance, enhancing understanding of multi-agent applications.
Links mentioned:
- Join the ONE PERCENT CLUB Discord Server!: Check out the ONE PERCENT CLUB community on Discord - hang out with 16193 other members and enjoy free voice and text chat.
- LangGraph + Function Call + YahooFinance = Multi-Agent Application: #chatbot #animation #trading #ai #machinelearning #datascience In this video, you will make an AI stock analysis chatbot with LangGraph, Function call and C…
- Groq: Insanely Fast Inference 🚀 | World's First Language Processing Unit (LPU): In this video, I will explain about Groq who introduced World's first Language Processing Unit (LPU) designed for AI applications (LLMs). I will show you how…
OpenAccess AI Collective (axolotl) ▷ #general (44 messages🔥):
- Trouble in Jupyter Town: @nruaif shared a log indicating issues with Jupyter notebooks, showing error messages related to extensions being linked and a bad config encountered during initialization. @nanobitz chimed in asking if it was a template or Jupyter issue.
- BitNet b1.58 Makes Waves: @_dampf shared an arXiv paper on BitNet b1.58, a 1-bit LLM that promises significant cost-efficiency with performance matching full-precision models. @nanobitz mentioned it's not just a quantization method but a new architecture.
- Axolotl User Survey Outreach: @caseus_ is seeking feedback through a questionnaire to improve understanding of axolotl users. @dreamgen suggested making the form more concise to get more responses.
- Mistral Office Hours Announcement: @casper_ai shared an invite to the next Mistral AI office hour.
- Alpaca Formatting for Inference: @j_sp_r inquired about formatting inference prompts to match the training instruction format, and @caseus_ responded that specifying chat_template: alpaca in the axolotl YAML will handle it.
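As a sketch of the fix @caseus_ described, the line in question sits at the top level of the axolotl YAML; every field here other than chat_template: alpaca is illustrative (the model and dataset path are placeholders, not a recommendation):

```yaml
# Hypothetical minimal axolotl config fragment.
base_model: mistralai/Mistral-7B-v0.1
chat_template: alpaca   # format inference prompts like the training data
datasets:
  - path: my_dataset.jsonl   # placeholder dataset
    type: alpaca
```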
Links mentioned:
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits: Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single param…
- TinyBox packs a punch with six of AMD's fastest gaming GPUs repurposed for AI - new box uses Radeon 7900 XTX and retails for $15K, now in production: Startup wants to offer high AI performance using Radeon RX 7900 XTX.
- Reddit - Dive into anything: no description found
- Axolotl End User Questionnaire: no description found
OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (9 messages🔥):
- KTO Trainer Implementation Inquiry: @giftedgummybee shared a link to Hugging Face's documentation on the Kahneman-Tversky Optimization (KTO) Trainer and asked @257999024458563585 whether there are any plans to implement it. @caseus_ responded affirmatively, suggesting they might work on it the following week unless someone else takes it up earlier.
- Sophia: A Speedy Optimizer: @casper_ai discussed the potential of the Sophia optimizer being twice as fast as Adam-family algorithms and supplied a link to an implementation (not torch) of Sophia, highlighting its efficiency advantage over traditional optimization methods.
- Innovative Training with DropBP: @suikamelon brought up a study on Dropping Backward Propagation (DropBP), which reduces the computational cost of neural network training while preserving accuracy by dropping layers during backward propagation.
- Starcoder2 Training Support: @faldore inquired about support for Starcoder2, providing a link to its GitHub repository.
Links mentioned:
- DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation: Training deep neural networks typically involves substantial computational costs during both forward and backward propagation. The conventional layer dropping techniques drop certain layers during tra…
- KTO Trainer: no description found
- Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training: Given the massive cost of language model pre-training, a non-trivial improvement of the optimization algorithm would lead to a material reduction on the time and cost of training. Adam and its variant…
- levanter/src/levanter/optim/sophia.py at main · stanford-crfm/levanter: Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax - stanford-crfm/levanter
- GitHub - bigcode-project/starcoder2: Home of StarCoder2!
OpenAccess AI Collective (axolotl) ▷ #general-help (22 messages🔥):
- Pondering Plausible Intentions: @nafnlaus00 floated the idea of prompting a sophisticated language model to generate intentionally wrong answers that seem plausible but contain flaws leading to incorrect conclusions, though no further discussion ensued.
- Tool Swap Troubles: @stoicbatman contemplated switching from Runpod to Vast AI due to cost concerns and sought the community's comparative experience; @nanobitz responded noting that although cheaper, Vast AI doesn't abstract away machine details and offers variable machine quality.
- Confusing Commit Conundrums: @karisna expressed disappointment that their commit rewriting documentation for axolotl wasn't accepted and pointed out a possible oversight in that the WSL2 setup for Windows isn't sufficiently emphasized; @nanobitz replied, looking to clarify whether the documentation issue had been addressed.
- Benchmarks for the Brainy: @jovial_lynx_74856 inquired about running benchmarks on a model finetuned with Axolotl, and @nanobitz suggested looking at lm_eval_harness on GitHub, affirming there's no direct integration for benchmarking within Axolotl itself.
- Save Setting Snafu: Concerned about a saving discrepancy, @duke001. asked why setting saves_per_epoch to 4 and num_epochs to 4 resulted in only 4 checkpoints instead of the expected 16; @nanobitz hinted at a resolution, suggesting an adjustment to the save limit.
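The arithmetic behind the surprise above is 4 saves/epoch × 4 epochs = 16 saves, yet only 4 checkpoints remained, which is consistent with a save limit of 4. The simulation below assumes the usual Hugging Face Trainer behavior for save_total_limit (evict the oldest checkpoint once the limit is exceeded); it is a sketch of that mechanism, not axolotl's code:

```python
def checkpoints_kept(num_epochs, saves_per_epoch, save_total_limit):
    # Simulate a rolling checkpoint window: every save beyond the
    # limit evicts the oldest surviving checkpoint.
    kept = []
    for step in range(num_epochs * saves_per_epoch):
        kept.append(step)
        if save_total_limit is not None and len(kept) > save_total_limit:
            kept.pop(0)
    return kept

# 16 saves total, but a limit of 4 keeps only the last 4 checkpoints.
assert len(checkpoints_kept(4, 4, save_total_limit=4)) == 4
# With no limit, all 16 survive.
assert len(checkpoints_kept(4, 4, save_total_limit=None)) == 16
```

Raising (or unsetting) the save limit is therefore the adjustment @nanobitz hinted at.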
Links mentioned:
axolotl/src/axolotl/core/trainer_builder.py at 6b3b271925b2b0f0c98a33cebdc90788e31ffc29 · OpenAccess-AI-Collective/axolotl: Go ahead and axolotl questions. Contribute to OpenAccess-AI-Collective/axolotl development by creating an account on GitHub.
OpenAccess AI Collective (axolotl) ▷ #community-showcase (11 messages🔥):
- Mistral Model Rivals ChatGPT 3.5: @le_mess shared that their 7B Mistral model matches the performance of ChatGPT 3.5 on Danish tasks.
- Performance Strengthens Through Iterative Training: @le_mess improved their models by using a synthetic-data approach and training over 30 iterations, enhancing responses over time without relying on GPT-4.
- Initial Human Curation Leads to Scalable Model Training: @le_mess curated the first 1000 responses manually, then employed models to generate more data. Subsequent models were trained to identify high-quality responses for further training cycles.
LlamaIndex ▷ #blog (4 messages):
- Groq Accelerates LlamaIndex: The @GroqInc LPU now officially integrates with LlamaIndex and supports llama2 and Mixtral models for efficient LLM generation. They announced this development with a cookbook guide for streamlining application workflows.
- LlamaParse Sees Soaring Usage: @llama_index reports significant usage of LlamaParse, leading to important updates, such as working towards uncapped self-serve usage and temporarily increasing the usage cap from 1k pages. Details can be found at this update link.
- Optimizing Hybrid Search with LLMs: A new strategy for better retrieval in hybrid search uses LLMs to categorize queries with few-shot examples and subsequently adjust the alpha parameter. @llama_index shares insights into this approach in their latest tweet.
- RAG for Structured and Unstructured Data: @llama_index introduced a blog post by @ClickHouseDB showcasing a RAG architecture suited for queries involving both unstructured and structured data housed in the same database. Interested readers can delve into this integration here.
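The alpha-adjustment idea for hybrid search above amounts to a weighted fusion of dense (vector) and sparse (keyword) relevance scores, with an LLM classifier choosing the weight per query. A minimal sketch; the per-category alphas and the score fields are hypothetical values an LLM classifier might emit, not LlamaIndex's API:

```python
def hybrid_score(vector_score, keyword_score, alpha):
    # alpha=1.0 -> pure semantic search; alpha=0.0 -> pure keyword search.
    return alpha * vector_score + (1 - alpha) * keyword_score

# Hypothetical per-category alphas an LLM query classifier might choose.
ALPHA_BY_QUERY_TYPE = {
    "semantic": 0.9,  # conceptual questions favor embeddings
    "keyword": 0.2,   # exact identifiers/names favor lexical match
    "mixed": 0.5,
}

def rank(docs, query_type):
    a = ALPHA_BY_QUERY_TYPE[query_type]
    return sorted(docs, key=lambda d: hybrid_score(d["vec"], d["kw"], a),
                  reverse=True)

docs = [
    {"id": 1, "vec": 0.9, "kw": 0.10},  # semantically close
    {"id": 2, "vec": 0.2, "kw": 0.95},  # exact keyword hit
]
assert rank(docs, "semantic")[0]["id"] == 1
assert rank(docs, "keyword")[0]["id"] == 2
```

The few-shot examples in the strategy exist only to make the classifier's query-type decision reliable; the retrieval math itself stays this simple.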
LlamaIndex ▷ #general (75 messages🔥🔥):
- Exploring LlamaIndex Documentation Indexing: @vaguely_happy proposed setting up a service to index the latest LlamaIndex docs, which prompted @cheesyfishes to mention Mendable on the docs and @whitefang_jr to note that LlamaParse does not currently send page numbers, though work is in progress to add page numbers and labels.
- Clarification on Callbacks in Golang: As @sansmoraxz questioned the use of CallbackHandler with native types, @cheesyfishes assured that a refactor of callbacks is in progress and advised holding off on concerns for the moment due to expected improvements.
- Debating Reranker Models: In a discussion initiated by @richard1861 regarding whether Colbert or Cohere is the superior reranking model, @.sysfor shared code and suggested using the FlagEmbeddingReranker and CohereReranker together, despite having no formal metrics to compare their performance.
- Visualizing ReActAgent Pipelines/DAGs: @mrpurple9389 inquired about visualizing the graph for ReActAgent, and while @cheesyfishes clarified that ReActAgent lacks a visual graph, @mrpurple9389 further explored visualizing the agent if replicated using pipelines/DAGs.
- Discussions on LlamaIndex vs. Langchain and Compatibility: @tr1ckydev sought clarification on the differences between LlamaIndex and Langchain, with @cheesyfishes explaining that LlamaIndex focuses on connecting data to LLMs while Langchain is a more comprehensive library. Follow-up queries covered compatibility, indicating that LlamaIndex can be integrated with various vector databases and LLM platforms.
Links mentioned:
- Introducing LlamaCloud and LlamaParse - LlamaIndex, Data Framework for LLM Applications: LlamaIndex is a simple, flexible data framework for connecting custom data sources to large language models (LLMs).
- Arize Phoenix - Phoenix: no description found
- Ollama - Llama 2 7B - LlamaIndex 🦙 v0.10.14: no description found
LlamaIndex ▷ #ai-discussion (5 messages):
- Model Decay Woes: User @.sysfor expressed concern that their models have been generating insane responses recently, questioning whether models decay over time, on the hypothesis that nothing else has changed in the setup.
- Cheesyfishes to the Rescue: @cheesyfishes clarified that models do not decay over time, but longer inputs, or inputs not structured as instructions, could lead to issues with the model's responses.
- Observable Decline in Fine-tuned Performance: Further to the decay question, @.sysfor noticed issues specifically with the "better" fine-tuned models while running tests to compare against baseline models.
OpenRouter (Alex Atallah) ▷ #general (49 messages🔥):
- Claude Models Prompt Errors: @quentmaker reported an error when a chat has more than 8 alternating messages between user and assistant, affecting various of Anthropic's Claude models. @louisgv acknowledged the issue and promised a fix is in the works.
- OpenRouter Addressing Turn Order Issues: @alexatallah suggested a temporary workaround for the prompt issue: change the first assistant message to a system message. Meanwhile, development is underway to handle conversations that begin with a message from the assistant.
- Rate Limit Discussions for OpenRouter: @gunpal5_43100 inquired about rate limits when using OpenRouter for generating large numbers of articles. @alexatallah clarified that each user with their own API key has separate rate limits, which cumulatively should provide sufficient throughput.
- Caching Concerns with Mistral: Several users, including @natefyi_30842 and @spaceemotion, observed similarities in responses when repeating prompts to Mistral models, leading to speculation about caching behavior by the API. @alexatallah confirmed that Mistral's API might cache queries.
- Compatibility with Prepaid Cards: @fakeleiiku asked about OpenRouter's support for prepaid cards, particularly those provided by e-wallet apps. @louisgv indicated that while some prepaid cards might work, virtual cards from unsupported banks might not be accepted due to Stripe's fraud-prevention measures.
Links mentioned:
- OpenRouter: Build model-agnostic AI apps
CUDA MODE ▷ #triton (10 messages🔥):
- Benchmark Script Enhanced: @hdcharles_74684 improved a benchmark script for comparing Triton kernel performance, which could be beneficial for int8 weight-only linear kernels potentially outperforming cuBLAS at batch sizes greater than 1, impacting sdxl-fast. The script is available on GitHub and contains various kernels, including a fast kernel for bs=1, int4 tinygemm, and a uint4x2 Triton kernel.
- PR to cuda-mode/lectures Suggested: @marksaroufim suggested @hdcharles_74684 make a pull request to the cuda-mode lectures repository on GitHub to make the benchmark script easily accessible.
- Potential Triton Optimizations Discussed: @chhillee mentioned that torch.compile can efficiently handle a batch size of 2, which could alleviate the main bottleneck in question.
- Tensor Performance Fixed on Radeon: @iron_bound reported a significant improvement in tensor performance on the Radeon RX 7900 XTX graphics card after fixing an issue with WMMA hooks in mlir/llvm.
- Debugging Issue with Triton Versions: @kierandidi encountered an issue with the Triton debugger in versions 3.0.0 and 2.2.0 regarding the interpret argument. @andreaskoepf and @marksaroufim confirmed that the argument was deprecated and suggested setting the TRITON_INTERPRET environment variable as a workaround.
- Feedback on Triton's Stability: @andreaskoepf shared experiences of instabilities with Triton compared to CUDA, citing unexplained segfaults and inconsistent results. @marksaroufim requested an example comparing the situations before and after the segfaults, following similar feedback observed on Twitter.
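The suggested workaround for the deprecated interpret argument can be sketched as follows; the variable must be set before Triton is imported, since interpreter mode is read at import time (the import is left commented so the snippet runs without Triton installed):

```python
import os

# Replacement for the deprecated `interpret=True` kernel argument:
# enable Triton's CPU interpreter via an environment variable, set
# BEFORE `import triton`.
os.environ["TRITON_INTERPRET"] = "1"

# import triton  # kernels would now run through the interpreter,
#                # so they can be stepped through with pdb/print.
assert os.environ["TRITON_INTERPRET"] == "1"
```

Setting the variable in the shell (`TRITON_INTERPRET=1 python script.py`) achieves the same thing without touching the code.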
Links mentioned:
- GitHub - cuda-mode/lectures: Material for cuda-mode lectures: Material for cuda-mode lectures. Contribute to cuda-mode/lectures development by creating an account on GitHub.
- script for comparing performance of several linear triton kernels across several shapes: script for comparing performance of several linear triton kernels across several shapes - linear_triton_kernels.py
CUDA MODE ▷ #cuda (6 messages):
- Inquiry about GPU Intrinsics: User @drexalt asked whether a claim made in a tweet was true, seeking clarification from fellow CUDA MODE Discord members.
- Response to FP8 Intrinsics Query: @zippika clarified that the claim in question was false and provided a link to the CUDA math API docs, which still list FP8 intrinsics.
- Clarifying the Purpose of FP8: @zippika underlined that FP8 serves mainly as a data format rather than being extensively used for computation.
Links mentioned:
CUDA Math API :: CUDA Toolkit Documentation: no description found
CUDA MODE ▷ #torch (13 messages🔥):
- No Appetite for Polyhedral: @chhillee expresses skepticism about the utility of polyhedral compilation for optimizing sharding in deep learning, suggesting that the key question is defining the cost function.
- Search Space Skepticism: In a discussion with @andreaskoepf, @chhillee likens the challenge of finding optimal shardings in deep learning to the ongoing development of new ML architectures.
- Contemplating Optimal Mappings: @gogators. muses that the space of valid mappings from deep learning programs to hardware may be smaller and less complex than the space of all possible deep learning programs.
- DL Program Optimization Not So Trivial: @gogators. backtracks from describing the process of finding efficient mappings of deep learning computations as "trivial," while expressing surprise if top AI institutions aren't already investigating this area.
- Debating Deep Learning Computability: @telepath8401 humorously challenges @gogators.'s initial use of "trivial," prompting a clarification about the feasibility of optimizing operation mappings given the homogeneity and explicit dependencies of deep learning operators.
CUDA MODE ▷ #ring-attention (15 messages🔥):
- New Ring Attention Implementations: @andreaskoepf shared lucidrains' implementation of Ring Attention with custom Triton kernels and proposed comparing its correctness and performance with another implementation by zhuzilin.
- Backward Pass Bug Hunt: @andreaskoepf mentioned that Phil pointed out an issue with the backward pass, which might need fixing, as discussed in this GitHub issue.
- GPU Compatibility Troubles: @nthanhtam. and @jamesmel reported problems when running the Ring Attention implementation on GPUs, while @ericauld noted the assertion script works on CPU.
- Code Inconsistencies and Errors: @ericauld observed multiple errors in the code when trying to run it with Melvin's suggestions, such as typos and missing imports, which led to additional Triton-related issues.
- Commit History Suggests Problems: @iron_bound hinted that something might have broken in lucidrains' Ring Attention implementation, referring to the commit history on GitHub.
Links mentioned:
- GitHub - lucidrains/ring-attention-pytorch: Explorations into Ring Attention, from Liu et al. at Berkeley AI
- Commits · lucidrains/ring-attention-pytorch: Explorations into Ring Attention, from Liu et al. at Berkeley AI
- A ring attention with flash attention kernel implementation · Issue #4 · lucidrains/ring-attention-pytorch: Hi! Thank you for your work on implementing the ring attention in pytorch! I've just tried to implement a ring_flash_attn_qkvpacked_func (corresponding to flash_attn_qkvpacked_func in flash attent…
- Compare ring-flash-attention & ring-attention-pytorch · Issue #11 · cuda-mode/ring-attention: lucidrains & zhuzilin were hard working the last days and have completed the following two ring-attention implementations: lucidrains/ring-attention-pytorch zhuzilin/ring-flash-attention Create a …
Interconnects (Nathan Lambert) ▷ #news (10 messages🔥):
- Arthur Mensch Sets the Record Straight: @arthurmensch clarified misconceptions about their recent announcements, reiterating the commitment to open-weight models with 1.5k H100s, a reselling agreement with Microsoft, and maintaining independence as a European company with global ambitions. He highlighted the growing interest in Le Chat and Mistral Large on La Plateforme and Azure, with a plan to iterate quickly. Check out the clarifications.
- Nathan Endorses Public Clarifications: After the tweet from @arthurmensch, @natolambert expressed approval, describing the act of providing such public clarifications on social media as "def legit vibes".
- Announcing StarCoder2 and The Stack v2: @BigCodeProject launched StarCoder2, a model trained with a 16k token context and repository-level information for 4T+ tokens, built on The Stack v2, which contains over 900B tokens. The code, data, and models are fully open and available, marking a significant contribution to the community. Discover StarCoder2.
- Meta Prepares to Launch Llama 3: A tweet from @Reuters reported that Meta plans to release a new AI language model dubbed Llama 3 in July, which could signify another major competition in the AI field. The details were reported by The Information. Read more from Reuters.
- Gemini 1.5 Pro with Extended Context Coming to Nathan: @natolambert announced excitement about getting access to Gemini 1.5 Pro with a 1 million token context, planning to use it for processing podcasts and other content, and mentioned a potential article workshop based on the experience, if there's interest.
Links mentioned:
- Tweet from BigCode (@BigCodeProject): Introducing: StarCoder2 and The Stack v2. StarCoder2 is trained with a 16k token context and repo-level information for 4T+ tokens. All built on The Stack v2 - the largest code dataset with 900B+ t…
- Tweet from Arthur Mensch (@arthurmensch): Clarifying a couple of things since we're reading creative interpretations of our latest announcements: - We're still committed to leading open-weight models! We ask for a little patience, 1.5k H100s …
- Tweet from Reuters (@Reuters): Meta plans launch of new AI language model Llama 3 in July, The Information reports http://reut.rs/3TgBgFJ
Interconnects (Nathan Lambert) ▷ #random (30 messages🔥):
- Nathan Lambert Tunes into Demis Hassabis: @natolambert shared an episode of a podcast with Demis Hassabis, CEO of Google DeepMind, discussing superhuman AI scaling, AlphaZero training atop LLMs, and AI governance. The podcast can be watched on YouTube or listened to on platforms like Apple Podcasts and Spotify.
- Considering Openness in AI Discussions: @natolambert and @mike.lambert discussed the merits of having open conversations about completely open AI and the differences in mental models as opposed to conversations on platforms like Twitter.
- Name Coincidence Among Users: User @xeophon. inquired if @natolambert and @mike.lambert were related due to the similarity of their last names; it was confirmed to be a coincidence.
- Anthropic Association Confirmation: @mike.lambert confirmed employment at Anthropic and took a stance on sharing information in the chat, indicating a preference to engage in discussions as themselves, not as a representative of their employer.
- The Quest for the LAMB Emoji: @natolambert humorously lamented the lack of an appropriate emoji for "LAMB," expressing frustration with the search results pointing to a steak emoji 🥩.
Links mentioned:
Demis Hassabis - Scaling, Superhuman AIs, AlphaZero atop LLMs, Rogue Nations Threat: "scaling is an artform"
LLM Perf Enthusiasts AI ▷ #gpt4 (2 messages):
- Inquiry About Benchmark Automation: @ampdot asked if a benchmark is available as an automated script, showing interest in trying out such a tool.
- Enthusiasm for Benchmark Automation: @dare.ai also expressed interest in the automated benchmark script and is looking forward to trying it out, tagging <@757392677280022549> for a potential response.
LLM Perf Enthusiasts AI ▷ #opensource (4 messages):
- Anticipated Spring Launch for Llama 3: User @res6969 expressed that their expectation was for Llama 3 to be released in spring, suggesting that the current timeline is later than anticipated.
- Possible Last-Minute Improvements for Llama 3: @potrock expressed hope that the delay of Llama 3 might be due to a last-minute attention update, hinting at improvements that could be included in the release.
- Enthusiasm for Gemini Ring Attention: @potrock mentioned that incorporating Gemini-style ring attention would be a cool feature for Llama 3, indicating interest in this specific attention mechanism.
LLM Perf Enthusiasts AI ▷ #offtopic (1 message):
- Time Crunch for LLM Testing: User @jeffreyw128 expressed a desire to test new LLMs but emphasized the significant effort required to "get a good vibe check on each" due to time constraints.
LLM Perf Enthusiasts AI ▷ #openai (3 messages):
- ChatGPT Search Update Rumors: @jeffreyw128 mentioned rumors that OpenAI might be updating their web search in ChatGPT this week, seeking confirmation from others.
- In Search of OpenAI Insights: User @res6969 acknowledged not having heard such rumors and expressed a need to find better sources for OpenAI-related information.
- Looking for Code Interpreter Production Resources: @res6969 inquired if anyone had resources on using Code Interpreter in production environments, indicating an interest in practical applications.
DiscoResearch ▷ #general (6 messages):
- DiscoLM Template Clarification: User @bjoernp pointed out the importance of using the DiscoLM template for chat context tokenization, referencing the Hugging Face documentation on chat templating.
- Issues with llamaindex Chunker for Code: @sebastian.bodza reported that the llamaindex chunker for code was significantly malfunctioning, producing one-liners and disregarding the chunk_lines option.
- Sanity Check on Training German RAG Models: @johannhartmann is creating a German dataset for Retrieval-Augmented Generation (RAG) tasks, utilizing Deutsche Telekom's Wikipedia content-question pairs, and sought feedback on the approach to improve the reliability of German-speaking Mistral 7B models.
- Goliath versus DiscoLM for German Language Tasks: @philipmay questioned whether Goliath is the superior model for German language skills and shared a link to its model card on Hugging Face. The discussion evolved with @johannhartmann suggesting that DiscoResearch/DiscoLM-120b might perform better due to its training on German content.
- Advice on Generating Negative Samples for Datasets: @philipmay suggested a successful method to generate negative samples by directing a language model to alter given answers to be factually incorrect, for the purposes of building a more effective dataset for RAG training.
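The negative-sampling idea in the last bullet boils down to a single rewrite prompt per (question, answer) pair. A hedged sketch of what such a prompt builder could look like — the wording below is an assumption for illustration, not @philipmay's actual prompt, and the model call itself is left out:

```python
def negative_sample_prompt(question, correct_answer):
    """Build a prompt asking an LLM to corrupt a correct answer into a
    plausible-but-wrong one, usable as a hard negative for RAG training.
    Prompt wording is illustrative, not the method from the discussion."""
    return (
        "Rewrite the answer below so that it becomes factually incorrect, "
        "while keeping its style, length, and topic.\n\n"
        f"Question: {question}\n"
        f"Correct answer: {correct_answer}\n\n"
        "Incorrect answer:"
    )

# Feeding this to any instruction-tuned model yields the negative sample;
# the dataset row then pairs (question, correct_answer, model_output).
prompt = negative_sample_prompt(
    "Wann wurde die Deutsche Telekom gegründet?",
    "Die Deutsche Telekom wurde 1995 gegründet.",
)
```

Keeping style and length constant matters: it forces the retriever or reranker being trained to discriminate on factual content rather than on surface cues.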
Links mentioned:
- alpindale/goliath-120b · Hugging Face: no description found
- Templates for Chat Models: no description found
DiscoResearch ▷ #discolm_german (1 message):
- German Prompts in EQ-Bench: @crispstrobe shared that EQ-Bench now supports German prompts, showing strong correlation with various benchmarks like MMLU and Arena Elo. Link to the GitHub pull request is here.
- GPT-4 Leads in Performance: According to a comparison shared by @crispstrobe, GPT-4-1106-preview scored 81.91 in the EQ-Bench German prompts evaluation, outperforming other models including GPT-3.5, various Mistral versions, and discolm-german-laser.
- Evaluating German Language Models: The message lists EQ-Bench scores for different models, highlighting that even a model like german-assistant-v7 has a score of 35.48, which could indicate a baseline for German language model performance.
- Translation Scripts Included: @crispstrobe also mentioned including translation scripts with the benchmarks, stating that these were set up quickly and have the potential for further improvement, such as manual review by a student.
- Automatic Translation with GPT-4: The German prompts were automatically translated using ChatGPT-4-turbo, showing that sophisticated models can facilitate the translation of test or training sets, a process that can be adapted or changed to other translation services like "free Gemini".
Links mentioned:
Build software better, together: GitHub is where people build software. More than 100 million people use GitHub to discover, fork, and contribute to over 420 million projects.
Datasette - LLM (@SimonW) ▷ #ai (4 messages):
- Struggle Against Verbose JSON Responses: User @dbreunig mentioned the frequent need to clean up noisy JSON responses but did not elaborate on the specific methods or functions used.
- Tackling Claude's Introductory Phrases: User @justinpinkney shared a tip on how to avoid intro sentences like "Sure, here's a…" from Claude by controlling the response's initial characters, referencing Anthropic's documentation. They suggested starting with <rewrite> or forcing the response to begin with {.
- Claude's Tenacious Explanations: User @derekpwillis acknowledged trying various methods to make Claude deliver less verbose outputs, such as forcing the AI to start with {, yet Claude persists in providing explanations before the actual content.
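The initial-characters trick above works by ending the conversation with a partial assistant turn, so the model continues from those characters instead of writing a preamble. A minimal sketch of the payload shape (no API call is made here; with the Anthropic SDK you would pass this list as `messages=` to `client.messages.create(...)` and prepend the prefill to the returned text yourself):

```python
def prefilled_messages(user_prompt, prefill="{"):
    """Messages list ending in a partial assistant turn; the model's reply
    continues from `prefill`, suppressing "Sure, here's a..." intros."""
    return [
        {"role": "user", "content": user_prompt},
        # The trailing assistant turn is the prefill; Claude completes it.
        {"role": "assistant", "content": prefill},
    ]

msgs = prefilled_messages("Return the user Ada (id 1) as a JSON object.")
```

As the last bullet notes, this isn't bulletproof: the prefill constrains only the first characters, so pairing it with an explicit "respond with JSON only, no explanation" instruction in the user turn tends to work better than either alone.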
Links mentioned:
Ask Claude for rewrites: If Claude gives a response that is close to, but not quite what you're looking for, you can ask Claude to rewrite it. In Slack this can be as simple as telling Claude to "Try again" aft…
Skunkworks AI ▷ #off-topic (1 message):
pradeep1148: https://www.youtube.com/watch?v=ikIgy0qlif8&feature=youtu.be
Skunkworks AI ▷ #general (1 message):
- Recruitment Inquiry in DMs: User @.papahh reached out to @1117586410774470818 with a direct message, hinting at a potential job opportunity and expressing interest in the recipient's participation.
Alignment Lab AI ▷ #looking-for-collabs (1 message):
- Exploring the Roots of Cross-Species Values: @taodoggy is seeking collaborators for a project aiming to understand the biological and evolutionary origins of values shared across species, refine the definition of values, and analyze how these are expressed in various cultures. They provided a brief overview with a Google Docs link.
Links mentioned:
Uncovering the Origins of Values: A Biology and Cognition-Based Approach for AI Alignment: no description found
AI Engineer Foundation ▷ #general (1 message):
- AI Engineer Recruitment Advice Sought: User @peterg0093 is looking to start recruiting AI engineers in the UK and requests examples of good job descriptions to avoid deviating from standard language in the field. He encourages users to reach out if they have useful references or resources.