Onetime IRL callout: If you're in SF, join Dylan Patel (aka "that semianalysis guy" who wrote the GPU Rich/Poor essay) for a special live Latent Space event tomorrow. Our first convo was one of last year's most-referenced eps.
As hinted last year, HuggingFace/BigCode has finally released StarCoder v2 and The Stack v2. Full technical report here.
StarCoder 2: SOTA for size (3B and 15B)
StarCoder2-15B is a 15B-parameter model trained on 600+ programming languages from The Stack v2, with opt-out requests excluded. The model uses Grouped Query Attention, a context window of 16,384 tokens with a sliding window attention of 4,096 tokens, and was trained using the Fill-in-the-Middle objective on 4+ trillion tokens.
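For readers who want to exercise the Fill-in-the-Middle objective at inference time, here is a minimal sketch of the prompt layout. It assumes the StarCoder-family sentinel tokens (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`) used by StarCoder v1; verify them against the StarCoder2 tokenizer config before relying on this.

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a fill-in-the-middle prompt in prefix-suffix-middle (PSM) order.

    The sentinel token names follow the StarCoder v1 convention (an assumption
    here); the model is expected to generate the missing middle span after
    the final sentinel.
    """
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

# Ask the model to fill in the body of `add`:
prompt = build_fim_prompt("def add(a, b):\n    return ", "\n\nprint(add(1, 2))")
```

The completion the model returns is the "middle" span, which you splice between your prefix and suffix.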
Since it was only just released, the best source for evals is BigCode itself for now:
The Stack v2: 10x bigger raw, and 4.5x bigger deduped (900B Tokens)
We are experimenting with removing the Table of Contents as many people reported it wasn't as helpful as hoped. Let us know if you miss the TOCs, or they'll be gone permanently.
AI Twitter Summary
AI and Machine Learning Discussions
- François Chollet remarks on the nature of LLMs, emphasizing that output mirrors the training data, capturing human thought patterns.
- Sedielem shares extensive thoughts on diffusion distillation, inviting community feedback on the blog post.
- François Chollet differentiates between current AI capabilities and true intelligence, focusing on the efficiency of skill acquisition.
- Stas Bekman raises concerns about the ML community's dependency on a single hub for accessing weight copies, suggesting the need for a backup hub.
Executive Shifts and Leadership
- Saranormous highlights leadership change at $SNOW, welcoming @RamaswmySridhar as the new CEO and applauding his technical and leadership expertise.
Technology Industry Updates
- DeepLearningAI rounds up this week's AI stories, including Gemini 1.5 Pro's rough week, Groq chips' impact on AI processing speed, and a discussion on version management in AI development by @AndrewYNg.
- KevinAFischer celebrates his feature in Tech Crunch as an early user of the Shader app by @shaderapp and @daryasesitskaya.
Innovation and Technical Insights
- Andrew N Carr discusses the potential of fitting 120B parameter models on consumer GPUs as per the 1.58 Bit paper, emphasizing breakthroughs in VRAM efficiency.
- Erhartford highlights a real-time EMO lip sync model, suggesting its integration for innovative applications.
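The VRAM claim above is easy to sanity-check with back-of-envelope arithmetic: at ~1.58 bits per weight (the information content of a ternary value, log2(3)), a 120B-parameter model's weights shrink by roughly 10x versus fp16. A rough sketch, ignoring activations, KV cache, and packing overhead:

```python
import math

def model_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB; ignores activations and KV cache."""
    return n_params * bits_per_weight / 8 / 2**30

fp16_gib = model_weight_gib(120e9, 16)               # ~223.5 GiB
ternary_gib = model_weight_gib(120e9, math.log2(3))  # ~22.1 GiB
```

Even so, ~22 GiB is borderline on a 24 GB consumer card, so actual feasibility depends on how the ternary weights are packed and on runtime overhead.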
Memes/Humor
- C_valenzuelab draws a humorous analogy, stating that "airplanes didn't disrupt the bicycle market".
- KevinAFischer jokes about the economics of using LLMs, poking fun at the current state of AI development.
- KevinAFischer makes a light-hearted comment about ideas being ahead of their time.
Miscellaneous Observations
- Margaret Mitchell questions diversity in news coverage of the Gemini fiasco - 2806 impressions
- Kevin Fischer humorously touches on repeating himself - 732 impressions
- Zach talks about the need for fair tax rates for the wealthy - 492 impressions
AI Development and Infrastructure
- abacaj mentions the need for backing up weights following an HF outage - 1558 impressions
- Together Compute announces the launch of OLMo-7B-Instruct API from @allen_ai - 334 impressions
- A discussion on the ternary BitNet paperās potential to revolutionize model scalability - 42 impressions
AI Twitter Narrative
The technical and engineer-oriented Twitter ecosystem has been buzzing with significant discussions spanning AI, blockchain, leadership transitions in tech, and some light-hearted humor.
Regarding AI and Machine Learning, François Chollet's reflection on LLMs as mirrors to our inputs, alongside Daniele Grattarola's deep dive into diffusion distillation, underscore critical thinking about the essence and future of AI technologies. Reinforcing the importance of diversified safeguarding of machine learning models, Stas Bekman's proposition for a secondary hub for model weights has caught the community's attention, highlighting its resilience in facing practical challenges.
In the leadership and innovation arena, the leadership transition at $SNOW garnered significant engagement, reflecting the continuous evolution and admiration for leadership within tech organizations.
Humor and memes remain a vital part of the discourse, with tweets like Cristóbal Valenzuela's observation about the non-competition between airplanes and bicycles bringing a light-hearted perspective to innovation and disruption.
On various miscellaneous observations, Margaret Mitchell's call for more diverse perspectives in tech reporting highlights the importance of inclusivity and varied viewpoints in shaping our understanding of tech events.
Lastly, discussions around AI development and infrastructure saw practical considerations taking the forefront, as noted by abacaj's preparation for possible future outages by backing up model weights. This operational resilience mirrors the broader strategic resilience seen across the technical and engineering community.
PART 0: Summary of Summaries of Summaries
ChatGPT Model Evaluations and Data Integrity on TheBloke Discord
- Detailed ChatGPT Model Comparisons: Members critically evaluated ChatGPT models, including GPT-4, Mixtral, and Miqu, focusing on API reliability and comparative performance. Specific concerns were raised about training data contamination from other AI outputs, potentially degrading model quality and reliability.
Technological Innovations and AI Deployment on Mistral Discord
- NVIDIA RAG Technical Limitations: NVIDIA's demo, showcasing retrieval-augmented generation (RAG), was critiqued for its 1024 token context limit and response coherence issues. The critique extended to NVIDIA's implementation choices, including the use of LangChain for RAG's reference architecture, hinting at broader discussions on optimizing AI model architectures for better performance.
Qualcomm's Open Source AI Models on LM Studio Discord
- Qualcomm's Contribution to AI Development: Qualcomm released 80 open source AI models on Hugging Face, targeting diverse applications in vision, audio, and speech technologies. Notable models include "QVision" for image processing, "QSpeech" for audio recognition, and "QAudio" for enhanced sound analysis. These models represent Qualcomm's push towards enriching the AI development ecosystem, offering tools for researchers and developers to innovate in machine learning applications across various domains. The release was aimed at fostering advancements in AI modeling and development, specifically enhancing capabilities in vision and audio processing, as well as speech recognition tasks.
PART 1: High level Discord summaries
TheBloke Discord Summary
- Spam Alert in General Chat: Users reported a spam incident involving @kquant, with Discord's spam detection system flagging his activity after he contacted over 100 people with identical messages.
- ChatGPT Variants Under Scrutiny: Diverse experiences with ChatGPT models were discussed, including GPT-4's API reliability and comparisons with Mixtral or Miqu models. Concerns were raised over training data contamination from other AI outputs, potentially compromising quality.
- Mixed Results in Model Mergers: Dialogue highlighted the uncertainty in model merging outcomes, emphasizing the role of luck and model compatibility. Merging tactics such as spherical linear interpolation (slerp) or concatenation were suggested in the specialized channels.
- Innovative Roleplay with LLMs: Techniques to enhance character consistency in role-play involve using detailed backstories and traits for LLMs. Specific models like Miqu and Mixtral were favored for these tasks, though longer context length could reduce coherence.
- Pacing AI Training and Fine-tuning: Users exchanged training tips, including using Perplexity AI and efficient methods like QLoRA to curb hardware demand. The importance of validation and deduplication was stressed, alongside managing model generalization and hallucination.
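As a concrete illustration of the slerp merging mentioned above, here is a minimal NumPy sketch that interpolates two flattened weight tensors along the great circle between them, falling back to linear interpolation when they are nearly parallel. Real merge tools (e.g. mergekit) operate per-tensor with more bookkeeping; this shows only the core idea.

```python
import numpy as np

def slerp(w_a: np.ndarray, w_b: np.ndarray, t: float, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight tensors.

    The angle is computed between the normalized tensors; the interpolation
    weights are then applied to the original (unnormalized) tensors.
    """
    a = w_a / (np.linalg.norm(w_a) + eps)
    b = w_b / (np.linalg.norm(w_b) + eps)
    omega = np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))
    if omega < eps:  # nearly parallel: plain linear interpolation is fine
        return (1 - t) * w_a + t * w_b
    so = np.sin(omega)
    return (np.sin((1 - t) * omega) / so) * w_a + (np.sin(t * omega) / so) * w_b

# Merge two toy "weight" vectors halfway:
merged = slerp(np.array([1.0, 0.0]), np.array([0.0, 1.0]), 0.5)
```

At t=0 and t=1 the endpoints are recovered exactly, which is a useful sanity check before merging real checkpoints.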
Links to consider:
- For looking into detailed personalities and character backstories in AI role-play, one might explore the strategy explanations and datasets at Hugging Face.
- Searching for efficient training techniques could lead AI engineers to MAX's announcement about their platform aimed at democratizing AI development via an optimized infrastructure, detailed in their Developer Edition Preview blog post here.
Mistral Discord Summary
- NVIDIA's Demo Faces Criticism for RAG Implementation: The NVIDIA "Chat with RTX" demo showcasing retrieval-augmented generation (RAG) faced criticism for limiting context size to 1024 tokens and issues with coherent responses. Discussions hinted at concerns with NVIDIA's use of LangChain in RAG's reference architecture.
- Mistral AI Discussions Span Licensing to Open Weights and Hardware Requirements: Conversations touched on Mistral AI's use of Meta's LLaMa model, anticipation for future open-weight models following Mistral-7B, and hardware requirements for running larger models, like Mistral 8x7B, which may need at least 100GB of VRAM. Users considered services like Together.AI for deployment assistance.
- Model Quantization and Deployment Discussions Highlight Constraints: Technical discussions included constraining Mistral-7B to specific document responses, the stateless nature of language models, and the limitations of quantized models. Quantization reducing the memory footprint of Mistral-7B and the necessity of large VRAM for full-precision models were underscored.
- Mistral Platform Intricacies and Function Calling Discussed: Users shared experiences and obstacles with Mistral function calls and reported on the necessity of specific message role orders. Some referred to tools like Mechanician for better integration with Mistral AI.
- Educational Tools and the Potential of Specialized Models: One user showcased an app for teaching economics using Mistral and GPT-4 AI models, while discussions touched on the specialized training of models for tasks like JavaScript optimization. A need for improved hiring strategies within the AI industry also surfaced.
The conversations reveal technical discernment among the users, highlighting both enthusiasm for AIās advancements and practical discussions on AI model limitations and ideal deployment scenarios.
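The hardware figures quoted in these threads (e.g. "at least 100GB of VRAM for Mistral 8x7B") follow from simple arithmetic. The sketch below assumes roughly 47B total parameters for Mixtral 8x7B (a commonly cited figure; experts share attention layers, so the total is less than 8 x 7B) and a 20% fudge factor for activations and KV cache; treat both numbers as assumptions.

```python
def vram_estimate_gb(n_params: float, bytes_per_param: float, overhead: float = 1.2) -> float:
    """Back-of-envelope inference VRAM: weights times a fudge factor for
    activations and KV cache. Decimal GB, not GiB."""
    return n_params * bytes_per_param * overhead / 1e9

# ~47B parameters assumed for Mixtral 8x7B:
fp16_gb = vram_estimate_gb(47e9, 2)    # ~112.8 GB at fp16
int4_gb = vram_estimate_gb(47e9, 0.5)  # ~28.2 GB at 4-bit quantization
```

This is why the same model that needs multiple A100s at full precision can run on a single 32-48 GB card once quantized to 4 bits.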
OpenAI Discord Summary
- Loader Showdown: LM Studio vs. Oobabooga and Jan.ai: LM Studio was criticized for requiring manual GUI interaction to kickstart the API, making it a non-viable option for automated website applications, prompting engineers to suggest Oobabooga and Jan.ai as alternatives for more seamless automation.
- AI Moderation and OpenAI Feedback: A message removed in a discussion about Copilot AI due to automod censorship led to suggestions to report to Discord mods and submit feedback directly through OpenAI's Chat model feedback form, with community members discussing the extent of moderation rules.
- Mistral's Power and Regulation Query: The Mistral model, known for its powerful, uncensored outputs, was compared to GPT-4, leading to a conversation about the impact of European AI regulation on such models. A related YouTube video was shared, illustrating how to run Mistral and its implications.
- Advancing Chatbot Performance: Enhancing GPT-3.5-Turbo for chatbot applications sparked a debate on achieving performance on par with GPT-4, with users discussing fine-tuning techniques and suggesting the use of real data and common use cases for improvement.
- AI Certification vs. Real-world Application: For those seeking AI specialization, the community highlighted the primacy of hands-on projects over certifications, recommending learning resources such as courses by Andrew Ng and Andrej Karpathy, available on YouTube.
LM Studio Discord Summary
Model Compatibility Queries Spark GPU Discussions: Engineers engaged in detailed explorations of LLMs, such as Deepseek Coder 6.7B and StarCoder2-15B, and their compatibility with Nvidia RTX 40-series GPUs, discussing optimization strategies like disabling certain features on Windows 11. A focus on finding the best-fitting models for hardware specifications was observed, underlined by the launch news of StarCoder2 and The Stack v2, with mentions of LM Studio's compatibility issues, especially on legacy hardware like the GTX 650.
Hugging Face Outage Disrupts Model Access: An outage at Hugging Face caused network errors for members trying to download models, affecting their ability to search for models within LM Studio.
Qualcomm Unveils 80 Open Source Models: Qualcomm released 80 open source AI models on Hugging Face, targeting vision, audio, and speech applications, potentially enriching the landscape for AI modeling and development.
LLM Functionality Expansions: Users exchanged insights on enhancing functionalities within LM Studio, such as implementing an accurate PDF chatbot with the Llama2 70B Q4 LLM, seeking guidance on adding image recognition features with models like PsiPi/liuhaotian_llava-v1.5-13b-GGUF, and expressing desires for simplified processes in downloading vision adapter models.
Hardware Hubris and Hopes: Discussions thrived around user experiences with hardware, from reminiscing about older GPUs to sharing frustrations over misrepresented specs in an e-commerce setting. One user advised optimizations for Windows 11, while TinyCorp announced a new hardware offering, TinyBox, found here. There was also speculation about the potential for Nvidia Nvlink / SLI in model training compared to inference tasks.
HuggingFace Discord Summary
- Cosmopedia's Grand Release: Cosmopedia was announced, a sizable synthetic dataset with over 25B tokens and 30M files, constructed with Mixtral. It is aimed at serving various AI research needs, with the release information accessible through this LinkedIn post.
- Hugging Face Updates Galore: The `huggingface_hub` library has a new release, 0.21.0, with several improvements, and YOLOv9 made its debut on the platform, now compatible with Transformers.js, per the discussions on platforms like Hugging Face Spaces and huggingface.co/models.
- DSPy Grows Closer to Production: Exploration of DSPy and Gorilla OpenFunctions v2 is underway to transition from Gradio prototypes to production versions. The tools promise enhanced client onboarding for foundation models without prompting, and the discussions and resources can be found in repositories like stanfordnlp/dspy on GitHub.
- BitNet Bares Its Teeth: A new 1-bit Large Language Model, BitNet b1.58, claimed to preserve performance with impressive efficiency metrics, is discussed, with its research available via this arXiv paper.
- Inference Challenges and Solutions: In the field of text inference, an AI professional ran into issues when trying to deploy the text-generation-inference repository on a CPU-only, non-CUDA system, highlighting typical environmental constraints encountered in AI model deployment.
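The BitNet b1.58 result mentioned above rests on constraining weights to {-1, 0, +1}. Here is a toy NumPy sketch of the absmean quantization function described in the paper, applied per-tensor for simplicity (the paper applies it per weight matrix during training):

```python
import numpy as np

def absmean_ternarize(w: np.ndarray, eps: float = 1e-8):
    """Quantize weights to {-1, 0, +1} using absmean scaling (BitNet b1.58 style).

    Returns the ternary tensor and the scale gamma; dequantize as w_q * gamma.
    """
    gamma = np.abs(w).mean() + eps          # per-tensor scale
    w_q = np.clip(np.round(w / gamma), -1, 1)
    return w_q, gamma

w = np.array([0.9, -0.05, 0.4, -1.2])
w_q, gamma = absmean_ternarize(w)           # w_q is [1, 0, 1, -1]
```

The zero state is what distinguishes b1.58 from the original binary BitNet: it gives the model an explicit way to prune a weight entirely.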
LAION Discord Summary
- AI's Ideogram Stirs Interest: Engineers discussed the release of a new AI model from Ideogram, drawing comparisons with Stable Diffusion and shedding light on speculated quality matters pertaining to unseen Imagen samples. A user shared a prompt result that sparked a debate on its prompt adherence and aesthetics.
- Integration of T5 XXL and CLIP in SD3 Discussed: There have been discussions around the potential integration of T5 XXL and CLIP models into Stable Diffusion 3 (SD3), with participants expecting advancements in both the precision and the aesthetics of upcoming generative models.
- Concerns Over AI-Generated Art: A legal discussion unfolded concerning AI-generated art and copyright laws, referencing a verdict from China and an article on copyright safety for generative AI, highlighting uncertainty in the space and varied industry responses to DMCA requests.
- Spiking Neural Networks Back in Vogue?: Some members considered the potential resurgence of spiking neural networks with advanced techniques like time dithering to improve precision, reflecting on historical and current research approaches.
- State-of-the-Art Icon Generation Model Released: A new AI icon generation model has been released on Hugging Face, developed with personal funding of $2,000 and touted to create low-noise icons at 256px, although scale limitations were acknowledged by its creator.
Nous Research AI Discord Summary
- Emoji Storytelling on GPT-5's No-show: Community members used a sequence of emojis to express sentiments about GPT-5's absence, oscillating between salutes, skulls, and tears, while revering GPT iterations up to the mythical GPT-9.
- Dell's Dual-Connection Monitors and Docks Intrigue Engineers: A YouTube review of Dell's new 5K monitor and the Dell Thunderbolt Dock WD22TB4 piqued interest for their capabilities to connect multiple machines, with eBay as the suggested source for purchases.
- 1-bit LLMs Unveiled with BitNet b1.58: The arXiv paper revealed BitNet b1.58 as a 1-bit LLM with performance on par with full-precision models, highlighting it as a cost-effective innovation, alongside a mention of Nicholas Carlini's LLM benchmark.
- Exploring Alternative Low-Cost LLMs and Fine-Tuning Practices: Users discussed alternatives to GPT-4, the effect of small training dataset sizes, and the potential use of Direct Preference Optimization (DPO) to improve model responses.
- Cutting-Edge Research and New Genomic Model Debut: Stanford's release of HyenaDNA, a genomic sequence model, alongside surprising MMLU scores from CausalLM, and resources on interpretability in AI, such as Representation Engineering and tokenization strategies, were the hot topics of discussion.
Latent Space Discord Summary
- Noam Shazeer on Coding Style: @swyxio highlighted Noam Shazeer's first blog post on coding style and shape suffixes, which may interest developers keen on naming conventions.
- AI in Customer Service: Enthusiasm was expressed around data indicating that LLMs can match human performance in customer service, potentially handling two-thirds of customer service queries, suggesting a pivot in how customer interactions are managed.
- Learning with Matryoshka Embeddings: Members discussed the innovative "Matryoshka Representation Learning" paper and its applications in LLM embeddings with adaptive dimensions, with potential benefits for compute and storage efficiency.
- MRL Embeddings Event: An announcement was made for an upcoming event hosted by <@206404469263433728>, where the authors of the MRL embeddings paper will attend, providing an opportunity for deep discussions on representation learning in the `#1107320650961518663` channel.
- Representation Engineering Session: @ivanleomk signaled an educational session on Representation Engineering 101 with <@796917146000424970>, a chance to learn and ask questions about engineering effective data representations in the `#1107320650961518663` channel.
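The Matryoshka idea discussed above is operationally simple: an MRL-trained embedding can be truncated to its leading coordinates and re-normalized, trading a little accuracy for much less storage. A minimal sketch (the 768/64 dimensions are illustrative choices, not from the paper):

```python
import numpy as np

def truncate_embedding(v: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` coordinates and re-normalize to unit length.

    Only meaningful for embeddings trained with a Matryoshka-style loss,
    which packs the most information into the leading dimensions.
    """
    short = v[:dim]
    return short / np.linalg.norm(short)

rng = np.random.default_rng(0)
full = rng.normal(size=768)               # stand-in for a 768-d embedding
small = truncate_embedding(full, 64)      # 12x less storage per vector
```

Cosine similarity on the truncated vectors then works as usual, which is why vector stores can adopt this without any index changes.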
Perplexity AI Discord Summary
- Rabbit R1 Activation Assistance: User @mithrilman encountered a non-clickable email link issue when trying to activate the Rabbit R1 promo. @icelavaman suggested using the email link and reaching out to support.
- Podcast Identity Confirmation: Confusion arose around podcasts using the name "Perplexity AI", leading @icelavaman to clarify with the official podcast link, while @ok.alex speculated that the name might be used without authorization for attention or financial gain.
- Comparing AI Model Capabilities: Users explored the strengths and weaknesses of various AI models like Experimental, GPT-4 Turbo, Claude, and Mistral. There was notably divided opinion regarding Mistral's effectiveness for code queries.
- Brainstorming Perplexity AI Improvements: Suggestions for Perplexity AI included exporting thread responses, a feature currently missing but considered for future updates. Issues also included the absence of file upload options and confusion over product name changes.
- Model Performance Nostalgia and API Errors: Discussions touched upon glitches in text generation and fond memories of pplx-70b being superior to sonar models. @jeffworthington faced challenges with OpenAPI definitions, suggesting the current documentation might be outdated.
Links shared:
- Official Perplexity AI podcasts: "Discover Daily by Perplexity" and "Perplexity AI".
- Getting started with Perplexity's API: pplx-api documentation.
Eleuther Discord Summary
- Foundation Model Development Cheatsheet Unveiled: A new resource titled The Foundation Model Development Cheatsheet has been released to aid open model developers, featuring contributions from EleutherAI, MIT, AI2, and Hugging Face, among others, and focusing on often-overlooked yet crucial aspects such as dataset documentation and licensing. The cheatsheet can be accessed as a PDF paper or an interactive website, with additional information in their blog post and Twitter thread.
- Scaling Laws and Model Training Discussions Heat Up: Discourse ranges from inquiries about cross-attention SSM models, stable video diffusion training, and the nuances of lm-evaluation-harness, to the status of EleutherAI's Pythia model and an abstract on a 1-bit Large Language Model (LLM). Notable references include a blog post on Multiple Choice Normalization in LM Evaluation and the research paper on the Era of 1-bit LLMs.
- From Open-Sourced Models to Maze-Solving Diffusion Models: The research channel showcased discussions on a variety of AI topics, from open-sourced models and pretraining token-to-model-size ratios to diffusion models trained to solve mazes, prompt-engineering transfer studies, and the practical challenges of sub-8-bit quantization. Key resources shared include the Stable LM 2 1.6B Technical Report and a tweet on training diffusion models to solve mazes by François Fleuret.
- NeoX Query for Slurm Compatibility: User @muwnd sought recommendations on running NeoX with Slurm and its compatibility with containers. It was highlighted that NeoX's infrastructure does not make assumptions about the user's setup, and a Slurm script may be needed for multinode execution.
- Interpretability Techniques and Norms Explored: Conversations in the interpretability channel delved into matrix norms and products, RMSNorm layer applications, decoding using tuned lenses, and the proper understanding of matrix norm terminology. For example, the Frobenius norm is the Euclidean norm of the flattened matrix, while the "2-norm" is the spectral norm, i.e. the top singular value.
- Tweaks for LM Eval Harness and Multilingual Upgrades: Enhancements to the LM Eval harness for chat templates were shared, along with news that higher-quality translations for the Multilingual Lambada have been contributed by @946388490579484732 and will be included in the evaluation harness. These datasets are made available on Hugging Face.
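The norm terminology from the interpretability discussion is easy to verify numerically: for diag(3, 4), the Frobenius norm is sqrt(3² + 4²) = 5 (the Euclidean norm of the flattened matrix), while the spectral norm (top singular value) is 4.

```python
import numpy as np

A = np.array([[3.0, 0.0],
              [0.0, 4.0]])

fro = np.linalg.norm(A)          # Frobenius norm: Euclidean norm of flattened A -> 5.0
spec = np.linalg.norm(A, ord=2)  # spectral norm: largest singular value -> 4.0
```

Note that NumPy's default for matrices is the Frobenius norm; `ord=2` must be requested explicitly to get the spectral norm.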
LangChain AI Discord Summary
- Confidence in LangChain.js: @ritanshoo raised a question regarding confidence score checks when utilizing LangChain.js for RAG. While an immediate answer was not provided, users were referred to the LangChain documentation for in-depth guidance.
- Integration Queries for LangChain: Technical discussions highlighted the possibilities of adding memory to LCEL and effective language integration with LangChain in an Azure-hosted environment. Users were advised to consult official documentation or seek community assistance for specific integration issues.
- ToolException Workarounds Explored: @abinandan sought advice on retrying a tool after a `ToolException` occurs with a custom tool. The community pointed to LangChain GitHub discussions and issues for potential solutions.
- LangServe Execution Quirks: @thatdc reported missing intermediate-step details when using LangServe, as opposed to direct invocation from the agent class. They identified a potential glitch in the `RemoteRunnable` requiring a workaround.
- Summoning Python Template Alchemists: @tigermusk sought assistance creating a Python template similar to the one available on the Smith LangChain Chat JSON Hub, sparking discussions on template generation.
- "LangChain in your Pocket" Celebrated: @mehulgupta7991 announced their book "LangChain in your Pocket", recently featured in Google's Best Books on LangChain, highlighting resources for LangChain enthusiasts.
- Beta Testing for AI Voice Chat App: Pablo, an AI voice chat app that integrates multiple LLMs and provides voice support without typing, called for beta testers. Engineers were invited to join the team behind this app, which leverages LangChain technology, with an offer of free AI credits.
- AI Stock Analysis Chatbot Creation Explained: A video tutorial was shared by @tarikkaoutar, demonstrating the construction of an AI stock-analysis chatbot using LangGraph, function calling, and YahooFinance, catering to engineers interested in multi-agent systems.
- Groq's Hardware Reveal Generates Buzz: An introduction to Groq's breakthrough Language Processing Unit (LPU), suitable for LLMs, captivated tech enthusiasts, conveyed through a YouTube showcase shared by @datasciencebasics.
OpenAccess AI Collective (axolotl) Discord Summary
- Jupyter Configuration Chaos: Users reported issues with Jupyter notebooks, highlighting error messages concerning extension links and a "Bad config encountered during initialization", without a conclusive solution in the discussion.
- BitNet b1.58 Breakthroughs: An arXiv paper introduced BitNet b1.58, a 1-bit LLM that matches the performance of full-precision models, heralding significant cost-efficiency with an innovative architecture.
- Sophia Speeds Past Adam: The Sophia optimizer, claimed to be twice as fast as Adam, was shared alongside its implementation code, sparking interest in its efficiency for optimization in AI models.
- DropBP Drops Layers for Efficiency: A study presented Dropping Backward Propagation (DropBP), a method that can potentially reduce the computational cost of neural network training by skipping layers during backward propagation without significantly affecting accuracy.
- Scandinavian Showdown: Mistral vs. ChatGPT 3.5: A user, le_mess, reported that their 7B Mistral model rivaled ChatGPT 3.5 in performance on Danish language tasks, using an iterative synthetic-data approach for progressive training through 30 iterations with initial human curation.
LlamaIndex Discord Summary
- Groq's Integration Powers Up LlamaIndex: The Groq LPU now supports LlamaIndex, including `llama2` and `Mixtral` models, aimed at improving Large Language Model (LLM) generation, with a comprehensive cookbook guide provided for streamlining application workflows.
- LlamaIndex Services Expand and Optimize: LlamaParse reported significant usage, leading to a usage cap increase and updates toward uncapped self-serve usage, while a new strategy using LLMs for alpha parameter adjustment in hybrid search has been shared in this insight. Plus, a RAG architecture combining structured and unstructured data by @ClickHouseDB has been highlighted, which can be read about here.
- Technical Insights and Clarifications Heat Up LlamaIndex Discussions: Indexing the latest LlamaIndex docs is under consideration, with Mendable mentioned as a tool for docs, while @cheesyfishes commented on an anticipated refactor of `CallbackHandler` in Golang. Combining FlagEmbeddingReranker with CohereReranker was identified as a tactic despite the absence of comparison metrics, and @cheesyfishes explained that while LlamaIndex serves data to LLMs, LangChain is a more encompassing library.
- Model Behaviors Questioned Within AI Community: There's a discussion about model decay, with @.sysfor noting degrading outputs from their models and @cheesyfishes reinforcing that models do not decay, but input issues can affect performance. The concern extends to fine-tuned models underperforming compared to baseline models.
OpenRouter (Alex Atallah) Discord Summary
- Claude Encounters a Conversational Hiccup: Claude models from Anthropic were reported to error on chats with more than 8 alternating messages. The problem was acknowledged by @louisgv with a promise of an upcoming fix.
- Turn Taking Tweaks for OpenRouter: @alexatallah suggested a workaround for Claude's prompt errors involving changing the initial assistant message to a system message. Development is ongoing to better handle conversations initiated by the assistant.
- OpenRouter's Rate Limit Relay: When asked about rate limits for article generation, @alexatallah clarified that individually assigned API keys for OpenRouter users would have separate limits, presumably allowing adequate collective throughput.
- Mistral's Suspected Caching Unearthed: Users noticed repeat prompt responses from Mistral models, suggesting caching might be at play. @alexatallah confirmed the possibility of query caching in Mistral's API.
- Prepaid Payment Puzzles for OpenRouter: @fakeleiikun raised a question about the acceptance of prepaid cards through OpenRouter, and @louisgv responded with possible issues tied to Stripe's fraud prevention mechanisms, indicating mixed support.
CUDA MODE Discord Summary
- Benchmarking Bounties: @hdcharles_74684 improved a benchmark script for Triton kernels, which may outperform cuBLAS in specific scenarios such as batch sizes greater than 1, pertinent to applications like sdxl-fast. In light of potential Triton optimizations, focusing on technologies such as torch.compile could address bottlenecks when handling a batch size of 2.
- Triton Turmoil and Triumphs: Users encountered debugging issues with Triton versions 3.0.0 and 2.2.0; a workaround involved setting the `TRITON_INTERPRET` environment variable. Moreover, stability concerns were voiced regarding Triton's unpredictable segfaults compared to CUDA, prompting a request for comparative examples to understand the inconsistencies.
- FP8 Intrinsics Intact: In response to a query based on a tweet, @zippika clarified that FP8 intrinsics are still documented in the CUDA math API docs, noting that FP8 is primarily a data format and not universally applied for compute operations.
- Compiler Conundrums: In the realm of deep learning, skepticism was expressed about the usefulness of polyhedral compilation for optimizing sharding. This ties into the broader discussion about defining cost functions, the complexity of mapping DL programs to hardware, and whether top AI institutions are tackling these optimization challenges.
- Ring Attention Riddles: A comparison was proposed for validating the correctness and performance of Ring Attention implementations, as potential bugs were noted in the backward pass and GPU compatibility issues surfaced. User @iron_bound suggested there may be breakage in the implementation per commit-history analysis, stressing the need for careful code review and debugging.
Interconnects (Nathan Lambert) Discord Summary
- European Independence and Open-Weight Ambitions: Arthur Mensch emphasized the commitment to open-weight models, specifically mentioning 1.5k H100s, and highlighted a reselling deal with Microsoft. Le Chat and Mistral Large are attracting attention on La Plateforme and Azure, showing growth and a quick development approach. Here are the details.
- StarCoder2 Breaks New Ground: The Stack v2, featuring over 900B tokens, is the powerhouse behind StarCoder2, which flaunts a 16k-token context and is trained on more than 4T tokens. It represents a robust addition to the coding-AI community with fully open code, data, and models. Explore StarCoder2.
- Meta's Upcoming Llama 3: A report from Reuters indicates that Meta is gearing up to launch Llama 3 in July, signaling a potential shake-up in the AI language-model landscape. The Information provided additional details on this forthcoming release. Further information available here.
- DeepMind CEO's Insights Captivate Nathan: Nathan Lambert tuned into a podcast featuring Demis Hassabis of Google DeepMind, covering topics such as superhuman AI scaling, combining AlphaZero with LLMs, and the intricacies of AI governance. These insights are accessible on various platforms, including YouTube and Spotify.
- Open AI and Personal Perspectives: The conversation between Nathan and Mike Lambert touched on the nature and importance of open AI and the differing thought models when compared to platforms like Twitter. Additionally, Mike Lambert, associated with Anthropic, expressed a preference to engage in dialogues personally rather than as a company representative.
LLM Perf Enthusiasts AI Discord Summary
- A Buzz for Benchmarking Automation: Engineers
@ampdot
and@dare.ai
are keen on exploring automated benchmark scripts, with the latter tagging another user for a possible update on such a tool. - Springtime Hopes for Llama 3:
@res6969
predicts a spring release for Llama 3, yet hints that the timeline could stretch, while@potrock
is hopeful for last-minute updates, particularly intrigued by the potential integration of Gemini ring attention. - The Testing Time Dilemma:
@jeffreyw128
voices the challenge of time investment needed for comprehensive testing of new LLMs, aiming for an adequate āvibe checkā on each model. - ChatGPT Search Speculation Surfaces: Rumors of an impending OpenAI update to ChatGPTās web search features were mentioned by
@jeffreyw128
, with@res6969
seeking more reliable OpenAI intel and curious about resources for deploying codeinterpreter in production.
DiscoResearch Discord Summary
-
DiscoLM Template Usage Critical:
@bjoernp
underscored the significance of utilizing the DiscoLM template for proper chat context tokenization, pointing to the chat templating documentation on Hugging Face as a crucial resource. -
Chunking Code Struggles with llamaindex:
@sebastian.bodza
encountered severe issues with the llamaindex chunker for code, which is outputting one-liners despite thechunk_lines
setting, suggesting a bug or a need for tool adjustments. -
Pushing the Boundaries of German AI:
@johannhartmann
is working on a German RAG model using Deutsche Telekomās data, seeking advice on enhancing the German-speaking Mistral 7b model reliability, while@philipmay
delved into generating negative samples for RAG datasets by instructing models to fabricate incorrect answers. -
German Language Models Battleground: A debate emerged over whether Goliath or DiscoLM-120b is more adept at German language tasks, with
@philipmay
and@johannhartmann
weighing in;@philipmay
posted the Goliath model card on Hugging Face for further inspection. -
Benchmarking German Prompts and Models:
@crispstrobe
revealed that EQ-Bench now includes German prompts, with the GPT-4-1106-preview model leading in performance, and provided a GitHub pull request link; they mentioned translation scripts being part of the benchmarks, effectively translated by ChatGPT-4-turbo.
Datasette - LLM (@SimonW) Discord Summary
- JSON Judo Techniques Remain Hazy:
@dbreunig
verbalized the common challenge of dealing with noisy JSON responses, though specifics on the cleanup techniques or functions were not disclosed. - Silencing Claudeās Small Talk:
@justinpinkney
recommended using initial characters like<rewrite>
based on Anthropicās documentation to circumvent Claudeās default lead-in phrases such as āSure hereās aā¦ā. - Brevity Battle with Claude:
@derekpwillis
experimented with several strategies for attaining shorter outputs from Claude, including forcing the AI to begin responses with{
, but admitted that Claude still tends to include prefatory explanations.
Skunkworks AI Discord Summary
An Unexpected Recruitment Approach: User .papahh
directly messaged @1117586410774470818, indicating a job opportunity and showing enthusiasm for their potential involvement.
Alignment Lab AI Discord Summary
- Value Hunting Across Species:
@taodoggy
is inviting collaboration on a project to probe into the biological and evolutionary origins of shared values among species, refine value definitions, and explore their manifestation in various cultures. The project overview is accessible via a Google Docs link.
PART 2: Detailed by-Channel summaries and links
TheBloke ā· #general (1070 messagesš„š„š„):
- Discord Detects Spammer: Users noticed messages flagged for likely spam in the chat, particularly from
@kquant
, who was reported for messaging over 100 people with the same message, triggering Discordās spam detection system. - Exploring ChatGPT Performance: Users like
@itsme9316
and@notreimu
discussed their varying experiences with ChatGPT models. Some noted that GPT-4ās API was unreliable for them compared to alternatives like Mixtral or Miqu models. - Model Merging Conversations: Various users, including
@itsme9316
and@al_lansley
, discussed model merging and how it doesnāt always result in smarter models. There was consensus that merging often depends on luck and the modelsā compatibility. - Concerns Over Contaminated Training Data: Users such as
@itsme9316
expressed concerns about modern LLMs potentially being contaminated with outputs from other models like OpenAIās, which could affect quality and reliability. - Quantization and Model Performance: There was discussion led by
@notreimu
and@aiwaldoh
about the performance differences between high-parameter models with low bit-per-weight (bpw) quantization and smaller models with higher bpw. Users shared varying experiences with different quantized models.
Links mentioned:
- Database Search: Search our database of leaked information. All information is in the public domain and has been compiled into one search engine.
- A look at Appleās new Transformer-powered predictive text model: I found some details about Appleās new predictive text model, coming soon in iOS 17 and macOS Sonoma.
- Microsoft-backed OpenAI valued at $80bn after company completes deal: Company to sell existing shares in ātender offerā led by venture firm Thrive Capital, in similar deal as early last year
- Sad GIF - Sad - Discover & Share GIFs: Click to view the GIF
- writing-clear.png Ā· ibm/labradorite-13b at main: no description found
- And death shall have no dominion: And death shall have no dominion. / Dead men naked they shall be one
- NousResearch/Nous-Hermes-2-Mistral-7B-DPO Ā· Hugging Face: no description found
- Uncensored Models: I am publishing this because many people are asking me how I did it, so I will explain. https://huggingface.co/ehartford/WizardLM-30B-Uncensored https://huggingface.co/ehartford/WizardLM-13B-Uncensoreā¦
- BioMistral/BioMistral-7B Ā· Hugging Face: no description found
- NousResearch/Nous-Hermes-2-SOLAR-10.7B Ā· Hugging Face: no description found
- adamo1139 (Adam): no description found
- p1atdev/dart-v1-sft Ā· Hugging Face: no description found
- google/gemma-7b-it Ā· Buggy GGUF Output: no description found
- Attack of the stobe hobo.: Full movie. Please enjoy. Rip Jim Stobe.
- Fred again..: Tiny Desk Concert: Teresa Xie | April 10, 2023When Fred again.. first proposed a Tiny Desk concert, it wasnāt immediately clear how he was going to make it work ā not because hā¦
- My Fingerprint- Am I Unique ?: no description found
- GitHub - MooreThreads/Moore-AnimateAnyone: Contribute to MooreThreads/Moore-AnimateAnyone development by creating an account on GitHub.
- adamo1139/rawrr_v2 Ā· Datasets at Hugging Face: no description found
TheBloke ā· #characters-roleplay-stories (511 messagesš„š„š„):
-
LLM Roleplay Discussion: Users discussed the effectiveness of using Large Language Models (LLMs) for role-playing characters, including techniques for crafting character identities, such as telling the LLM āyou are a journalistā to improve performance.
@nathaniel__
suggested successful strategies involve assigning roles and detailed personalities and@maldevide
shared a prompt structuring approach using#define
syntax. -
Character Consistency: Several users, including
@shanman6991
and@superking__
, explored whether character consistency can be improved by giving LLMs detailed backstories and personality traits. There was particular interest in techniques to allow characters to lie or scheme convincingly within role-play scenarios. -
Prompt Engineering Tactics:
@maldevide
discussed the use of proper names and declarative statements in prompts to guide LLMs into desired patterns of conversation, while@superking__
provided examples of instruct vs. pure chat mode setups for better model guidance. -
Model Selection for Roleplay: Users like
@superking__
indicated a preference for specific models such as miqu and mixtral for role-play purposes, often eschewing the use of system prompts. There was also mention of the potential for models to become less coherent with longer context lengths, and strategies to offset this were discussed. -
Naming Conventions in LLMs:
@gryphepadar
and@maldevide
observed that certain names, like āLyraā and āLilyā, seem to be particularly common in responses when LLMs are prompted to generate character names, leading to some speculation about the training dataās influence on these naming trends.
Links mentioned:
- Let Me In Eric Andre GIF - Let Me In Eric Andre Wanna Come In - Discover & Share GIFs: Click to view the GIF
- Sad Smoke GIF - Sad Smoke Pinkguy - Discover & Share GIFs: Click to view the GIF
- Why Have You Forsaken Me? GIF - Forsaken Why Have You Forsaken Me Sad - Discover & Share GIFs: Click to view the GIF
- maldv/conversation-cixot Ā· Datasets at Hugging Face: no description found
- Hawk Eye Dont Give Me Hope GIF - Hawk Eye Dont Give Me Hope Clint Barton - Discover & Share GIFs: Click to view the GIF
- GitHub - UltiRTS/PrometheSys.vue: Contribute to UltiRTS/PrometheSys.vue development by creating an account on GitHub.
- GitHub - predibase/lorax: Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs: Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs - predibase/lorax
TheBloke ā· #training-and-fine-tuning (86 messagesš„š„):
- Perplexity AI as a New Tool: User
@icecream102
suggested trying out Perplexity AI as a resource. - Budget Training with QLoRA:
@dirtytigerx
advised that training large language models like GPT can be expensive and suggested using techniques like QLoRA to limit hardware requirements, though noting it would still take many hours of compute. - Training and Inference Cost Estimates: In a discussion on estimating GPU hours for training and inference,
@dirtytigerx
recommended conducting a tiny test run and looking at published papers for benchmarks. - Model Training Dynamics Discussed:
@cogbuji
questioned training a model with a static low validation loss, prompting@dirtytigerx
to suggest altering the validation split and taking deduplication steps to address discrepancies. - Model Generalization and Hallucination Concerns:
@dirtytigerx
and@cogbuji
discussed training model generalization and the inevitable problem of hallucination during inference, suggesting the use of retrieval mechanisms and further evaluation strategies.
Links mentioned:
cogbuji/Mr-Grammatology-clinical-problems-Mistral-7B-0.5 Ā· Hugging Face: no description found
TheBloke ā· #model-merging (6 messages):
- Tensor Dimension Misalignment Issue:
@falconsfly
pointed out that an issue arose due to a single bit being misplaced or misaligned, resulting in incorrect tensor dimensions. - Appreciation Expressed for Information:
@222gate
thanked@falconsfly
for sharing the information about the tensor dimension problem. - Query about Slerp or Linear Techniques:
@222gate
asked if the discussed merging techniques involved spherical linear interpolation (slerp) or just linear ties. - Reflection on Diffusion Test Techniques: In response,
@alphaatlas1
mentioned not being certain about@222gate
ās specific query but shared that their diffusion test used dare ties and speculated that a HuggingFace test may have involved dare task arithmetic. - Recommendation for Concatenation in Merging:
@alphaatlas1
suggested trying concatenation for anyone doing the peft merging, stating it works well and noting thereās no full-weight merging analogue for it.
TheBloke ā· #coding (8 messagesš„):
-
Eager for Collaboration:
@wolfsauge
expresses enthusiasm to learn from@falconsfly
, anticipating a discussion on fresh ideas for enhancement after dinner. -
No GPU, No Speed?:
@dirtytigerx
states that without a GPU, speeding up processes is challenging, offering no alternative solutions for performance improvement. -
APIs for Acceleration:
@tom_lrd
suggests using APIs as an alternative to speed up processes, listing multiple services like huggingface, together.ai, and mistral.ai. -
Looking Beyond Colab for Hosted Notebooks: Despite
@dirtytigerx
mentioning the lack of hosted notebooks on platforms provided by cloud providers,@falconsfly
points out that Groq.com offers fast inference. -
Modular MAX Enters the Game:
@dirtytigerx
shares news about the general availability of the modular MAX platform, announcing the developer edition preview and its vision to democratize AI through a unified, optimized infrastructure.
Links mentioned:
Modular: Announcing MAX Developer Edition Preview: We are building a next-generation AI developer platform for the world. Check out our latest post: Announcing MAX Developer Edition Preview
Mistral ā· #general (992 messagesš„š„š„):
-
NVIDIAās Chat with RTX Demo Criticized: Users like
@netrve
expressed disappointment with NVIDIAās āChat with RTXā demo, which was meant to showcase retrieval-augmented generation (RAG) capabilities. The demo, which limited context size to 1024 tokens, faced issues with retrieving correct information and delivering coherent answers. NVIDIAās use of LangChain in the reference architecture for RAG was also questioned. -
OpenAI and Meta Licensing Discussions: There was a heated discussion spearheaded by
@i_am_dom
and@netrve
regarding Mistral AIās usage of Metaās LLaMa model, potential licensing issues, and implications of commercial use. The consensus suggested that an undisclosed agreement between Mistral and Meta was possible, given the seeming compliance with Metaās licensing terms. -
Conversations about Mistral AIās Open Weight Models:
@mrdragonfox
,@tarruda
, and others discussed Mistral AIās commitment to open weight models and speculated about future releases following the Mistral-7B model. The community expressed trust and expectations towards Mistral for providing more open weight models. -
RAG Implementation Challenges Highlighted: Several users, including
@mrdragonfox
and@shanman6991
, discussed the complexities of implementing RAG systems effectively. They mentioned the significant impact of the embedding model on RAG performance and the difficulty in achieving perfection with RAG, often taking months of refinement. -
Mistral AI and Microsoft Deal Scrutinized: An investment by Microsoft in Mistral AI raised discussions about the size of the investment and its implications for competition in the AI space.
@ethux
shared information hinting that the investment was minimal, while@i_am_dom
raised concerns about Microsoftās cautious approach due to potential complexities with open-source models like Miqu.
Links mentioned:
- What Is Retrieval-Augmented Generation aka RAG?: Retrieval-augmented generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources.
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits: Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single paramā¦
- LMSys Chatbot Arena Leaderboard - a Hugging Face Space by lmsys: no description found
- Klopp Retro GIF - Klopp Retro Dancing - Discover & Share GIFs: Click to view the GIF
- Basic RAG | Mistral AI Large Language Models: Retrieval-augmented generation (RAG) is an AI framework that synergizes the capabilities of LLMs and information retrieval systems. Itās useful to answer questions or generate content leveraging ā¦
- mlabonne/NeuralHermes-2.5-Mistral-7B Ā· Hugging Face: no description found
- Legal terms and conditions: Terms and conditions for using Mistral products and services.
- Microsoft made a $16M investment in Mistral AI | TechCrunch: Microsoft is investing ā¬15 million in Mistral AI, a Paris-based AI startup working on foundational models.
- Client code | Mistral AI Large Language Models): We provide client codes in both Python and Javascript.
- NVIDIA Chat With RTX: Personnalisez et dĆ©ployez votreĀ chatbotĀ dāIA.
- Microsoft made a $16M investment in Mistral AI | TechCrunch: Microsoft is investing ā¬15 million in Mistral AI, a Paris-based AI startup working on foundational models.
- Mistral Large vs GPT4 - Practical Benchmarking!: ā”ļø One-click Fine-tuning & Inference Templates: https://github.com/TrelisResearch/one-click-llms/ā”ļø Trelis Function-calling Models (incl. OpenChat 3.5): httpā¦
- Short Courses: Take your generative AI skills to the next level with short courses fromĀ DeepLearning.AI. Enroll today to learn directly from industry leaders, and practice generative AI concepts via hands-on exercisā¦
Mistral ā· #models (12 messagesš„):
- More Meaningful Error Messages on Mistral:
@lerela
addressed an issue regarding system limitations, stating that a certain operation is not permitted with the large model, but users will now receive a more meaningful error message. - Discussion on System/Assistant/User Sequence:
@skisquaw
remarked on having to change the sequence from system/assistant/user to user/assistant/user due to the model treating the first user input as a system one, despite a functionality need where assistant prompts follow system commands. - Quantization Packs Mistral-7B Parameters:
@chrismccormick_
inquired about the parameter count of Mistral-7B, originally tallying only around 3.5B. They later deduced that 4-bit quantization likely halves the tensor elements. - Large Code Segments Questioned for Mistral:
@frigjord
contemplated whether querying long code segments, especially more than 16K tokens, might pose a problem for Mistral models. - Complex SQL Queries with Mistral-7B:
@sanipanwala
asked about generating complex SQL queries with Mistral-7B, and@tom_lrd
responded affirmatively, providing advice on formulating the queries and even giving an example for creating a sophisticated SQL query.
Mistral ā· #deployment (174 messagesš„š„):
-
Mistral Deployment Conundrum:
@arthur8643
inquired about hardware requirements for running Mistral 8x7B locally, contemplating a system upgrade. Users@_._pandora_._
and@mrdragonfox
advised that his current setup wouldnāt suffice, recommending at least 100GB of VRAM for full precision deployment, and suggesting the use of services like together.ai for assistance. -
Debates on Optimal Server Specs:
@latoile0221
sought advice on server specifications for token generation, considering a dual CPU setup and RTX 4090 GPU. The user received mixed responses regarding the importance of CPU versus GPU;@ethux
stressed the GPUās significance for inference tasks while discussions circled around the necessity of substantial VRAM for full precision models. -
Quantization Qualms and GPU Capabilities: Various participants expressed that quantized models underperform, with
@frigjord
and@ethux
noting that quantized versions arenāt worthwhile for coding tasks. The consensus emerged that substantial VRAM (near 100GB) is needed to run non-quantized, full-precision models effectively. -
Self-Hosting, Model Types, and AI Limitations: Dialogue ensued about the practicalities of self-hosting AI models like Mixtral, with mentions of utilizing quant versions and alternatives like GGUF format. Users including
@ethux
and@sublimatorniq
shared experiences, with a focus on the limitations of quantized models and better performance of full models on high-spec hardware. -
On the Topic of Specialized AI Models: The discussion touched on the potential advantages and challenges of training a specialized JS-only AI model.
@frigjord
and@mrdragonfox
debated the effectiveness and handling of such focused models, with general agreement on the extensive work required to clean and prep datasets for any specialized AI training.
Links mentioned:
- Jurassic Park GIF - Jurassic Park World - Discover & Share GIFs: Click to view the GIF
- starling-lm: Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.
- Tags Ā· mixtral: A high-quality Mixture of Experts (MoE) model with open weights by Mistral AI.
Mistral ā· #ref-implem (76 messagesš„š„):
- Typo Alert in Notebook:
@foxalabs_32486
identified a typo in theprompting_capabilities.ipynb
notebook, where an extra āorā was present. The correct text should read āFew-shot learning or in-context learning is when we give a few examples in the promptā¦ā - Fix Confirmation: In response to
@foxalabs_32486
ās notice,@sophiamyang
acknowledged the error and confirmed the fix. - Typos Add Human Touch:
@foxalabs_32486
mused about using occasional typos to make AI-generated content appear more human, sparking a discussion on the ethics of making AI seem human with@mrdragonfox
. - Ethics over Earnings:
@mrdragonfox
declined projects aimed at humanizing AI beyond ethical comfort, underscoring a preference to choose integrity over financial gain. - AI Industry Hiring Challenges:
@foxalabs_32486
discussed the difficulties in hiring within the AI industry due to a shortage of skilled professionals and the rapid expansion of knowledge required.
Mistral ā· #finetuning (15 messagesš„):
- Limiting Model Answers to Specific Documents:
@aaronbarreiro
inquired about constraining a chatbot to only provide information from a specific document, such as one about wines, and not respond about unrelated topics like pizza. - The Challenge of Controlling LLMS:
@mrdragonfox
explained that language models like LLMS will likely hallucinate answers, because they are designed fundamentally as next token predictors, thus a robust system prompt is vital to direct responses. - Language Models as Stateless Entities:
@mrdragonfox
highlighted the stateless nature of language models, meaning they donāt retain memory like a human would, and if pushed beyond their token limitāspecifically mentioned the 32k contextāthey will forget earlier information. - Strategies to Maintain Context Beyond Limits:
@mrdragonfox
discussed strategies to circumvent the context limitation, such as using function calling or retrieval-augmented generation (RAG), but acknowledged these methods are more complex and donāt work directly out-of-the-box. - Fine-Tuning Time Depends on Dataset Size: When
@atip
asked about the time required to fine-tune a 7B parameter model on H100 hardware,@mrdragonfox
stated it varies based on dataset size, implying the duration canāt be estimated without that information.
Mistral ā· #showcase (7 messages):
-
Teaching Economics with AI:
@patagonia50
shared about creating an app for an intermediate microeconomics course that provides instant personalized feedback by making API calls to gpt-4-vision-preview and Mistral models. The app, which adapts to different questions and rubrics via a JSON file, has been deployed on Heroku and is still being refined, with future plans to expand its capabilities with Mistral AI models. -
Interest Expressed in Educational App:
@akshay_1
showed interest in@patagonia50
ās educational app, asking if there was a GitHub repository available for it. -
Open Source Plans: In response to
@akshay_1
,@patagonia50
indicated that there isnāt a GitHub repository yet but plans to create one for the educational app. -
Request for a Closer Look:
@akshay_1
expressed a desire for a sneak peek at@patagonia50
ās educational app, demonstrating enthusiasm for the project.
Links mentioned:
- cogbuji/Mr-Grammatology-clinical-problems-Mistral-7B-0.5 Ā· Hugging Face: no description found
- Use Mistral AI Large Model Like This: Beginner Friendly: We learn the features of High Performing Mistral Large and do live coding on Chat Completions with Streaming and JSON Mode. The landscape of artificial intelā¦
Mistral ā· #random (2 messages):
- Seeking the Google Million Context AI: User
@j673912
inquired about how to access the elusive Google 1M Context AI. - Insider Connection Required:
@dawn.dusk
recommended having direct contact with someone from Deepmind to gain access.
Mistral ā· #la-plateforme (41 messagesš„):
-
Mistral Function Calls Require Adjustments:
@michaelhunger
discussed challenges with the Mistral function calling mechanism, noting the need for patches and system messages. Specifically, Mistralās behavior contrasts with expectations, often preferring additional tool calls over answering the userās query directly. -
Clarifying
tool_choice
Behavior:@liebke
expressed confusion over the behavior oftool_choice="auto"
in the context of Mistralās function calling, as the setting does not seem to trigger tool calls as anticipated.@sophiamyang
suggested that āautoā should work as intended, prompting a request for Liebkeās implementation details for further troubleshooting. -
Inconsistencies in Mistral Function Calling:
@alexclubs
provided feedback on integrating Mistral Function Calling into Profound Logic, noticing differences from OpenAIās tool behavior and a lack of consistency in when functions are triggered. -
Reproducibility of Outputs on Mistralās Platform Uncertain:
@alexli3146
inquired about seedable outputs for reproducibility, while@foxalabs_32486
and@sublimatorniq
discussed potential issues and existing settings in the API that may affect it. -
Mistral Message Roles Must Follow Specific Order: After discussing error messages encountered with āmistral-large-latest,ā
@not__cool
discovered that wrapping a user message with two system messages is not supported, as confirmed by@lerela
. However,@skisquaw
successfully used the user/assistant format with the system role message in the first user role statement.
Links mentioned:
- Technology: Frontier AI in your hands
- AI Assistants are the Future | Profound Logic: With Profound AI, you can enhance your legacy applications with natural language AI assistants in just 3 steps.
- AI Assistants are the Future | Profound Logic.): With Profound AI, you can enhance your legacy applications with natural language AI assistants in just 3 steps.
- GitHub - liebke/mechanician: Daring Mechanician is a Python library for building tools that use AI by building tools that AIs use.: Daring Mechanician is a Python library for building tools that use AI by building tools that AIs use. - liebke/mechanician
- mechanician/packages/mechanician_mistral/src/mechanician_mistral/mistral_ai_connector.py at main Ā· liebke/mechanician: Daring Mechanician is a Python library for building tools that use AI by building tools that AIs use. - liebke/mechanician
- mechanician/examples/notepad/src/notepad/main.py at main Ā· liebke/mechanician: Daring Mechanician is a Python library for building tools that use AI by building tools that AIs use. - liebke/mechanician
Mistral ā· #office-hour (1 messages):
- Mark Your Calendars for Evaluation Talk:
@sophiamyang
invites everyone to the next office hour on Mar. 5 at 5pm CET with a focus on evaluation and benchmarking. They express interest in learning about different evaluation strategies and benchmarks used by participants.
Mistral ā· #le-chat (423 messagesš„š„š„):
-
Le Chat Model Limit Discussions: User
@alexeyzaytsev
inquired about the limits for Le Chat on a free account. Although currently undefined,@ethux
and@_._pandora_._
speculated that future restrictions might mimic OpenAIās model, with advanced features potentially becoming paid services. -
Mistral on Groq Hardware:
@foxalabs_32486
asked about plans to run Large on Groq hardware, while@ethux
noted Groqās memory limitations.@foxalabs_32486
provided a product brief from Groq, highlighting potential misconceptions about their hardwareās capabilities. -
Mistralās Market Position and Microsoft Influence: In an extensive discussion, users
@foxalabs_32486
and@mrdragonfox
shared their perceptions of Mistralās market positioning and the influence of Microsoftās investment. They touched on topics like strategic hedging, the potential impact on OpenAI, and the speed of Mistralās achievements. -
Feedback for Le Chat Improvement: Several users, including
@sophiamyang
, engaged in discussing ways to improve Le Chat. Suggestions included a āthumb downā button for inaccurate responses (@jmlb3290
), ease of switching between models during conversations (@sublimatorniq
), features to manage token counts and conversation context (@_._pandora_._
), preserving messages on error (@tom_lrd
), and support for image inputs (@foxalabs_32486
). -
Debating Efficiency of Low-Bitwidth Transformers: Users, especially
@foxalabs_32486
and@mrdragonfox
, debated the implications of a low-bitwidth transformer research paper, discussing potential boosts in efficiency and the viability of quickly implementing these findings. They mentioned the work involved in adapting existing models and the speculative nature of immediate hardware advancements.
Links mentioned:
- Technology.): Frontier AI in your hands
- Why 2024 Will Be Not Like 2024: In the ever-evolving landscape of technology and education, a revolutionary force is poised to reshape the way we learn, think, andā¦
- Unsloth update: Mistral support + more: Weāre excited to release QLoRA support for Mistral 7B, CodeLlama 34B, and all other models based on the Llama architecture! We added sliding window attention, preliminary Windows and DPO support, and ā¦
- GitHub - unslothai/unsloth: 5X faster 60% less memory QLoRA finetuning: 5X faster 60% less memory QLoRA finetuning. Contribute to unslothai/unsloth development by creating an account on GitHub.
Mistral ā· #failed-prompts (6 messages):
-
Instructions for Reporting Failed Prompts:
@sophiamyang
provided a template requesting details for reporting failed prompts, specifying information likemodel
,prompt
,model output
, andexpected output
. -
Witty Math Mistake Report:
@blueaquilae
humorously flagged an issue regarding mathematics with the Mistral Large model with their comment, āmath, halfway there (pun intended) on large chatā. -
Tongue-in-Cheek Query Confirmation: In a playful exchange,
@notan_ai
queries whether a specific example counts as a failed prompt, to which@blueaquilae
responds, āSynthetic data all the way?ā -
General Failures on le chat:
@blacksummer99
reports that all versions of Mistral, including Mistral next, fail on a prompt given on le chat, without providing specifics. -
Incomplete Issue Indication:
@aiwaldoh
mentions āFondĆ©e en 2016?!ā possibly pointing out an issue or confusion with the Mistral modelās output, but no further details are provided.
Mistral ▷ #prompts-gallery (5 messages):

- **Invitation to Share Prompt Mastery**: @sophiamyang welcomed everyone to share their most effective prompts, emphasizing prompt crafting as an art form and looking forward to seeing users' creations.
- **Confusion About Channel Purpose**: After @akshay_1 simply mentioned "DSPy", @notan_ai responded with curiosity about "SudoLang" but expressed confusion regarding the purpose of the channel.
- **Possible Model Mention with Ambiguity**: The model name "Mistral Next le Chat" was mentioned twice by @blacksummer99; however, no further context or details were provided.
OpenAI ▷ #ai-discussions (58 messages🔥🔥):

- **Loader Choices for AI Models**: @drinkoblog.weebly.com pointed out that LM Studio requires manual GUI interaction to start the API, which is impractical for websites, and recommended alternative loaders such as oobabooga or Jan.ai for automation on boot.
- **Automod Censorship on AI Discussions**: @chonkyman777 reported their message was removed for showcasing problematic behavior by Copilot AI, and @eskcanta suggested reaching out to Discord mods via Modmail and reporting AI issues directly to OpenAI through their feedback form. Users debated the nuances of moderation and the scope of the rules in place.
- **Concerns Over Mistral and Uncensored Content**: @dezuzel shared a YouTube video discussing Mistral, an AI model considered powerful and uncensored. @tariqali raised questions about the implications of European AI regulation on Mistral, despite its promoted lack of censorship. @chief_executive compared Mistral Large to GPT-4 and found the latter superior for coding tasks.
- **Fine-Tuning GPT-3.5 for Chatbot Use Case**: @david_zoe sought advice on fine-tuning GPT-3.5-Turbo to perform better than the baseline and maintain conversation flow, but faced challenges matching the performance of GPT-4. @elektronisade recommended examining common use cases and consulting ChatGPT with actual data for further guidance on fine-tuning.
- **Exploring Certifications for AI Specialization**: @navs02, a young developer, inquired about certifications for specializing in AI. @dezuzel and @.dooz advised focusing on real-world projects over certifications and mentioned learning resources including courses by Andrew Ng and Andrej Karpathy on YouTube.

Links mentioned:
- Chat model feedback: no description found
- This new AI is powerful and uncensored… Let's run it: Learn how to run Mistral's 8x7B model and its uncensored varieties using open-source tools. Let's find out if Mixtral is a good alternative to GPT-4, and lea…
OpenAI ▷ #gpt-4-discussions (21 messages🔥):

- **Confusion Over API and File Uploads**: @ray_themad_nomad expressed frustration with the chatbot's inconsistent responses after uploading files and creating custom APIs, noting that methods that worked months ago seem to fail now.
- **Clarifying Document Size Limitations**: @darthgustav. pointed out that the chatbot can only read documents within context size and will summarize larger files, which spurred a debate with @fawesum, who suggested that knowledge files can be accessed efficiently even if they are huge.
- **Seed Parameters Causing Inconsistent Outputs**: @alexli3146 asked if anyone had success getting reproducible output using the seed parameter, sharing that they haven't.
- **Security Measures with Web Browsing and Code Interpreter**: @darthgustav. explained that using Python to search knowledge files with the Code Interpreter can disable web browsing in the instance, which is a security decision.
- **Proper Channel for Sharing The Memory Game**: @takk8is shared a link to "The Memory" but was redirected by @solbus to share it in the dedicated channel to avoid it getting lost in the chat.
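For context on the seed discussion above: the Chat Completions API accepts an optional integer `seed` parameter that requests best-effort (not guaranteed) determinism, which matches the inconsistency @alexli3146 observed. A minimal sketch of building such a request payload (the model name is illustrative; actually sending it requires an API client and key):

```python
# Hedged sketch: a Chat Completions request payload asking for
# best-effort reproducibility via the optional `seed` parameter.
def build_request(prompt: str, seed: int) -> dict:
    return {
        "model": "gpt-3.5-turbo",  # illustrative model name
        "messages": [{"role": "user", "content": prompt}],
        "seed": seed,        # same seed + identical parameters -> best-effort determinism
        "temperature": 0,    # low temperature further reduces run-to-run variance
    }

payload = build_request("Summarize the plot of Hamlet in one sentence.", seed=42)
print(payload["seed"])  # 42
```

Even with a fixed seed, OpenAI documents determinism as best-effort only; responses include a `system_fingerprint` so backend changes that break reproducibility can be detected.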
OpenAI ▷ #prompt-engineering (391 messages🔥🔥):

- **Prompt Engineering with MetaPrompting**: @madame_architect shared their work on annotating "MetaPrompting" research, bringing their compiled list of prompt architecture papers to 42 in total. The article details a method integrating meta-learning with prompts, aimed at improving initializations for soft prompts in NLP models. MetaPrompting Discussion
- **LaTeX and KaTeX in ChatGPT**: Several users, including @yami1010 and @eskcanta, discussed ChatGPT's capabilities in handling LaTeX and KaTeX for creating visual data representations, with a focus on math and flowchart diagrams.
- **Curly Brackets Saga in DALL-E 3**: Users such as @darthgustav. and @beanz_and_rice encountered an issue where DALL-E 3 payloads were not accepting standard curly brackets in JSON strings. They found a workaround by using escape-coded curly brackets, which appeared to bypass the parser error.
- **Enhancing ChatGPT Creativity for Artistic Prompts**: When asked about improving creativity in artistic prompts, @bambooshoots and @darthgustav. suggested a multi-step iterative process and the use of semantically open variables to encourage less deterministic and more imaginative outputs from the AI.
- **Challenges with Custom ChatGPT File Reading**: @codenamecookie and @darthgustav. discussed issues with Custom ChatGPT's inconsistent ability to read `.py` files from its knowledge. They explored potential solutions such as converting files to plain text and avoiding unnecessary zipping for better AI parsing and responsiveness.
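The curly-bracket workaround reported above can be sketched as follows. The helper name is hypothetical, and the exact escape form users settled on is not specified in the discussion, so this assumes standard `\uXXXX` escapes for `{` (U+007B) and `}` (U+007D):

```python
# Hypothetical sketch of the reported workaround: replace literal curly
# brackets in a prompt payload with Unicode escape sequences so the
# downstream parser never sees a raw "{" or "}".
def escape_curly_brackets(text: str) -> str:
    return text.replace("{", "\\u007B").replace("}", "\\u007D")

prompt = 'Render this JSON literally: {"style": "watercolor"}'
escaped = escape_curly_brackets(prompt)
assert "{" not in escaped and "}" not in escaped
print(escaped)
```

Whether the receiving parser decodes the escapes back to brackets is up to the service; users reported this was enough to bypass the error.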
Links mentioned:
Disrupting malicious uses of AI by state-affiliated threat actors: We terminated accounts associated with state-affiliated threat actors. Our findings show our models offer only limited, incremental capabilities for malicious cybersecurity tasks.
OpenAI ▷ #api-discussions (391 messages🔥🔥):

- **Prompt Engineering Secrets**: @yami1010 and @eskcanta shared insights on using Markdown, LaTeX, and KaTeX in prompts with ChatGPT for creating diagrams and flowcharts. They discussed the effectiveness of different diagram-as-code tools, with mentions of mermaid and matplotlib, and the peculiarities of dealing with curly brackets in the DALL-E 3 parser.
- **MetaPrompting Annotated**: @madame_architect added MetaPrompting to their list of 42 annotated prompt architecture papers. The list, which can be found on the AI-Empower GitHub, is maintained to high quality standards and is useful for researching prompt engineering.
- **The Curly Brackets Saga**: A long discussion revolved around the DALL-E 3 payload's formatting issues with curly brackets (`{`, `}`) in JSON strings, with multiple users like @darthgustav. and @yami1010 noting failures during image generation. A solution involving Unicode escape codes was found, bypassing the parser error.
- **Custom ChatGPT File Reading**: In a conversation about Custom ChatGPT, @codenamecookie expressed confusion about the model's inconsistent ability to read Python files from its "knowledge". @darthgustav. recommended not zipping the files and converting them to plain text while maintaining Python interpretation, which might help the AI process the files better.
- **Boosting AI Creativity**: For enhancing AI-created artistic prompts, users like @bambooshoots and @darthgustav. suggested using a multi-step process to develop the scene and elicit more creative responses from GPT-3.5 and GPT-4. The inclusion of semantically open variables and iterative prompting helps provoke less deterministic and more unique outputs.
Links mentioned:
Disrupting malicious uses of AI by state-affiliated threat actors: We terminated accounts associated with state-affiliated threat actors. Our findings show our models offer only limited, incremental capabilities for malicious cybersecurity tasks.
LM Studio ▷ #💬-general (484 messages🔥🔥🔥):

- **Exploring Model Options**: Users are discussing various LLMs and their compatibility with specific GPUs, with a focus on coding assistance models such as Deepseek Coder 6.7B and StarCoder2-15B. For example, @solusan. is looking for the best model to fit an Nvidia RTX 40-series card with 12 GB, currently considering Dolphin 2.6 Mistral 7B.
- **LM Studio GPU Compatibility Issues**: Several users like @jans_85817 and @kerberos5703 are facing issues running LM Studio with certain GPUs. Discussions revolve around LM Studio's compatibility mainly with newer GPUs, while older GPUs are presenting problems for which users are seeking solutions or alternatives.
- **Hugging Face Outage Impact**: A common issue reported by multiple members like @barnley and @heyitsyorkie is a network error when downloading models, due to a Hugging Face outage affecting LM Studio's ability to search for models.
- **Image Recognition and Generation Queries**: Questions regarding image-related tasks surfaced, and @heyitsyorkie clarified that while LM Studio cannot perform image generation tasks, it is possible to work with image recognition through Llava models.
- **Hardware Discussions and Anticipations**: Users like @pierrunoyt and @nink1 are discussing future hardware expectations for AI and LLMs, noting that current high-end AI-specific hardware may become more accessible with time.
Links mentioned:
- GroqChat: no description found
- no title found: no description found
- 👾 LM Studio - Discover and run local LLMs: Find, download, and experiment with local LLMs
- Stop Shouting Arnold Schwarzenegger GIF - Stop Shouting Arnold Schwarzenegger Jack Slater - Discover & Share GIFs: Click to view the GIF
- BLOOM: Our 176B parameter language model is here.
- Continue: no description found
- no title found: no description found
- GeForce GTX 650 Ti | Specifications | GeForce: no description found
- MaziyarPanahi/dolphin-2.6-mistral-7b-Mistral-7B-Instruct-v0.2-slerp-GGUF · Hugging Face: no description found
- Specifications | GeForce: no description found
- 02 - Default and Notebook Tabs: A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models. - oobabooga/text-generation-webui
- Add support for StarCoder2 by pacman100 · Pull Request #5795 · ggerganov/llama.cpp: What does this PR do? Adds support for StarCoder 2 models that were released recently.
- bigcode/starcoder2-15b · Hugging Face: no description found
- Reddit - Dive into anything: no description found
- Anima/air_llm at main Ā· lyogavin/Anima: 33B Chinese LLM, DPO QLORA, 100K context, AirLLM 70B inference with single 4GB GPU - lyogavin/Anima
- GitHub - MDK8888/GPTFast: Accelerate your Hugging Face Transformers 6-7x. Native to Hugging Face and PyTorch.: Accelerate your Hugging Face Transformers 6-7x. Native to Hugging Face and PyTorch. - MDK8888/GPTFast
- itsdotscience/Magicoder-S-DS-6.7B-GGUF at main: no description found
LM Studio ▷ #🤖-models-discussion-chat (61 messages🔥🔥):

- **Seeking PDF Chatbot Guidance**: @solenya7755 is looking to implement an accurate PDF chatbot with LM Studio and the llama2 70B Q4 LLM, but experiences inaccuracies with hallucinated commands. @nink1 suggests extensive prompt work and joining the AnythingLLM Discord for further assistance.
- **StarCoder2 and The Stack v2 Launch**: @snoopbill_91704 shares news about the launch of StarCoder2 and The Stack v2 by ServiceNow, Hugging Face, and NVIDIA, noting a partnership with Software Heritage aligned with responsible AI principles.
- **Qualcomm Releases 80 Open-Source Models**: @misangenius brings attention to Qualcomm's release of 80 open-source AI models for vision, audio, and speech applications, available on Hugging Face.
- **Querying Models That Prompt You with Questions**: @ozimandis inquires about local LLMs that ask questions and has mixed results with different models, while @nink1 shares success in getting models like Dolphin Mistral 7B Q5 to ask provocative questions.
- **Best Setup for Business Document Analysis and Writing**: @redcloud9999 seeks advice on the best LLM setup for analyzing and writing business documents on a high-spec machine. @heyitsyorkie advises searching for GGUF quants by "TheBloke" on Hugging Face, and @coachdennis. suggests testing trending models.
Links mentioned:
- qualcomm (Qualcomm): no description found
- bigcode/starcoder2-15b · Hugging Face: no description found
- bigcode/the-stack-v2-train-full-ids · Datasets at Hugging Face: no description found
- Pioneering the Future of Code Preservation and AI with StarCoder2: Software Heritage's mission is to collect, preserve, and make the entire body of software source code easily available, especially emphasizing Free and Open Source Software (FOSS) as a digital c…
LM Studio ▷ #🛠-hardware-discussion (42 messages🔥):
- **Optimization Tips for Windows 11**: `.bambalejo` advised users to disable certain features like Microsoft's core isolation and VM platform on Windows 11 for better performance, and to ensure *VirtualizationBasedSecurityStatus* is set to 0.
- **TinyBox Announcement**: `senecalouck` shared a link with details on the TinyBox from TinyCorp, a new hardware offering found [here](https://tinygrad.org).
- **E-commerce GPU Frustrations and Specs**: `goldensun3ds` recounted a negative experience purchasing a falsely advertised GPU on eBay, opting for Amazon for their next purchase, and listed their robust PC specs including dual RTX 4060 Ti 16GB cards.
- **Old Hardware Nostalgia**: A string of messages from users like `jans_85817`, `nullt3r`, `heyitsyorkie`, and `666siegfried666` reminisced about older GPUs; the conversation included insights like the GTX 650 being unfit for modern models, and personal stories of past rigs and upgrades.
- **Discussion on Nvidia NVLink / SLI**: Users `dub_ex` and `nullt3r` discussed the effectiveness of Nvidia NVLink / SLI, concluding it is beneficial for model training but not necessarily for inference.
LM Studio ▷ #🧪-beta-releases-chat (7 messages):

- **Inquiring About Image Insertion in LM Studio**: @heoheo5839 was unsure how to add an image into LM Studio, as the "Assets" bar wasn't visible. @heyitsyorkie explained that to add an image, one must use a model like `PsiPi/liuhaotian_llava-v1.5-13b-GGUF/`, ensure both the vision adapter (mmproj) and the gguf of the model are downloaded, after which the image can be inserted in the input box for the model to describe.
- **Questions About Llava Model Downloads**: @hypocritipus queried about the possibility of downloading Llava-supported models directly within LM Studio, alluding to easier accessibility and functionality.
- **Clarifying Llava Model Functionality in LM Studio**: @wolfspyre questioned whether downloading Llava models is a current functionality, suggesting that it might already be supported within LM Studio.
- **Confirming Vision Adapter Model Use**: In response to @wolfspyre, @hypocritipus clarified they hadn't tried the functionality themselves and were seeking confirmation on whether it was feasible to download both the vision adapter and the primary model simultaneously within LM Studio.
- **Exploring One-Click Downloads for Vision-Enabled Models**: @hypocritipus shared an excerpt from the release notes indicating that users need to download a Vision Adapter and a primary model separately. They expressed curiosity about whether there is a one-click solution within LM Studio to simplify this process, where users could download both necessary files with a single action.
Links mentioned:
- Vision Models (GGUF) - a lmstudio-ai Collection: no description found
- Tweet from LM Studio (@LMStudioAI): Counting penguins can be challenging 🐧🐧 New in LM Studio 0.2.9: Local & Offline Vision Models! In this demo: the small and impressive Obsidian Vision 3B by @NousResearch.
LM Studio ▷ #autogen (7 messages):

- **Gemini vs. ChatGPT in Translation Tasks**: @hypocritipus shared their experience using Gemini and ChatGPT for translating psychological evaluation reports from Turkish to English, noting that Gemini generally provided better translations.
- **Struggle with Gemini's Overzealous Formatting**: @hypocritipus expressed frustration with Gemini's tendency to add unnecessary bullet points and its habit of hallucinating content beyond the requested translation.
- **ChatGPT to the Rescue, Sort Of**: For the final report, @hypocritipus had to switch to ChatGPT because Gemini was not delivering as expected, though they mentioned that ChatGPT's translation was inferior.
- **Accidental Message in Autogen**: @hypocritipus humorously noted they posted their experience in the Autogen channel by mistake, highlighted by an "LMFAO wrong place for me to post this…" comment.
- **Confusion Cleared Up**: @johnnyslanteyes asked for clarification on what @hypocritipus meant by "translation" of the reports, which led to the explanation that it was a language translation from Turkish to English, not a conversion of medical jargon.
LM Studio ▷ #langchain (3 messages):

- **Dimensionality Details Disclosed**: @npcomp_22591 mentioned having positive outcomes using 768 dimensions for vectors.
- **Vectors 101**: In response to an inquiry from @bigsuh.eth on how to check vector dimensions, @npcomp_22591 briefly explained the process: you can check the dimensionality of a vector by examining its length, providing an example output followed by `.length`.
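The `.length` tip above is JavaScript-flavored; in Python the same check is just `len()`. A minimal sketch (the list below is a stand-in for a real embedding):

```python
# An embedding's dimensionality is simply the length of the returned
# vector. The list here stands in for real model output.
embedding = [0.1] * 768   # e.g. what a 768-dimensional embedding model returns
dimension = len(embedding)
print(dimension)  # 768
```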
LM Studio ▷ #memgpt (1 messages):
jans_85817: i am are waiting that lm studio version for linux
HuggingFace ▷ #announcements (1 messages):

- **Cosmopedia Unleashed**: @lunarflu announced the release of Cosmopedia, touting it as the largest open synthetic dataset of textbooks, blog posts, and stories, created by Mixtral with over 25B tokens and 30M files. Available resources are linked through the LinkedIn post.
- **`huggingface_hub` Library Updates**: The new `huggingface_hub` library version 0.21.0 release was highlighted, featuring dataclasses, `PyTorchModelHubMixin` support, and `audio-to-audio` inference among other updates. Developers can view the full release notes at the Hugging Face space.
- **New Methods and Models on the Horizon**: The posts shared exciting developments, including training a DoRA using a diffusers script, pushing Figma frames to a dataset, and the debut of YOLOv9 on the Hub with compatibility confirmed for Transformers.js. Additional updates covered `sentence-transformers` v2.4.0, the LGM Mini project, and the possibility of running AWQ models on AMD GPUs.
- **Innovations in Product**: Google's open LLM Gemma 7B is now available on HuggingChat, `transformers` released a new task guide for mask generation, and a new `image-feature-extraction` tag was introduced, highlighting models like `google/vit-base-patch16-224-in21k`.
- **Community Collaboration and Contributions**: Community efforts led to the release of datasets such as #data-is-better-together's `10k_prompts_ranked` and `OpenHermesPreferences`. Furthermore, TTS Arena was launched for testing and rating text-to-speech models, and a Fine-Tuning Gemma Models guide was made available on Hugging Face's blog.
Links mentioned:
- @Wauplin on Hugging Face: "🚀 Just released version 0.21.0 of the `huggingface_hub` Python library!…": no description found
- Tweet from Victor M (@victormustar): 🤯 This @figma plugin lets you push your figma frames directly into a @huggingface dataset!
- Tweet from merve (@mervenoyann): YOLOv9 arrived on @huggingface Hub! 🤩 The model checkpoints: https://huggingface.co/merve/yolov9 Try the demo (@kadirnar_ai): https://huggingface.co/spaces/kadirnar/Yolov9 Find demo for YOLOv9 por…
- Tweet from Xenova (@xenovacom): YOLOv9 just released, and now it's compatible with 🤗 Transformers.js! That's right… near real-time object detection running locally in your browser: no server required! 🤯 Try it out yours…
- Tweet from Omar Sanseviero (@osanseviero): Matryoshka Embeddings are here! 🔥 The Sentence Transformers library allows training and running embedding models with embedding sizes that can be shrunk while keeping high quality! Learn about them…
- Tweet from dylan (@dylan_ebert_): LGM Mini: Image to Interactive 3D in 5 seconds https://huggingface.co/spaces/dylanebert/LGM-mini
- Tweet from Julien Chaumond (@julien_c): BREAKING: Quoting Victor M (@victormustar) ✨ Google's new open LLM Gemma 7B is now available on HuggingChat.
- Tweet from merve (@mervenoyann): 🤗 transformers has a new task guide for mask generation (also known as zero-shot image segmentation); learn how to use the powerful segment-anything models in this guide: https://huggingface.co/docs/t…
- Models - Hugging Face: no description found
- DIBT/10k_prompts_ranked · Datasets at Hugging Face: no description found
- @davanstrien on Hugging Face: "The open-source AI community can build impactful datasets collectively!…": no description found
- Tweet from Lewis Tunstall (@_lewtun): Introducing OpenHermesPreferences - the largest dataset of ~1 million AI preferences generated by Mixtral and Nous-Hermes-2-Yi-34B 🔥 https://huggingface.co/datasets/argilla/OpenHermesPreferences …
- Tweet from Vaibhav (VB) Srivastav (@reach_vb): Announcing TTS Arena! 🗣️ sound on. One place to test, rate and find the champion of current open models. A continually updated space with the greatest and the best of the current TTS landscape! …
- Introducing the Red-Teaming Resistance Leaderboard: no description found
- AI Watermarking 101: Tools and Techniques: no description found
- Fine-Tuning Gemma Models in Hugging Face: no description found
- Tweet from Bassem Asseh 🤗 (@asseh): .@huggingface worked together with @FetchRewards to take their document #AI solutions to production on @AWS. And guess what? 🚀 "With Yifeng's guidance, Fetch was able to cut its development t…
HuggingFace ▷ #general (491 messages🔥🔥🔥):

- **GPU Pricing Queries**: @zorian_93363 discussed the cost comparison between certain GPUs and a specific 3090 model, mentioning the possibility of acquiring 100 units for the price of a single 3090 in their location.
- **Increasing Model Performance through Custom Frameworks**: @ahmad3794 suggested that writing custom frameworks could unleash the potential of 4 teraflops on an 8-bit integrated circuit, offering considerable computing power.
- **Electronics DIY Enthusiasm**: @zorian_93363 expressed a desire to play with electronics and build computers but lamented the lack of time due to an economic crisis, while appreciating others' skills and ability to innovate despite challenges.
- **Iran's Resourcefulness Amidst Sanctions**: @ahmad3794 elaborated on building affordable clusters as a workaround for obtaining high-power technology, which is hard to get in Iran due to sanctions.
- **Accessing GPT Models and UI Challenges**: @welltoobado and @caleb_sol discussed the possibility and methods of using quantized versions of models for CPU inference without extensive RAM usage, with mentions of llama.cpp as a beneficial tool.
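As rough context for the CPU-inference point above, back-of-envelope weight-memory arithmetic (approximate figures; real usage adds KV-cache and activation overhead) shows why quantization matters:

```python
# Approximate weight memory for a 7B-parameter model at different precisions.
params = 7_000_000_000

fp16_gb = params * 16 / 8 / 1e9    # 16 bits per weight -> ~14 GB
q4_gb   = params * 4.5 / 8 / 1e9   # ~4.5 bits per weight (4-bit quant + scales) -> ~3.9 GB

print(f"fp16: {fp16_gb:.1f} GB, 4-bit quant: {q4_gb:.1f} GB")
```

With llama.cpp, GGUF model files are also memory-mapped, so weights are paged in on demand rather than loaded into RAM all at once.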
Links mentioned:
- GroqChat: no description found
- Morph Studio: no description found
- Unbelievable! Run 70B LLM Inference on a Single 4GB GPU with This NEW Technique: no description found
- Hugging Face: Here at Hugging Face, we're on a journey to advance and democratize ML for everyone. Along the way, we contribute to the development of technology for the better.
- 2869993 Hail GIF - 2869993 Hail - Discover & Share GIFs: Click to view the GIF
- Tweet from blob (@moanaris): no description found
- kopyl/ui-icons-256 · Hugging Face: no description found
- Hugging Face ā The AI community building the future.: no description found
- Kermit Worried GIF - Kermit Worried Oh No - Discover & Share GIFs: Click to view the GIF
- Boom Explode GIF - Boom Explode Explosions - Discover & Share GIFs: Click to view the GIF
- Matrix Multiplication Background User's Guide - NVIDIA Docs: no description found
- Hugging Face ā The AI community building the future.: no description found
- Gradio: no description found
- Tweet from Jason (@mytechceoo): ChatGPT wrappers when OpenAI is down..
- cahya/gpt2-small-indonesian-522M · Hugging Face: no description found
- dpaste/15nGx (Python): no description found
- NCIS ridiculous hacking scene: one keyboard, two typists HD: no description found
- TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF at main: no description found
- The System Is Down - Strongbad: Wow... Really didn't think this video would be this popular. Apparently people come here when a server to a game is down. Ha! Epic. Anyway, enjoy! Yes it's j…
- Hugging Face Outage Impact: Created with Gemini Advanced.
- The Website is Down #1: Sales Guy vs. Web Dude: The Website is Down: Sales Guy Vs. Web Dude High Quality. The original video in high resolution. This video won a Webby award!
- 'HumanEval' object has no attribute 'dataset' · Issue #131 · bigcode-project/bigcode-evaluation-harness: When I evaluate human eval with llama 7b, I met this problem: my script accelerate launch /cpfs01/shared/Group-m6/dongguanting.dgt/bigcode-evaluation-harness/main.py --model "/path to my llama7b/…
- Issues · huggingface/api-inference-community: Contribute to huggingface/api-inference-community development by creating an account on GitHub.
- Workflow runs · huggingface/text-embeddings-inference: A blazing fast inference solution for text embeddings models - Workflow runs · huggingface/text-embeddings-inference
- GitHub - comfyanonymous/ComfyUI: The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.: The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface. - comfyanonymous/ComfyUI
- Issue with offline mode · Issue #4760 · huggingface/datasets: Describe the bug: I can't retrieve a cached dataset with offline mode enabled. Steps to reproduce the bug: To reproduce my issue, first, you'll need to run a script that will cache the dataset im…
- Issues · huggingface/huggingface_hub: The official Python client for the Huggingface Hub. - Issues · huggingface/huggingface_hub
- Build software better, together: GitHub is where people build software. More than 100 million people use GitHub to discover, fork, and contribute to over 420 million projects.
- Add PatchModelAddDownscale (Kohya Deep Shrink) node. · comfyanonymous/ComfyUI@bd07ad1: By adding a downscale to the unet in the first timesteps this node lets you generate images at higher resolutions with less consistency issues.
- Hugging Face status : no description found
HuggingFace ▷ #today-im-learning (8 messages🔥):

- **Exploring DSPy and OpenFunctions v2**: @n278jm is investigating DSPy, a framework for programming foundation models without prompting, and Gorilla OpenFunctions v2, an advanced open-source function-calling system for LLMs. They aim to use these tools to improve their client onboarding process, moving from Gradio prototypes to production-ready versions.
- **Harness the Power of OpenAI and Hugging Face**: @davidre95 encourages users to utilize the tools from the OpenAI chat and Hugging Face chat room as resources.
- **Project Collaboration on Invoice Processing**: @pampkinparty000 invites users dealing with PDF or picture invoices to DM them for a potential collaboration on a project with similar goals.
- **Invoice Storage Advice for Greater Efficiency**: @pampkinparty000 recommends storing invoices in a vectorized database with metadata for more efficient use of LLMs, suggesting the use of libraries like llama-index.
- **Seeking a Research Community in AI**: @raghadn3 is in search of a community dedicated to writing research papers on Artificial Intelligence.
Links mentioned:
- GitHub - stanfordnlp/dspy: DSPy: The framework for programming—not prompting—foundation models - stanfordnlp/dspy
- Introduction to Gorilla LLM: no description found
HuggingFace ▷ #cool-finds (9 messages🔥):

- **BitNet b1.58: Efficient LLMs**: @jessjess84 highlighted the potential of BitNet b1.58, a new 1-bit Large Language Model that promises efficiency without sacrificing performance, detailed in an arXiv paper. Achieving the same results as full-precision models, it offers cost-effective latency, memory, throughput, and energy consumption.
- **Stable Diffusion Deluxe Debuts**: @skquark invited users to try Stable Diffusion Deluxe, an extensive multimedia AI toolkit supporting various AI art generators, boasting features for creating images, videos, sound effects, and more. The platform, detailed at diffusiondeluxe.com, integrates numerous pipelines and is designed for ease of use and creative experimentation.
- **Looking for Self-Hosting Details**: In response to @skquark's all-in-one multimedia AI app, @wolfspyre inquired about self-hosting options, complimenting the project as "super cool" and expressing interest in diving deeper.
- **Appreciating "The Hug"**: @evergreenking shared a link to thehug.xyz, a site described as "just link art", with @wolfspyre following up to ask if it was @evergreenking's creation.
Links mentioned:
- HUG | A Home for Your Art: Join our global creative community to showcase & sell your art, connect with others, and access creator-friendly grants and education.
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits: Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single param…
- Uncovering the Origins of Values: A Biology and Cognition-Based Approach for AI Alignment: no description found
- Diffusion Deluxe Home - Stable Diffusion Deluxe: no description found
HuggingFace ▷ #i-made-this (14 messages🔥):

- **DIY Local LLM Assistant Unveiled**: @rivridis developed a locally running LLM Assistant with an assistant mode and a real-time editing mode for content editing and creation. The code and details are available on GitHub.
- **Deploy to Google Cloud Vertex AI Simplified**: @alvarobartt wrote a blog post detailing how to deploy models from the Hugging Face Hub to Google Cloud Vertex AI. You can check out the technical post and its step-by-step guide here.
- **Cursor Hero Demo v0.3.0**: @teamy is developing a UI tool titled Cursor Hero, with integrations of ollama and whisper. A demo of the tool can be found in this YouTube video.
- **Gantrithor: A Data Annotation Leap**: @stroggoz announced an open beta for Gantrithor, a rapid, bulk data annotation tool, with a free version limiting datasets to 1000 documents. Learn more and try it out at Gantrithor.
- **StarCoder2: Code & Learn**: @tonic_1 fixed errors in the example code and announced StarCoder2, available for learning and enjoyment, with a call to collaborate on fine-tuning models. Find the project on HuggingFace Spaces.
Links mentioned:
- MetaMath Mistral Pro - a Hugging Face Space by Tonic: no description found
- Deploying š¤ Hub models in Vertex AI: no description found
- StarCoder2 - a Hugging Face Space by Tonic: no description found
- Qbeast's Adventure in AI-Driven Meme Creation - Qbeast: Learn about AI model selection, fine-tuning, and the role of Qbeast in enhancing meme creativity. Perfect for AI enthusiasts and data engineers seeking insights and innovation.
- Gantrithor: no description found
- Cursor Hero demo v0.3.0: https://github.com/TeamDman/Cursor-Hero.githttps://discord.gg/psHtde64FJ#rust #bevy #windows #win32
- this one rly slaps - episode 16 #music #producer: gonna be hard to beat this one
- GitHub - Rivridis/LLM-Assistant: Locally running LLM with internet access: Locally running LLM with internet access. Contribute to Rivridis/LLM-Assistant development by creating an account on GitHub.
- SDXL-Lightning: quick look and comparison: With SDXL-Lightning you can generate extremely high quality images using a single step.
HuggingFace ▷ #diffusion-discussions (5 messages):

- **Gradio Queue Function Clarification**: @akin8941 inquired about the return type of the `queue()` function in the Gradio interface, and @iakhil clarified that it does not have a return type of its own.
- **Too Fast for Comfort**: @HuggingMod cautioned @1122120801903194114 about posting too quickly in the HuggingFace Discord, asking them to slow down a bit with a friendly reminder emoji.
- **Scheduler Name Puzzle**: @luihis expressed difficulty retrieving the string name of a scheduler due to deprecation warnings. Despite attempts using different properties, the correct string, "DPMSolverSinglestepScheduler", remained elusive.
HuggingFace ▷ #computer-vision (4 messages):

- **Parseq Praise**: @whoami02 recommended the use of Parseq for its effective symbol recognition capabilities.
- **Personalized Fine-Tuning Success**: They also mentioned successfully fine-tuning the model on their specific dataset, which contained images similar to the equations they needed to detect.
- **ResNet Still Rocks**: For the task of detection, @whoami02 asserted that ResNet stands strong and is good enough for their needs.
- **Slow Your Roll**: @HuggingMod advised @whoami02 to slow down their message posting to adhere to the community guidelines.
HuggingFace ▷ #NLP (14 messages🔥):
- Inference Troubles in the Hugging Face Repo: @alfred6549 sought assistance running the text-generation-inference repository on a machine without a GPU or CUDA, sharing an error they encountered. Despite attempts to disable GPU usage, the local setup still failed.
- Petals Resonates with Users: @ai_noob simply stated "petals", which received a positive acknowledgment from @nrs9044, indicating a shared sentiment or understanding about the term's context.
- Benchmark Necessities Discussed: @vipitis stressed the importance of testing on larger benchmarks for validity, while @djpanda1 acknowledged the advice but noted that preliminary tests on several prompts appeared successful.
- Financial Document Insight Quest: @hiteshwarsingh1 is exploring ways to extract information from financial documents, considering MapReduce techniques and seeking recommendations for open-source models or approaches suited to summarization rather than specific information retrieval.
- Improving Information Extraction with LLMs: @.sgp is using Mistral 7B with llama.cpp for JSON data extraction and expressed interest in incorporating in-context learning to improve accuracy, requesting resources on the topic.
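The in-context-learning idea raised above can be as simple as prepending worked examples so the model imitates the target JSON schema. A minimal sketch with a hypothetical invoice schema; the actual model call (e.g. Mistral 7B via llama.cpp) is stubbed out:

```python
import json

# Hypothetical few-shot examples; in practice these would match your schema.
EXAMPLES = [
    ("Invoice #123 from Acme for $450, due March 3.",
     {"vendor": "Acme", "amount": 450, "due": "March 3"}),
]

def build_prompt(text: str) -> str:
    """Prepend worked text -> JSON pairs so the model imitates the schema."""
    parts = ["Extract the fields from each text as JSON."]
    for src, out in EXAMPLES:
        parts.append(f"Text: {src}\nJSON: {json.dumps(out)}")
    parts.append(f"Text: {text}\nJSON:")
    return "\n\n".join(parts)

# The completion the model returns after this prompt should be parseable
# with json.loads(); retry or repair on failure.
print(build_prompt("Invoice #777 from Globex for $90, due May 1."))
```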
Links mentioned:
- deepseek-ai/deepseek-coder-6.7b-instruct · Hugging Face: no description found
- Hugging Face: The AI community building the future. Hugging Face has 196 repositories available. Follow their code on GitHub.
LAION ▷ #general (314 messages🔥🔥):
- Ideogram Launch Causes Stir: @pseudoterminalx shared a prompt result from the new AI model by Ideogram, triggering discussions on its prompt adherence and aesthetics. There were comparisons to Stable Diffusion and speculation about the potentially poor quality of unseen Imagen samples.
- T5 XXL, CLIP L, and CLIP G in SD3?: @thejonasbrothers and @devilismyfriend discussed the integration of T5 XXL and CLIP models in SD3, hinting at the potential for both accuracy and appealing aesthetics in future models.
- Cascade's Fidelity Questioned: @pseudoterminalx and others critically evaluated Cascade's ability to generate images based on prompts, noting frequent issues with prompt adherence and specificity.
- AI-Generated Art and Copyright Battles: Users @progamergov, @itali4no, and others engaged in conversations about the looming legal challenges around AI-generated art, referencing recent cases and Hugging Face's ambivalent approach to DMCA requests.
- Stability AI's Many Silent Projects: @.undeleted expressed confusion over the multiplicity of projects with similar goals at Stability AI, each announced similarly but with unclear differences.
Links mentioned:
- Release v0.9.1 - DoRA the explorah · bghira/SimpleTuner: This release has some breaking changes for users who: Use `RESOLUTION_TYPE=area` (`resolution_type=area` for multidatabackend config), use `crop=false`, or use `crop=true` and `crop_aspect=preserve` as the prec…
- panopstor/nvflickritw-cogvlm-captions · Datasets at Hugging Face: no description found
- Willys Chocolate Experience Glasgow. Get your Tickets!: INDULGE IN A CHOCOLATE FANTASY LIKE NEVER BEFORE - CAPTURE THE ENCHANTMENT! Tickets to Willys Chocolate Experience are on sale now! at the willys chocolate experience in Glasgow! Tickets to Willys Ch…
- China issues world's 1st legally binding verdict on copyright infringement of AI-generated images - Global Times: no description found
- Copyright Safety for Generative AI | Published in Houston Law Review: By Matthew Sag. 61 Hous. L. Rev. 295 (2023)
LAION ▷ #research (48 messages🔥):
- Spiking Neural Network Speculations: @max_voltage wonders if advancements might lead to a reintroduction of spiking neural networks, proposing time dithering as a technique to enhance precision. @spirit_from_germany agrees, reminded of spiking networks by the concept.
- Contemplating Low Information Density in Models: @max_voltage expresses surprise at the ability to lower information to 1-2 bits per weight in models, indicating a low information density in current models. @thejonasbrothers explained this is possible due to the innate sparsity of existing networks, while some weights could even be 1-bit or 0-bit.
- New AI Image Generator Buzz: @vrus0188 shares a Reddit post about a new AI image generator that's reportedly 8 times faster than OpenAI's best tool and can run on modest computers. @spirit_from_germany provides a link to the KOALA image generator site for quality testing without cherry-picking.
- EMO: Creating Expressive Portrait Videos: The EMO project is highlighted by @helium__, presenting a new audio-driven portrait-video generation method. @itali4no remarks it has the same authors as the Animate Anyone paper, indicating a likely absence of released code.
- AI Icon Generation Model Release: @kopyl announces the release of a state-of-the-art AI model for icon generation, trained with a personal investment of $2000 and available via Hugging Face. @chad_in_the_house praises the model's low noise, although @kopyl advises that it only generates images at 256px resolution.
- Language Model Distillation Learning Inquiry: @jh0482 seeks information on distillation learning specifically for embedding language models, discussing concerns related to continuous-space targets. @itali4no suggests standard distillation methods might apply, but @jh0482 considers regression towards the target and contrastive learning as potential methods.
Links mentioned:
- KOALA: Self-Attention Matters in Knowledge Distillation of Latent Diffusion Models for Memory-Efficient and Fast Image Synthesis: SOCIAL MEDIA DESCRIPTION TAG TAG
- Elucidating the Design Space of Diffusion-Based Generative Models: We argue that the theory and practice of diffusion-based generative models are currently unnecessarily convoluted and seek to remedy the situation by presenting a design space that clearly separates t…
- EMO: EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
- Samsung Develops Industry-First 36GB HBM3E 12H DRAM: Samsung's HBM3E 12H achieves industry's largest capacity HBM with groundbreaking 12-layer stack, raising both performance and capacity by more than 50% Advanced TC NCF technology enhances vertical de…
- Reddit - Dive into anything: no description found
- GitHub - collabora/WhisperSpeech: An Open Source text-to-speech system built by inverting Whisper.: An Open Source text-to-speech system built by inverting Whisper. - collabora/WhisperSpeech
- kopyl/ui-icons-256 · Hugging Face: no description found
- UI icons - v1.0 | Stable Diffusion Checkpoint | Civitai: SOTA model for generating icons. Motivation: I spent $2000 of my own money to train this model. I was unable to monetize it, so I'm sharing it with…
Nous Research AI ▷ #off-topic (21 messages🔥):
- Emoji Reacts Tell a Story: @leontello and @0xevil employed emotive emojis, with the former using a salute emoji (`<:o7:1151260455218708480>`) and the latter a skull emoji (`<:dead:1072635189274083409>`) reflecting a sense of conclusion or death, followed by a crying face (`<:f_cry:1159653986681499768>`) in response to the absence of GPT-5.
- Anticipating Future GPT Iterations: Conversation by @0xevil highlighted the community's anticipation for future GPT versions, mentioning the non-existent GPT-6 and responding humorously to @error.pdf's mention of GPT-9 with a surprised emoji (`<:ooo:1133962720232865843>`).
- Monitor and Dock Recommendations: @denovich shared a YouTube video reviewing Dell's new 5K monitor and suggested that Dell offers monitors that can connect to multiple machines simultaneously, mentioning that their docking stations, particularly the Dell Thunderbolt Dock WD22TB4, are worth considering and can be found on eBay.
- Anticipations on Y Combinator's Batch Focus: @0xevil pondered whether Y Combinator's latest batch predominantly featured companies offering GPT-wrapper services, observing similarities with existing products and innovations in areas like transcription and code generation from design.
- Speculations and Shared Resources Surrounding GPT Patents and Applications: @0xevil mulled over the GPT-6 patent possibly discussed in broader circles and noted the integration of AI agents with music generation, while @pradeep1148 shared a YouTube video demonstrating how to fine-tune the Gemma model using Unsloth.
Links mentioned:
- Oppenheimer Oppenheimer Movie GIF - Oppenheimer Oppenheimer movie Oppenheimer explosions - Discover & Share GIFs: Click to view the GIF
- Finetune Gemma 7B with Unsloth: We will take a look at how to finetune the Gemma model using Unsloth. https://colab.research.google.com/drive/10NbwlsRChbma1v55m8LAPYG15uQv6HLo?usp=sharing#scrollT…
- One Month with the Best Monitor in the World: The New Dell 40" 5K120 HDR U4025QW: Dave spends a month with the brand new Dell 5K120 HDR monitor. For my book on life on the Spectrum: https://amzn.to/49sCbbJ Follow me on Facebook at http://f…
Nous Research AI ▷ #interesting-links (6 messages):
- 1-bit Revolution in LLMs: @deki04 shared an arXiv paper introducing BitNet b1.58, a new 1-bit Large Language Model that achieves comparable performance to full-precision models while being more cost-effective. The model presents a "new scaling law" for designing high-performance yet cost-efficient LLMs.
- Curiosity Piqued by BitNet: @deki04 expressed surprise at the existence of 1-bit LLMs, not having encountered the concept before.
- Scaling Laws Under the Microscope: @sherlockzoozoo commented that multiplicative scaling laws are interesting, presumably in the context of the 1-bit LLM, and noted that additive scaling doesn't perform well with increasing model size.
- New LLM Benchmark Released: @tarruda shared a link to Nicholas Carlini's benchmark for Large Language Models, highlighting its unique tests that include a range of complex tasks and the use of a dataflow domain-specific language for easy test additions.
- Benchmark Results on Mistral vs GPT-4: Following the benchmark share, @tarruda mentioned a YouTube video where someone ran the benchmark on various models, including some 7B models like Mistral, as well as GPT-4.
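The "1.58 bits" in the paper's title is log2(3): every weight takes one of three values, {-1, 0, 1}. A toy sketch of the absmean-style ternary quantization the BitNet b1.58 paper describes, in plain Python on a weight list (an illustration of the rounding rule, not the actual training recipe):

```python
def absmean_ternary(weights):
    """Quantize weights to {-1, 0, 1}: scale by the mean absolute value,
    then round and clip into the ternary set (absmean quantization)."""
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    return [max(-1, min(1, round(w / scale))) for w in weights]

print(absmean_ternary([0.9, -0.05, 0.4, -1.2]))  # [1, 0, 1, -1]
```

Because every quantized weight is -1, 0, or 1, matrix multiplication reduces to additions and subtractions, which is where the claimed cost savings come from.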
Links mentioned:
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits: Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single param…
- My benchmark for large language models: no description found
- Mistral Large vs GPT4 - Practical Benchmarking!: ➡️ One-click Fine-tuning & Inference Templates: https://github.com/TrelisResearch/one-click-llms/ ➡️ Trelis Function-calling Models (incl. OpenChat 3.5): http…
Nous Research AI ▷ #general (205 messages🔥🔥):
- Ragtag Ruminations on RAG: @natefyi_30842 discussed using an LLM to create Q&A pairs that are then fine-tuned on and combined with RAG for better context understanding.
- Issues with Service Providers and Fine-Tuning: @teknium commented that fine-tuning providers are facing issues due to conflicts between fine-tune mixing and scaled inference code, making local GGUF setups the only reliable option currently.
- Troubles with Gemini 2B Fine-Tuning: @lmmint asked the community if anyone had succeeded in fine-tuning Gemini 2B and mentioned high-quality data as a requirement.
- CausalLM's Impressive MMLU Score: @nonameusr expressed surprise at CausalLM's high MMLU benchmark and shared a link provided by @giftedgummybee to the Hugging Face model CausalLM/34B-preview.
- Excitement Around the Release of HyenaDNA: Discussions around Stanford's HyenaDNA, a long-range genomic model with a 1-million-token context, generated buzz, with @euclaise suggesting "fill in the middle" (FIM) might suit DNA sequences better than autoregressive modeling.
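"Fill in the middle" rewrites each training sequence so the model conditions on both a prefix and a suffix before generating the span between them, which is plausible for DNA, where context on both sides of a region matters. A toy sketch; the sentinel token names follow StarCoder-style conventions and are illustrative:

```python
import random

def make_fim_example(seq: str, rng: random.Random) -> str:
    """Pick a random middle span and rearrange the sequence so the model
    sees prefix + suffix first, with the middle moved to the end as the
    prediction target."""
    i, j = sorted(rng.sample(range(len(seq) + 1), 2))
    return (f"<fim_prefix>{seq[:i]}<fim_suffix>{seq[j:]}"
            f"<fim_middle>{seq[i:j]}")

print(make_fim_example("ACGTACGT", random.Random(0)))
```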
Links mentioned:
- Tweet from undefined: no description found
- HyenaDNA: learning from DNA with 1 Million token context: HyenaDNA is a long genomic sequence model trained on the Human Reference Genome with context length of up to 1 million tokens.
- CausalLM/34B-preview · Hugging Face: no description found
- qualcomm (Qualcomm): no description found
- Embedding - GPT4All Documentation: no description found
- OpenAI Five defeats Dota 2 world champions: OpenAI Five is the first AI to beat the world champions in an esports game, having won two back-to-back games versus the world champion Dota 2 team, OG, at Finals this weekend. Both OpenAI Five and De…
- Tweet from TechCrunch (@TechCrunch): Tim Cook says Apple will "break new ground" in GenAI this year https://tcrn.ch/3Ig8TAX
- UniProt: no description found
- sordonia (Alessandro Sordoni): no description found
- supertrainer2000/supertrainer2k/optim/adalite.py at master · euclaise/supertrainer2000: Contribute to euclaise/supertrainer2000 development by creating an account on GitHub.
- GitHub - nestordemeure/question_extractor: Generate question/answer training pairs out of raw text.: Generate question/answer training pairs out of raw text. - nestordemeure/question_extractor
- BAAI/bge-base-en-v1.5 · Hugging Face: no description found
- Models: Remove system prompt of Nous-Hermes-2-Mistral-7b-DPO by ThiloteE · Pull Request #2054 · nomic-ai/gpt4all: Describe your changes Adds "accepts various system prompts" Removes system prompt fix whitespace Checklist before requesting a review I have performed a self-review of my code. If it is…
- CausalLM/34b-beta · Hugging Face: no description found
Nous Research AI ▷ #ask-about-llms (45 messages🔥):
- Seeking GPT-4 Level on a Budget: @natefyi_30842 sought a cheaper alternative to GPT-4 that can avoid including provided subsequent book chunks in its responses, finding Mixtral Instruct to work fairly well despite its limitations. The conversation suggests that only GPT-4 behaves as desired in this context.
- Fine-Tuning, a Question of Quantity: Discussing training dataset size, @natefyi_30842 wondered if a hundred entries would suffice as opposed to millions, and @teknium succinctly replied with "5k".
- DPO Tactics in Model Training Discussed: In pursuit of improving model answers, @natefyi_30842 considered generating wrong examples for Direct Preference Optimization (DPO), while users discussed when DPO might be more effective.
- Choosing Separators for Text Manipulation: @natefyi_30842 pondered the efficacy of using standard or unique tokens as separators, such as emojis vs. `%XYZ%`, for adding elements to text in model inputs; @natefyi_30842 shared a link to a tokenizer for context.
- Interpretability and Engineering Representations: @max_paperclips discussed the exciting field of representation engineering, citing a favorite post and referring to work such as Representation Engineering: A Top-Down Approach to AI Transparency and the corresponding GitHub code for the paper.
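The representation-engineering work cited above steers a model by adding a "control vector" (for instance, the mean activation difference between contrastive prompt sets) to its hidden states at inference time. A toy sketch of just that arithmetic on plain lists; the vectors here are made up:

```python
def steer(hidden, control, alpha=1.0):
    """Add a scaled control direction to one hidden-state vector."""
    return [h + alpha * c for h, c in zip(hidden, control)]

hidden = [0.2, -0.1, 0.5]    # hypothetical hidden state
control = [0.3, 0.3, -0.2]   # hypothetical steering direction
print(steer(hidden, control, alpha=2.0))
```

In a real setup the addition is applied inside chosen transformer layers (e.g. via forward hooks), and `alpha` scales how strongly the behavior is expressed.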
Links mentioned:
- Bowing Thank You GIF - Bowing Thank You Tom And Jerry - Discover & Share GIFs: Click to view the GIF
- Representation Engineering Mistral-7B an Acid Trip (https://vgel.me/posts/representation-engineering/): no description found
- Meta's Llama 3 is set to release in July and could be twice the size: Meta's next open-source language model, Llama 3, is scheduled for release in July and is intended to be on par with GPT-4.
Nous Research AI ▷ #project-obsidian (3 messages):
Here's the summary based on the messages provided:
- QT Node-X Twitter Updates: QT Node-X's Twitter shared a series of posts QT Node-X Tweet 1, QT Node-X Tweet 2, and QT Node-X Tweet 3, though the content of the tweets was not provided in the messages.
Latent Space ▷ #ai-general-chat (57 messages🔥🔥):
- Noam Shazeer's Blog Debut: @swyxio shared the first blog post by Noam Shazeer, discussing coding style, titled Shape Suffixes: Good Coding Style.
- Customer Satisfaction and LLMs: @eugeneyan expressed appreciation for a data point indicating that LLMs are on par with humans in customer-service satisfaction and can handle two-thirds of customer-service queries.
- Skepticism on AI News: @swyxio flagged an overhyped news piece, suggesting skepticism when something seems too good, referencing the Klarna AI assistant story on Fast Company.
- Discussion on LLM Paper Club: @swyxio alerted users to a special Matryoshka Embeddings presentation, while @osanseviero and @swyxio referenced additional materials on the topic, including a blog post on HuggingFace and a YouTube channel with simplified explanations of LLM techniques.
- Insights on Lakehouses and Data Engineering: In response to @quicknick123 seeking resources on lakehouses, @swyxio recommended an in-depth guide on table formats, query engines, and the utility of Spark published by Airbyte.
Links mentioned:
- no title found: no description found
- Tweet from Noam Shazeer (@NoamShazeer): https://medium.com/@NoamShazeer/shape-suffixes-good-coding-style-f836e72e24fd Check out my first blog post.
- Matryoshka Representation Learning: Learned representations are a central component in modern ML systems, serving a multitude of downstream tasks. When training such representations, it is often the case that computational and statistic…
- Tweet from murat 🔥 (@mayfer): wow, highly recommend checking out all the samples: https://humanaigc.github.io/emote-portrait-alive/ ↪️ Quoting AK (@_akhaliq) Alibaba presents EMO: Emote Portrait Alive Generating Expressive Por…
- Conviction: no description found
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits: Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single param…
- Efficient NLP: Efficient NLP Consulting My name is Bai Li, I'm a machine learning engineer and PhD in natural language processing. I can help you build cost-effective and efficient NLP systems. Reach me at: Em…
- Data Lake / Lakehouse Guide: Powered by Data Lake Table Formats (Delta Lake, Iceberg, Hudi) | Airbyte: Explains the open-source data lakes and their power with data lake table formats. What's the difference between a lakehouse and when you need one.
- Tweet from Hamel Husain (@HamelHusain): Something smells really wrong about the Klarna news, it's a bit too much made for TV? https://www.fastcompany.com/91039401/klarna-ai-virtual-assistant-does-the-work-of-700-humans-after-layoffs
- Tweet from Rowan Cheung (@rowancheung): It's been a huge day for AI with announcements from Alibaba, Lightricks, Ideogram, Apple, Adobe, OpenAI, and more. The 7 most important developments that happened: 1. Alibaba researchers unveile…
- 🪆 Introduction to Matryoshka Embedding Models: no description found
- Jonathan Ross at Web Summit Qatar: Groq CEO & Founder, Jonathan Ross, on Center Stage at #WebSummitQatar2024, discussing how to make AI Real. X (fka Twitter): @WebSummitQatar Instagram: @WebSumm…
Latent Space ▷ #ai-announcements (3 messages):
- Replicate CEO in the Podcast Spotlight: @swyxio announced the release of a new podcast episode featuring the CEO of Replicate. The tweet with the link to the episode can be found here.
- MRL Embeddings Paper Club Meeting: @swyxio gave a heads-up about an upcoming event led by @206404469263433728 in the #1107320650961518663 channel, where the authors of the MRL embeddings paper will be present. The event cover can be viewed here.
- Deep Dive into Representation Engineering: @ivanleomk flagged an upcoming session with @796917146000424970 on Representation Engineering 101 in the #1107320650961518663 channel, inviting members to participate and engage with questions.
Links mentioned:
LLM Paper Club (West Edition!) · Luma: This week we'll be covering the paper - Matryoshka Representation Learning (https://arxiv.org/abs/2205.13147) with two of the co-authors Gantavya Bhatt and Aniket Rege. We have moved…
Latent Space ▷ #llm-paper-club-west (165 messages🔥🔥):
- Matryoshka Dolls Embrace AI: User @akusupati shared the paper titled "Matryoshka Representation Learning" and discussed its potential for creating LLM embeddings with adaptive dimensions. It's a technique that could offer varying levels of abstraction, potentially saving on compute and storage.
- Making Sense of MRL: @swyxio and others engaged in a discussion trying to grasp the quirks of Matryoshka Representation Learning (MRL), including insightful comparisons to PCA on embeddings and how the technique adds together the losses of models at varying dimensions for optimized learning.
- Deployment Insights and Applications: Participants like @ivanleomk and @gulo0001 offered practical information and demonstrations of embedding models incorporating MRL. They discussed adaptations and provided resources, such as a Supabase blog and a HuggingFace blog, that help explain real-world use of these models.
- Curiosity Reigns in Matryoshka Exploration: @punnicat, presumably one of the authors, was present to field questions and clarify concepts around Matryoshka Embeddings, especially concerning dimensionality, the granularity of embeddings during training, and their implications for models.
- Engagement with Authors and Resources: The session saw curious minds asking about Matryoshka Embeddings and the broader implications for transformer models, with users like @swyxio and @cakecrusher discussing potential applications and improvements. The authors were open to sharing slides and further details, like @punnicat, who can be contacted on Twitter.
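A toy illustration of the nesting idea discussed in the session: one embedding pair is scored at several prefix lengths, and MRL training adds a loss term at each prefix so that short truncations remain usable on their own. The dims and vectors here are made up:

```python
import math

def cosine(u, v):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def nested_similarities(u, v, dims=(2, 4, 8)):
    """Score the same pair at nested prefix lengths (Matryoshka-style)."""
    return {d: cosine(u[:d], v[:d]) for d in dims}

u = [0.5, 0.1, -0.3, 0.8, 0.2, -0.1, 0.4, 0.1]
v = [0.4, 0.2, -0.2, 0.7, 0.1, -0.2, 0.5, 0.1]
print(nested_similarities(u, v))
```

Adaptive retrieval then searches with a cheap short prefix first and re-ranks the shortlist with the full dimension.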
Links mentioned:
- Matryoshka Representation Learning (MRL) from the Ground Up | Aniket Rege: no description found
- Nextra: the next docs builder: Nextra: the next docs builder
- MatFormer: Nested Transformer for Elastic Inference: Transformer models are deployed in a wide range of settings, from multi-accelerator clusters to standalone mobile phones. The diverse inference constraints in these scenarios necessitate practitioners…
- Representation Engineering Mistral-7B an Acid Trip (https://vgel.me/posts/representation-engineering/#How_do_we_make_one?_Is_it_hard?): no description found
- Matryoshka embeddings: faster OpenAI vector search using Adaptive Retrieval: Use Adaptive Retrieval to improve query performance with OpenAI's new embedding models
- Matrioska Loop GIF - Matrioska Loop Bored - Discover & Share GIFs: Click to view the GIF
- AdANNS: A Framework for Adaptive Semantic Search: Web-scale search systems learn an encoder to embed a given query which is then hooked into an approximate nearest neighbor search (ANNS) pipeline to retrieve similar data points. To accurately capture…
- 🪆 Introduction to Matryoshka Embedding Models: no description found
- NeuML/pubmedbert-base-embeddings-matryoshka · Hugging Face: no description found
- Representation Engineering 101: no description found
Perplexity AI ▷ #general (157 messages🔥🔥):
- Activation Woes for Rabbit R1 Promo: @mithrilman required assistance activating the Rabbit R1 promo. @icelavaman provided step-by-step instructions, emphasizing the need to use the email link, and suggested contacting support for further help, especially since the email button appeared bugged and non-clickable.
- Podcast Curiosities and Clarity: @_paradroid raised a question about podcasts posting under the name "Perplexity AI," prompting @icelavaman to clarify the official podcast link, while @ok.alex stated that unauthorized use of the Perplexity AI name is likely for attention or money.
- Understanding AI Model Preferences: New user @outrerim asked about the strengths and weaknesses of different AI models, and @jaicraft outlined core use cases for the Experimental, GPT-4 Turbo, Claude, and Mistral models, though opinions differed, with users like @.claidler and @naivecoder786 favoring Mistral for code queries.
- Discussing Perplexity's Capabilities and Limitations: @brknclock1215 described Perplexity's AI as excellent for internet-based information handling and rapidly answering questions, but highlighted limitations such as parsing large files and image generation, tasks it is less optimized for.
- Concerns and Solutions for Perplexity Service Issues: Users @stevvie and @dv8s ran into confusion over the absence of file-upload options and the name change from "Copilot" to "Pro," while @moyaoasis suggested adding a feature for exporting Perplexity thread responses, a function not yet available but considered for future implementation.
Links mentioned:
- Tweet from Perplexity (@perplexity_ai): More on Mistral Large: https://www.perplexity.ai/search/Mistral-Large-Overview-Fw.QrWxvR9e9NRuDxB1wzQ
- Discover Daily by Perplexity on Apple Podcasts: News · 2024
- Perplexity AI on Apple Podcasts: News · 2024
- Stuff You Should Know About AI on Apple Podcasts: Business · 2024
Perplexity AI ▷ #sharing (13 messages🔥):
- Librem5 Explores BurpSuite Community Edition: @librem5 shared a Perplexity link examining the differences between BurpSuite Community Edition and an unspecified alternative.
- Muscle-Building Plan Crafted by AI: @commuting5048 requested a muscle-building plan optimized to protect arms from over-fatigue, and shared the resulting Perplexity search. They expressed satisfaction with GPT-4's detailed workout, including sets and reps.
- Ourdigital Investigates Digital Analytics with Perplexity: @ourdigital used Perplexity to gather and organize information for digital analytics and performance marketing, sharing the findings in a Perplexity link.
- Exploring Mistral's Capabilities: Several users, including @manbearpig86, @rhysd21, and @dailyfocus_daily, looked into comparisons between Mistral and other models like ChatGPT, as reflected in their shared Perplexity search links, another comparison, and a Starcoder announcement.
- Podcast Prompt Crafting and AI Future Discussions: @_paradroid shared a Perplexity link for crafting a podcast prompt for "48 Hours of AI" and another link discussing Russia's preparation for future challenges, likely with AI, using a ResearchGPT prompt (ResearchGPT prompt link).
Perplexity AI ▷ #pplx-api (28 messages🔥):
- Glitch Hunt in Text Generation: @thedigitalcat pointed out that glitches often occur when the system attempts to generate source information during text production. Other users like @brknclock1215 and @clay_ferguson contributed to the discussion, suggesting the issue could relate to the implementation of sources and the inference layer's approach.
- Sonar Medium's Weather Query Passion: @brknclock1215 humorously continued to test `sonar-medium-online` with weather-related queries, reporting inconsistent retrieval behavior and making observations about the presence of "responsive" elements in system messages.
- The Nostalgia for pplx-70b: Amid discussions on model performance, @thedigitalcat humorously suggested that everyone will eventually agree that pplx-70b was superior to the sonar models, with @lazysucker expressing agreement.
- The API Conundrum: @jeffworthington encountered an error when using an OpenAPI definition from the provided documentation and asked whether a newer version should be referenced, indicating potential issues with the existing API definitions.
- Seeking Perplexity's API for Voice Chat: @tom_primozic inquired about using Perplexity AI's functionality through an API for a voice-chat application, noting discrepancies in response quality between the website and the `sonar-medium-online` model.
Links mentioned:
Getting Started with pplx-api: You can access pplx-api using HTTPS requests. Authenticating involves the following steps: Start by visiting the Perplexity API Settings page. Register your credit card to get started. This step will n…
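For orientation, pplx-api follows the familiar OpenAI-style chat-completions shape. A minimal sketch that only constructs the request: the endpoint and model name follow Perplexity's public docs, the token is a placeholder, and the actual network call is deliberately left out:

```python
import json
import urllib.request

API_URL = "https://api.perplexity.ai/chat/completions"

def build_request(token: str, model: str, question: str) -> urllib.request.Request:
    """Assemble an authenticated chat-completions request (not sent here)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )

req = build_request("pplx-...", "sonar-medium-online", "Weather in SF today?")
# urllib.request.urlopen(req) would perform the call.
print(req.full_url)
```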
Eleuther ▷ #announcements (1 message):
- Launch of Foundation Model Development Cheatsheet: @hailey_schoelkopf announced the release of The Foundation Model Development Cheatsheet, a resource to assist new open model developers. The cheatsheet was a collaborative effort featuring contributors from EleutherAI, MIT, AI2, Hugging Face, and other institutions, aiming to provide an overview of resources for responsible open model development.
- The Cheatsheet Champions Open Model Pioneers: Highlighting the importance of open model development, @hailey_schoelkopf pointed to fully transparent models such as the Pythia model suite by EleutherAI, Amber by the LLM360 project, and AI2's OLMo, emphasizing the growth of openly available models since April 2023.
- Focus on Dataset Documentation and Licensing: The new resource focuses on important and under-discussed areas of model development, such as dataset documentation and licensing practices, which are crucial for creating open models.
- Where to Find the Cheatsheet: The Foundation Model Development Cheatsheet can be accessed as a PDF paper or viewed as an interactive website. Updates and additional context are available in their blog post and Twitter thread.
Eleuther ▷ #general (34 messages🔥):
- Seeking Cross-Attention SSM Model: @_michaelsh inquired about models with cross-attention similar to BERT for sequence classification; @stellaathena suggested models could be trained as encoders and later mentioned StripedHyena, which alternates attention and SSM layers. @frazermc favored adaLN0 with mamba, and although there wasn't a pretrained mamba for sequence classification readily available, it was suggested that one could train a classification head on an existing checkpoint.
- Stable Video Diffusion Inquiry: @clashluke was looking for guidance on how to train/fine-tune the stable video diffusion model, looking to retain its v-prediction while noting it uses EulerDiscrete without a get_velocity function for training.
- Understanding lm-evaluation-harness: Several users, including @slowturtle_p, @hailey_schoelkopf, and @maya_liv, discussed nuances of the lm-evaluation-harness evaluation tool, including score normalization, model substitution with custom code, and potential TensorRT support. @stellaathena provided a link to a blog post for further clarification on multiple-choice normalization.
- EleutherAI Pythia Model Status: Question from @mistobaan about the status of the EleutherAI/pythia-13m model, to which @catboy_slim_ clarified it is still available if referring to the 14m variant.
- Various Discussion and Announcements: Users like @canadagoose1 shared logistical challenges and announcements about talks, @gaindrew highlighted an abstract of a research paper introducing a 1-bit Large Language Model, @tastybucketofrice and @hailey_schoelkopf celebrated user engagement with specific datasets, and @ilovescience noted automated downloads likely from using lm-eval-harness.
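The normalization discussion maps to how multiple-choice tasks are scored: a minimal sketch of raw versus length-normalized answer selection (the log-likelihoods and byte lengths below are made-up numbers, purely for illustration):

```python
# Hypothetical per-choice scores: total log-likelihood of each answer
# continuation under the model, plus the continuation's byte length.
choices = {
    "short answer": {"loglik": -12.0, "num_bytes": 12},
    "a much longer but better answer": {"loglik": -15.0, "num_bytes": 31},
}

# acc: argmax of raw log-likelihood (tends to favor short continuations).
acc_pick = max(choices, key=lambda c: choices[c]["loglik"])

# acc_norm: argmax of log-likelihood divided by byte length, so answers of
# unequal length compete on a per-byte basis.
acc_norm_pick = max(choices, key=lambda c: choices[c]["loglik"] / choices[c]["num_bytes"])
```

Here the raw score picks the short answer, while the normalized score prefers the longer one — exactly the kind of difference the linked blog post on normalization methods discusses.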
Links mentioned:
- Multiple Choice Normalization in LM Evaluation: There are multiple ways of evaluating multiple choice tasks on autoregressive LMs like GPT-3/Neo/J. This post lays out the current prevalent normalization methods.
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits: Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single param…
- Oogway Master Oogway GIF - Oogway Master Oogway Kung Fu Panda - Discover & Share GIFs: Click to view the GIF
- Meet: Real-time meetings by Google. Using your browser, share your video, desktop, and presentations with teammates and customers.
- Issues · EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of language models.
Eleuther ▷ #research (63 messages🔥🔥):
- Open Source Models Galore: @maxmatical shared a Twitter link to some open-sourced models with accompanying data, posting a tweet from BigCodeProject.
- Pretraining Token Queries: In a discussion initiated by @leegao_ about the pretraining token-to-model size ratio, @stellaathena clarified, "There are no rules," regarding the expectations of tokens for pretraining models. @maxmatical provided a link to a paper on arXiv discussing pretraining with constrained data.
- Navigating Mazes with Diffusion Models: @.the_alt_man highlighted a diffusion model trained to solve mazes, sharing tweets from @francoisfleuret and @ArnaudPannatier. @uwu1468548483828484 also chimed in, relating it to prior work on solving mazes with variable depth neural networks.
- Prompt Engineering Transferability Discourse: @thatspysaspy asked if there's been study on prompt engineering transfer from small to big models; @catboy_slim_ replied with personal experiences, noting that while generic engineering transfers reasonably well, complex instructions tend to be tightly coupled with specific models. A systematic study with statistical measures seems to be an untapped area.
- The Challenges of Sub-8-Bit Quantization: A series of messages from @kd90138 and @clock.work_ expressed skepticism about the practicality and scaling potential of 1-bit Large Language Models given current hardware trends and geopolitical concerns impacting chip manufacturing.
Links mentioned:
- Stable LM 2 1.6B Technical Report: We introduce StableLM 2 1.6B, the first in a new generation of our language model series. In this technical report, we present in detail the data and training procedure leading to the base and instruc…
- Language Modeling by Estimating the Ratios of the Data Distribution | Aaron Lou: no description found
- Scaling Data-Constrained Language Models: The current trend of scaling language models involves increasing both parameter count and training dataset size. Extrapolating this trend suggests that training dataset size may soon be limited by the…
- LeoLM: Igniting German-Language LLM Research | LAION: We proudly introduce LeoLM (Linguistically Enhanced Open Language…
- Tweet from François Fleuret (@francoisfleuret): We train a discrete diffusion denoising model to find paths in a maze. The visualization of the evolution of x_0|x_t (last message in the thread) is very cool IMO. ↘️ Quoting Arnaud Pannatier (@Arnau…
Eleuther ▷ #scaling-laws (3 messages):
- Inquiring About Animation Creation: @.the_alt_man asked how a certain animation was made, expressing curiosity about the method or tool used.
- imageio for GIFs: In response, @kyo_takano mentioned that imageio was used to create the GIF animation. @.the_alt_man followed up for confirmation to clarify that the animation was indeed created with imageio.
Eleuther ▷ #interpretability-general (15 messages🔥):
- Matrix Norms and Products Simplified: @wendlerc explained that matrix-vector & matrix-matrix products, as well as matrix norms, are shorthand for computing and summing up important cosines. The matrix 2-norm is specifically the matrix norm associated with the vector 2-norm.
- Decoding Details in RMSNorm Implementation: @wendlerc clarified a subtle detail that their paper does not explicitly mention: the final decoding step involves an RMSNorm layer application to h before matrix multiplication. They described a computational split of this process for ease in cosine calculations between resulting expressions.
- Unpacking the Tuned Lens Decoding Process: @wendlerc and @mrgonao discussed the mechanism of decoding using a tuned lens in neural networks. They considered whether logits = U RMSNormlayer(tunedlens(h)) accurately represents the tuned lens's activity.
- Implementation Nuances of Tuned Lens and Notation: Throughout the conversation, @wendlerc addressed the practical aspects of porting their implementation to consider the tuned lens's effect, highlighting the necessity of substituting h with tunedlens(h).
- Understanding Matrix Norm Terminology: @norabelrose clarified the terminology around matrix norms, stating that the Frobenius norm relates to the Euclidean norm of the matrix when flattened, whereas the "2-norm" of a matrix refers to its spectral norm or top singular value.
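The decoding recipe under discussion can be sketched numerically — the shapes, the identity-initialized lens, and the all-ones gain vector below are illustrative assumptions, not the paper's actual parameters:

```python
import numpy as np

def rmsnorm(h, gain, eps=1e-6):
    # RMSNorm: divide by the root-mean-square of h, then apply the gain.
    return h / np.sqrt(np.mean(h ** 2) + eps) * gain

d_model, vocab = 8, 16
rng = np.random.default_rng(0)
h = rng.normal(size=d_model)           # hidden state at some layer
U = rng.normal(size=(vocab, d_model))  # unembedding matrix
gain = np.ones(d_model)

# Tuned lens modeled as a per-layer affine map (identity here, for shape only).
A, b = np.eye(d_model), np.zeros(d_model)
tuned = A @ h + b

# logits = U RMSNorm(tunedlens(h)), the expression debated in the thread.
logits = U @ rmsnorm(tuned, gain)
```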
Eleuther ▷ #lm-thunderdome (19 messages🔥):
- Tinkering with LM Eval Harness: @paganpegasus inquired about integrating instruction/chat formatting into the LM Eval harness, or considering finetuning on examples with existing eval harness formatting.
- Custom Model Modification for Hallucination Leaderboard: @pminervini shared a snippet of code from their approach to incorporate chat templates into the LM Eval harness for the hallucinations leaderboard, by extending the HFLM class.
- Awaiting Progress on Proposed Modifications: @asuglia updated @981242445696221224 on the status of modifications being identified for a project, noting other tasks had taken precedence.
- Improving Multilingual Lambada Translations: @hailey_schoelkopf mentioned that @946388490579484732 contributed new, higher-quality translations to replace poor quality ones, and the changes will be integrated into the eval harness. The updated dataset includes additional languages and is available on Hugging Face.
- Implementing EQ-Bench: @pbevan1 sought advice on implementing EQ-Bench, a benchmark for emotional intelligence in language models, especially tasks that handle multiple answers for a single prompt. @hailey_schoelkopf pointed to the Truthfulqa_mc2 task as an example.
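Structurally, the chat-template approach amounts to wrapping each plain eval prompt in a single user turn before encoding. A library-free sketch of that idea — the tag syntax is invented for illustration (a real tokenizer's apply_chat_template would produce model-specific tags), and which harness hook you override varies by lm-eval version:

```python
def apply_chat_template(messages):
    # Stand-in for tokenizer.apply_chat_template; tag format is hypothetical.
    parts = [f"<|{m['role']}|>\n{m['content']}" for m in messages]
    parts.append("<|assistant|>\n")  # generation prompt
    return "\n".join(parts)

def encode_for_eval(prompt: str) -> str:
    # Wrap the harness's plain prompt in a single user turn.
    return apply_chat_template([{"role": "user", "content": prompt}])
```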
Links mentioned:
- src/backend/huggingface_generate_until.py · hallucinations-leaderboard/leaderboard at main: no description found
- GitHub - EQ-bench/EQ-Bench: A benchmark for emotional intelligence in large language models: A benchmark for emotional intelligence in large language models - EQ-bench/EQ-Bench
- marcob/lambada_multilingual · Datasets at Hugging Face: no description found
Eleuther ▷ #multimodal-general (2 messages):
- Choosing Between Encoder-Decoder and Decoder-Only Models: User @jerry0478 inquired about when to use cross-attention conditioning as seen in encoder-decoder models compared to embedding tokens in the input for decoder-only models.
- Flamingo vs. LLaMA Architecture Decisions: @jerry0478 contrasted "llama-style" architectures with "flamingo-style" ones, probing the community on intuition for optimal application scenarios of each.
Eleuther ▷ #gpt-neox-dev (2 messages):
- Inquiring about Neox and Slurm: @muwnd asked for the recommended method to run Neox with Slurm and containers, suspecting that --launcher_args might be the way but noting it seems unavailable in Neox.
- Tip on Neox Infrastructure: @triggerhappygandhi clarified that Neox does not assume any specifics about the infrastructure, and containers need to be set up in advance. A slurm script exists for using Slurm to run Neox on multinode.
LangChain AI ▷ #general (89 messages🔥🔥):
- Seeking Confidence Score Insight: User @ritanshoo inquired about checking the confidence score when using LangChain.js for RAG. Kapa.ai did not have an immediate answer but referred to the LangChain documentation (https://js.langchain.com/docs/get_started) for further exploration.
- Contemplating Memory Integration with LCEL: Both @marknicholas and @pcube__ discussed different aspects of LangChain usage. @marknicholas wanted to add memory to LCEL, and @pcube__ inquired about which language integrates best with LangChain for a server using an Azure-hosted LLM as an API endpoint. Kapa.ai suggested consulting official documentation or reaching out to the community for specific guidance.
- Handling Tool Exceptions in Custom Applications: @abinandan requested a way to retry a tool if ToolException is thrown when using a custom tool. Kapa.ai highlighted workarounds from LangChain's GitHub discussions and encouraged checking LangChain's GitHub issues for more streamlined solutions (https://github.com/langchain-ai/langchain/issues/10714).
- Using Shopify as an Automated Agent/Tool: User @erikk4 sought automation solutions for customer support tasks related to Shopify, such as checking order statuses or canceling orders. They considered "front desk" agents routing issues to specific tools and queried the community for tools beyond LangChain that might facilitate this process.
- Deployment Issues and Adding Functionality with LangChain: Users conveyed challenges with LangChain's deployment and functionality. @hanumantgarad_25732 experienced an AttributeError when using SQLDatabase.from_databricks outside a Databricks notebook. @kamakshi08 asked about using the JSON parser with LLaMA from Ollama, wondering how it integrates with multimodal models.
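Absent a built-in option, the retry-on-ToolException pattern can be sketched with a plain wrapper — ToolException is stubbed here; in a real app you would import LangChain's own class instead:

```python
import time

class ToolException(Exception):
    """Stub standing in for LangChain's ToolException."""

def retry_tool(fn, max_attempts=3, backoff=0.0):
    # Re-invoke the tool when it raises ToolException, with optional linear
    # backoff; re-raise once max_attempts is exhausted.
    def wrapped(*args, **kwargs):
        for attempt in range(1, max_attempts + 1):
            try:
                return fn(*args, **kwargs)
            except ToolException:
                if attempt == max_attempts:
                    raise
                time.sleep(backoff * attempt)
    return wrapped
```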
Links mentioned:
- no title found: no description found
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- Revolutionizing AI Interactions: Integrating Function Calling with Mistral: Introduction
- Querying a SQL DB | 🦜️🔗 Langchain: We can replicate our SQLDatabaseChain with Runnables.
- JSON parser | 🦜️🔗 Langchain: This output parser allows users to specify an arbitrary JSON schema and
- Docusaurus | 🦜️🔗 Langchain: Docusaurus is a static-site generator which
- Custom Agent Class fails with object has no attribute 'is_single_input' · Issue #18292 · langchain-ai/langchain: Checked other resources I added a very descriptive title to this issue. I searched the LangChain documentation with the integrated search. I used the GitHub search to find a similar question and di…
- Groq: Insanely Fast Inference 🚀 | World's First Language Processing Unit (LPU): In this video, I will explain about Groq who introduced World's first Language Processing Unit (LPU) designed for AI applications (LLMs). I will show you how…
- Deployment | 🦜️🔗 Langchain: In today's fast-paced technological landscape, the use of Large Language Models (LLMs) is rapidly expanding. As a result, it is crucial for developers to understand how to effectively deploy thes…
- langchainjs/langchain/src/retrievers/score_threshold.ts at e24d2dedbe7ff93db33a5809e604143d60113028 · langchain-ai/langchainjs: 🦜🔗 Build context-aware reasoning applications 🦜🔗. Contribute to langchain-ai/langchainjs development by creating an account on GitHub.
- Issues · langchain-ai/langchain: 🦜🔗 Build context-aware reasoning applications. Contribute to langchain-ai/langchain development by creating an account on GitHub.
- GenAI Summit San Francisco 2024: This summit is an extraordinary convergence of the brightest minds in Generative AI, encapsulating the spirit of the future. #AI_ARE_ALL
LangChain AI ▷ #langserve (3 messages):
- LangServe Agent Troubles: @thatdc reported an issue where their agent is not returning the intermediate steps of execution when using langserve; however, it works fine when invoking directly from the agent class. They deduced the problem might be with the API server setup by langserve.
- Deep Dive into the Tech Snag: @thatdc believes they have found the problem in the RemoteRunnable object, where the _decode_response method seems to lose the intermediate steps by executing serializer.loadd(obj["output"]). They're in search of a workaround for this issue.
LangChain AI ▷ #langchain-templates (2 messages):
- Invitation to Join the Discord Party: @davisson0429 posted a Discord invite link for users to join, accompanied by a lengthy series of separator characters.
- Seeking Python Template Wisdom: @tigermusk inquired about generating a template in Python code that resembles the one found at the Smith LangChain Chat JSON Hub.
Links mentioned:
- LangSmith: no description found
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
LangChain AI ▷ #share-your-work (4 messages):
- "LangChain in your Pocket" Hits the Shelves: User @mehulgupta7991 celebrated the listing of their debut book "LangChain in your Pocket" under Google's Best books on LangChain.
- Flood of Discord Invites: @davisson0429 shared an invite link to a Discord server with a string of obscured characters following the URL, and an @everyone tag, possibly indicating a call to join.
- Calling All Learners: User @silvermango9927 shared a Google Form link soliciting feedback on interest in various topics such as Machine Learning, Data Science, and Web Development, as part of a validation process for a project they are considering.
- Voices of the Future: @beaudjango introduced "Pablo," an AI Voice Chat app that supports multiple LLMs and voices without the need for typing, inviting beta testers to join with an offer for free AI credits. They mentioned looking for engineers willing to join their team using LangChain.
Links mentioned:
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- Join the Pablo - AI Voice Chat beta: Available on iOS
- Product Idea Validation Form: Hi, thank you so much for filling in this form and giving a response. The idea: Creating a lab (course) that teaches in a project-based manner compared to all of the conventional longer video-heavy…
LangChain AI ▷ #tutorials (4 messages):
- Question on LangGraph Capabilities: User @tigermusk inquired whether workflow.compile() is a runnable object in LangGraph.
- Spam Alert: @davisson0429 posted an unrelated and spammy invite link to an external Discord server filled with severe text repetition.
- Groq's LPU Breakthrough Showcased: @datasciencebasics shared a YouTube video titled "Groq: Insanely Fast Inference 🚀 | World's First Language Processing Unit (LPU)" highlighting the introduction of the world's first Language Processing Unit designed for AI applications, showcasing its potential for LLMs.
- LangGraph + YahooFinance Tutorial: @tarikkaoutar provided a video guide explaining how to create an AI stock analysis chatbot using LangGraph, Function call, and YahooFinance, enhancing understanding of multi-agent applications.
Links mentioned:
- Join the ONE PERCENT CLUB Discord Server!: Check out the ONE PERCENT CLUB community on Discord - hang out with 16193 other members and enjoy free voice and text chat.
- LangGraph + Function Call + YahooFinance = Multi-Agent Application: #chatbot #animation #trading #ai #machinelearning #datascience In this video, you will make an AI stock analysis chatbot with LangGraph, Function call and C…
- Groq: Insanely Fast Inference 🚀 | World's First Language Processing Unit (LPU): In this video, I will explain about Groq who introduced World's first Language Processing Unit (LPU) designed for AI applications (LLMs). I will show you how…
OpenAccess AI Collective (axolotl) ▷ #general (44 messages🔥):
- Trouble in Jupyter Town: @nruaif shared a log indicating issues with Jupyter notebooks, showing error messages related to extensions being linked and a Bad config encountered during initialization. @nanobitz chimed in asking if it was a template or Jupyter issue.
- BitNet b1.58 Makes Waves: @_dampf shared an arXiv paper on BitNet b1.58, a 1-bit LLM that promises significant cost-efficiency with performance matching full-precision models. @nanobitz mentioned it's not just a quantization method but a new architecture.
- Axolotl User Survey Outreach: @caseus_ is seeking feedback through a questionnaire to improve understanding of axolotl users. @dreamgen suggested making the form more concise to get more responses.
- Mistral Office Hours Announcement: @casper_ai shared an invite to the next Mistral AI office hour.
- Alpaca Formatting for Inference: @j_sp_r inquired about formatting inference prompts to match the training instruction format, and @caseus_ responded that specifying chat_template: alpaca in the axolotl YAML will handle it.
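For reference, the Alpaca prompt layout that chat_template: alpaca targets looks like the widely circulated template below (axolotl's exact rendering may differ slightly):

```python
# Standard Alpaca prompt template, with and without an input field.
def alpaca_prompt(instruction: str, inp: str = "") -> str:
    if inp:
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{inp}\n\n### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:\n"
    )
```

At inference time the generated text is whatever the model produces after the trailing "### Response:" marker.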
Links mentioned:
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits: Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single param…
- TinyBox packs a punch with six of AMD's fastest gaming GPUs repurposed for AI — new box uses Radeon 7900 XTX and retails for $15K, now in production: Startup wants to offer high AI performance using Radeon RX 7900 XTX.
- Reddit - Dive into anything: no description found
- Axolotl End User Questionnaire: no description found
OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (9 messages🔥):
- KTO Trainer Implementation Inquiry: @giftedgummybee shared a link to Huggingface's documentation on the Kahneman-Tversky Optimization (KTO) Trainer and asked @257999024458563585 if there are any plans to implement it. @caseus_ responded affirmatively, suggesting they might work on it the following week unless someone else takes it up earlier.
- Sophia: A Speedy Optimizer: @casper_ai discussed the potential of the Sophia optimizer being twice as fast as Adam algorithms and supplied the implementation link (not torch) for Sophia, highlighting its advantage in efficiency over traditional optimization methods.
- Innovative Training with DropBP: @suikamelon brought up a study on Dropping Backward Propagation (DropBP), which reduces computational costs of neural network training while preserving accuracy by dropping layers during backward propagation.
- Starcoder2 Training Support: @faldore inquired about support for Starcoder2, providing a link to its GitHub repository.
Links mentioned:
- DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation: Training deep neural networks typically involves substantial computational costs during both forward and backward propagation. The conventional layer dropping techniques drop certain layers during tra…
- KTO Trainer: no description found
- Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training: Given the massive cost of language model pre-training, a non-trivial improvement of the optimization algorithm would lead to a material reduction on the time and cost of training. Adam and its variant…
- levanter/src/levanter/optim/sophia.py at main · stanford-crfm/levanter: Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax - stanford-crfm/levanter
- GitHub - bigcode-project/starcoder2: Home of StarCoder2!: Home of StarCoder2! Contribute to bigcode-project/starcoder2 development by creating an account on GitHub.
OpenAccess AI Collective (axolotl) ▷ #general-help (22 messages🔥):
- Pondering Plausible Intentions: @nafnlaus00 floated the idea of prompting a sophisticated language model to generate intentionally wrong answers that seem plausible but contain flaws leading to incorrect conclusions, though no further discussion ensued.
- Tool Swap Troubles: @stoicbatman contemplated switching from Runpod to Vast AI due to cost concerns and sought the community's experience comparison; @nanobitz responded noting that although cheaper, Vast AI doesn't abstract machine details and offers variable machine quality.
- Confusing Commit Conundrums: @karisna expressed disappointment that their commit to rewrite documentation for axolotl wasn't accepted and pointed out a possible oversight where WSL2 setup for Windows isn't sufficiently emphasized; however, @nanobitz replied looking to clarify if the documentation issue had been addressed.
- Benchmarks for the Brainy: @jovial_lynx_74856 inquired about running benchmarks on a model finetuned with Axolotl, and @nanobitz suggested looking at lm_eval_harness on GitHub, affirming there's no direct integration for benchmarking within Axolotl itself.
- Save Setting Snafu: Concerned about a saving discrepancy, @duke001. asked why setting saves_per_epoch to 4 and num_epochs to 4 resulted in only 4 checkpoints instead of the expected 16; @nanobitz hinted at a resolution suggesting an adjustment to the save limit.
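The checkpoint arithmetic in the last bullet is what a save-rotation limit produces: 4 saves per epoch over 4 epochs schedules 16 checkpoints, but a total limit of 4 (plausibly Hugging Face Trainer's save_total_limit, an assumption on our part) prunes all but the newest. A toy simulation:

```python
def saved_checkpoints(num_epochs, saves_per_epoch, save_total_limit):
    # Simulate a rotating checkpoint directory: each save appends, then the
    # oldest checkpoints are pruned down to the limit.
    kept = []
    step = 0
    for _ in range(num_epochs):
        for _ in range(saves_per_epoch):
            step += 1
            kept.append(f"checkpoint-{step}")
            if save_total_limit is not None:
                kept = kept[-save_total_limit:]
    return kept
```

With a limit of 4 only the last four checkpoints survive; with no limit all 16 remain.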
Links mentioned:
axolotl/src/axolotl/core/trainer_builder.py at 6b3b271925b2b0f0c98a33cebdc90788e31ffc29 · OpenAccess-AI-Collective/axolotl: Go ahead and axolotl questions. Contribute to OpenAccess-AI-Collective/axolotl development by creating an account on GitHub.
OpenAccess AI Collective (axolotl) ▷ #community-showcase (11 messages🔥):
- Mistral Model Rivals ChatGPT 3.5: @le_mess shared that their 7B Mistral model matches the performance of ChatGPT 3.5 for Danish tasks.
- Performance Strengthens Through Iterative Training: @le_mess improved their models by using a synthetic data approach and training over 30 iterations, enhancing responses over time without relying on GPT-4.
- Initial Human Curation Leads to Scalable Model Training: @le_mess curated the first 1000 responses manually, then employed models to generate more data. Subsequent models were trained to identify high-quality responses for further training cycles.
LlamaIndex ▷ #blog (4 messages):
- Groq Accelerates LlamaIndex: The @GroqInc LPU now officially integrates with LlamaIndex and supports llama2 and Mixtral models for efficient LLM generation. They announced this development with a cookbook guide for streamlining application workflows.
- LlamaParse Sees Soaring Usage: @llama_index reports significant usage of LlamaParse, leading to important updates, such as working towards uncapped self-serve usage and temporarily increasing the usage cap from 1k pages. Details can be found at this update link.
- Optimizing Hybrid Search with LLMs: A new strategy for better retrieval in hybrid search uses LLMs to categorize queries with few-shot examples and subsequently adjust the alpha parameter. @llama_index shares insights into this approach in their latest tweet.
- RAG for Structured and Unstructured Data: @llama_index introduced a blog post by @ClickHouseDB showcasing a RAG architecture suited for queries involving both unstructured and structured data, housed in the same database. Interested readers can delve into this integration here.
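The hybrid-search idea above reduces to blending dense (vector) and sparse (keyword) scores with a query-dependent alpha. A toy sketch — the classifier is a stub standing in for the few-shot LLM, and the category names and alpha values are invented:

```python
def classify_query(query: str) -> str:
    # Stand-in for a few-shot LLM classifier: queries with literal tokens
    # (digits, codes) lean keyword; natural-language questions lean semantic.
    return "keyword" if any(ch.isdigit() for ch in query) else "semantic"

# Illustrative alpha per category; alpha=1.0 is pure vector search,
# alpha=0.0 is pure keyword search.
ALPHA_BY_CATEGORY = {"keyword": 0.2, "semantic": 0.8}

def hybrid_score(dense: float, sparse: float, alpha: float) -> float:
    return alpha * dense + (1 - alpha) * sparse
```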
LlamaIndex ▷ #general (75 messages🔥🔥):
- Exploring LlamaIndex Documentation Indexing: @vaguely_happy proposed setting up a service to index the latest LlamaIndex docs, which prompted @cheesyfishes to mention mendable on docs and @whitefang_jr to note that LlamaParse does not currently send page numbers, though work is in progress to add page numbers and labels.
- Clarification on Callbacks in Golang: As @sansmoraxz questioned the use of CallbackHandler with native types, @cheesyfishes assured a refactor is in progress for callbacks and advised holding off on concerns for the moment due to expected improvements.
- Debating Reranker Models: In a discussion initiated by @richard1861 regarding the superior reranking model between Colbert and Cohere, @.sysfor shared code and suggested using both the FlagEmbeddingReranker and CohereReranker together, despite having no formal metrics to compare their performance.
- Visualizing ReActAgent Pipelines/DAGs: @mrpurple9389 inquired about visualizing the graph for ReActAgent, and while @cheesyfishes clarified that ReActAgent lacks a visual graph, @mrpurple9389 further explored visualizing the agent if replicated using pipelines/DAGs.
- Discussions on LlamaIndex vs. Langchain and Compatibility: @tr1ckydev sought clarification on the differences between LlamaIndex and Langchain, with @cheesyfishes explaining that LlamaIndex focuses on connecting data to LLMs while Langchain is more of a comprehensive library. Follow-up queries included compatibility inquiries, indicating that LlamaIndex can be integrated with various vector databases and LLM platforms.
Links mentioned:
- Introducing LlamaCloud and LlamaParse — LlamaIndex, Data Framework for LLM Applications: LlamaIndex is a simple, flexible data framework for connecting custom data sources to large language models (LLMs).
- Arize Phoenix - Phoenix: no description found
- Ollama - Llama 2 7B - LlamaIndex 🦙 v0.10.14: no description found
LlamaIndex ▷ #ai-discussion (5 messages):
- Model Decay Woes: User @.sysfor expressed concerns that their models have been generating insane responses recently, questioning whether models decay over time, with the hypothesis that nothing else has changed in the setup.
- Cheesyfishes to the Rescue: @cheesyfishes clarified that models do not decay over time, but longer inputs or inputs not structured as instructions could potentially lead to issues with the model's responses.
- Observable Decline in Fine-tuned Performance: Further to the decay question, @.sysfor noticed issues specifically with the "better" fine-tuned models, while running tests to compare against baseline models.
OpenRouter (Alex Atallah) ▷ #general (49 messages🔥):
- Claude Models Prompt Errors: @quentmaker reported an error when a chat has more than 8 alternating messages between user and assistant, affecting various of Anthropic's Claude models. @louisgv acknowledged the issue and promised a fix is in the works.
- OpenRouter Addressing Turn Order Issues: @alexatallah suggested a temporary workaround for the prompt issue by changing the first assistant message to a system message. Meanwhile, development is underway to handle conversations that begin with a message from the assistant.
- Rate Limit Discussions for OpenRouter: @gunpal5_43100 inquired about rate limits when using OpenRouter for generating large numbers of articles. @alexatallah clarified that each user with their own API key would have separate rate limits, which cumulatively should provide sufficient throughput.
- Caching Concerns with Mistral: Several users, including @natefyi_30842 and @spaceemotion, observed similarities in responses when repeating prompts to Mistral models, leading to speculation of caching behavior by the API. @alexatallah confirmed that Mistral's API might cache queries.
- Compatibility with Prepaid Cards: @fakeleiikun asked about OpenRouter's support for prepaid cards, particularly those provided by e-wallet apps. @louisgv indicated that while some prepaid cards might work, virtual cards from unsupported banks might not be accepted due to Stripe's fraud prevention measures.
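The suggested turn-order workaround is a one-line message transform applied before calling the API (message dicts here follow the common OpenAI-style chat shape, an assumption on our part):

```python
def fix_leading_assistant(messages):
    # If the conversation opens with an assistant turn, relabel that first
    # message as a system message, leaving the rest of the history intact.
    if messages and messages[0].get("role") == "assistant":
        messages = [{**messages[0], "role": "system"}] + messages[1:]
    return messages
```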
Links mentioned:
- no title found: no description found
- OpenRouter: Build model-agnostic AI apps
CUDA MODE ▷ #triton (10 messages🔥):
- Benchmark Script Enhanced: @hdcharles_74684 improved a benchmark script for comparing Triton kernel performance, which could be beneficial for int8 weight-only linear kernels potentially outperforming cuBLAS for batch sizes greater than 1, impacting sdxl-fast. The script is available on GitHub and contains various kernels, including a fast kernel for bs=1, int4 tinygemm, and a uint4x2 triton kernel.
- PR to cuda-mode/lectures Suggested: @marksaroufim suggested @hdcharles_74684 make a pull request to the cuda-mode lectures repository on GitHub to make the benchmark script easily accessible.
- Potential Triton Optimizations Discussed: @chhillee mentioned that Torch.compile could efficiently handle a batch size of 2, which could alleviate the main bottleneck in question.
- Tensor Performance Fixed on Radeon: @iron_bound reported a significant improvement in tensor performance on the Radeon RX 7900 XTX graphics card after fixing an issue with WMMA hooks in mlir/llvm.
- Debugging Issue with Triton Versions: @kierandidi encountered an issue with the Triton debugger in versions 3.0.0 and 2.2.0 regarding the interpret argument. @andreaskoepf and @marksaroufim confirmed that the method was deprecated and suggested setting the TRITON_INTERPRET environment variable as a workaround.
- Feedback on Triton's Stability: @andreaskoepf shared experiences of instabilities with Triton compared to CUDA, citing unexplained segfaults and inconsistent results. @marksaroufim requested an example to compare the situations before and after the segfaults, following similar feedback observed on Twitter.
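The workaround replaces the removed interpret= keyword with an environment variable, typically set before triton is imported so the flag is picked up:

```python
import os

# Enable Triton's interpreter mode for debugging kernels on CPU; set this
# before `import triton` runs anywhere in the process.
os.environ["TRITON_INTERPRET"] = "1"

# import triton  # imported afterwards in a real debugging session
```

Setting `TRITON_INTERPRET=1` on the shell command line before launching the script achieves the same thing.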
Links mentioned:
- GitHub - cuda-mode/lectures: Material for cuda-mode lectures: Material for cuda-mode lectures. Contribute to cuda-mode/lectures development by creating an account on GitHub.
- script for comparing performance of several linear triton kernels across several shapes: script for comparing performance of several linear triton kernels across several shapes - linear_triton_kernels.py
CUDA MODE ▷ #cuda (6 messages):
- Inquiry about GPU Intrinsics: User @drexalt asked if a claim made in a tweet was true, seeking clarification from fellow CUDA MODE Discord members.
- Response to FP8 Intrinsics Query: @zippika clarified that the claim in question was false and provided a link to the CUDA math API docs that still list FP8 intrinsics.
- Clarifying the Purpose of FP8: @zippika underlined that FP8 serves mainly as a data format rather than being extensively used for computations.
Links mentioned:
CUDA Math API :: CUDA Toolkit Documentation: no description found
CUDA MODE ▷ #torch (13 messages🔥):
- No Appetite for Polyhedral: @chhillee expresses skepticism about the utility of polyhedral compilation in optimizing sharding for deep learning, suggesting that the key question is defining the cost function.
- Search Space Skepticism: In a discussion with @andreaskoepf, @chhillee likens the challenge of finding optimal shardings in deep learning to the ongoing developments in new ML architectures.
- Contemplating Optimal Mappings: @gogators. muses that the space of valid mappings from deep learning programs to hardware may be smaller and less complex than the space of all possible deep learning programs.
- DL Program Optimization Not So Trivial: @gogators. backtracks from describing the process of finding efficient mappings of deep learning computations as "trivial," while expressing surprise if top AI institutions aren't already investigating this area.
- Debating Deep Learning Computability: @telepath8401 humorously challenges @gogators.'s initial use of "trivial," prompting a clarification about the feasibility of optimizing operation mappings given homogeneity and explicit dependencies in deep learning operators.
CUDA MODE ▷ #ring-attention (15 messages🔥):
- New Ring Attention Implementations: @andreaskoepf shared lucidrains' implementation of Ring Attention with custom Triton kernels and proposed to compare its correctness and performance with another implementation by zhuzilin.
- Backward Pass Bug Hunt: @andreaskoepf mentioned that Phil pointed out an issue with the backward pass, which might need fixing, as discussed in this GitHub issue.
- GPU Compatibility Troubles: @nthanhtam. and @jamesmel reported problems when running the Ring Attention implementation on GPUs, while @ericauld noted the assertion script works on CPU.
- Code Inconsistencies and Errors: @ericauld observed multiple errors in the code when trying to run it with Melvin's suggestions, such as typos and missing imports, which led to additional Triton-related issues.
- Commit History Suggests Problems: @iron_bound hinted that something might have broken in lucidrains' Ring Attention implementation by referring to the commit history on GitHub.
Links mentioned:
- GitHub - lucidrains/ring-attention-pytorch: Explorations into Ring Attention, from Liu et al. at Berkeley AI: Explorations into Ring Attention, from Liu et al. at Berkeley AI - lucidrains/ring-attention-pytorch
- Commits · lucidrains/ring-attention-pytorch: Explorations into Ring Attention, from Liu et al. at Berkeley AI - Commits · lucidrains/ring-attention-pytorch
- A ring attention with flash attention kernel implementation · Issue #4 · lucidrains/ring-attention-pytorch: Hi! Thank you for your work on implementing the ring attention in pytorch! I've just tried to implement a ring_flash_attn_qkvpacked_func (corresponding to flash_attn_qkvpacked_func in flash attent…
- Compare ring-flash-attention & ring-attention-pytorch · Issue #11 · cuda-mode/ring-attention: lucidrains & zhuzilin were hard working the last days and have completed the following two ring-attention implementations: lucidrains/ring-attention-pytorch zhuzilin/ring-flash-attention Create a …
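The implementations being compared all rest on the same piece of math: attention can be accumulated one K/V block at a time with an online softmax, so each device in the ring only ever holds one block while the blocks rotate past it. A single-process numpy sketch of just that accumulation (the math only, none of the Triton kernels or device communication from the repos above):

```python
import numpy as np

def ring_attention_single_host(q, k, v, n_chunks=4):
    """Simulate ring attention's per-step math on one host: K/V are split
    into chunks (as if held by different devices) and attention is
    accumulated with a numerically stable online softmax, without ever
    materializing the full score matrix."""
    d = q.shape[-1]
    scale = 1.0 / np.sqrt(d)
    m = np.full(q.shape[0], -np.inf)   # running max of scores per query
    l = np.zeros(q.shape[0])           # running softmax denominator
    acc = np.zeros_like(q)             # running weighted sum of V
    for k_blk, v_blk in zip(np.array_split(k, n_chunks),
                            np.array_split(v, n_chunks)):
        s = (q @ k_blk.T) * scale                  # scores vs. this chunk
        m_new = np.maximum(m, s.max(axis=-1))
        p = np.exp(s - m_new[:, None])
        corr = np.exp(m - m_new)                   # rescale earlier partials
        l = l * corr + p.sum(axis=-1)
        acc = acc * corr[:, None] + p @ v_blk
        m = m_new
    return acc / l[:, None]
```

Because the running max and denominator are rescaled at each step, the chunked result matches full attention up to floating-point error — which is the property any correctness comparison between two implementations ultimately has to verify.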
Interconnects (Nathan Lambert) ▷ #news (10 messages🔥):
- Arthur Mensch Sets the Record Straight: @arthurmensch clarified misconceptions about their recent announcements, reiterating the commitment to open-weight models with 1.5k H100s, a reselling agreement with Microsoft, and maintaining independence as a European company with global ambitions. He highlighted the growing interest in Le Chat and Mistral Large on La Plateforme and Azure, with a plan to iterate quickly. Check out the clarifications.
- Nathan Endorses Public Clarifications: After the tweet from @arthurmensch, @natolambert expressed approval, describing the act of providing such public clarifications on social media as "def legit vibes".
- Announcing StarCoder2 and The Stack v2: @BigCodeProject launched StarCoder2, a model trained with a 16k token context and repository-level information on 4T+ tokens, built upon The Stack v2, which contains over 900B tokens. The code, data, and models are fully open and available, marking a significant contribution to the community. Discover StarCoder2.
- Meta Prepares to Launch Llama 3: A tweet from @Reuters reported that Meta plans to release a new AI language model dubbed Llama 3 in July, which could set up another major competition in the AI field. The details were reported by The Information. Read more from Reuters.
- Gemini 1.5 Pro with Extended Context Coming to Nathan: @natolambert announced excitement about getting access to Gemini 1.5 Pro with a 1 million token context, planning to use it for processing podcasts and other content, and mentioned a potential article workshop based on the experience, if there's interest.
Links mentioned:
- Tweet from BigCode (@BigCodeProject): Introducing: StarCoder2 and The Stack v2 ⭐️ StarCoder2 is trained with a 16k token context and repo-level information for 4T+ tokens. All built on The Stack v2 - the largest code dataset with 900B+ t…
- Tweet from Arthur Mensch (@arthurmensch): Clarifying a couple of things since we're reading creative interpretations of our latest announcements: - We're still committed to leading open-weight models! We ask for a little patience, 1.5k H100s …
- Tweet from Reuters (@Reuters): Meta plans launch of new AI language model Llama 3 in July, The Information reports http://reut.rs/3TgBgFJ
Interconnects (Nathan Lambert) ▷ #random (30 messages🔥):
- Nathan Lambert Tunes into Demis Hassabis: @natolambert shared a podcast episode with Demis Hassabis, CEO of Google DeepMind, discussing superhuman AI scaling, AlphaZero atop LLMs, and AI governance. The podcast can be watched on YouTube or listened to on platforms like Apple Podcasts and Spotify.
- Considering Openness in AI Discussions: @natolambert and @mike.lambert discussed the merits of having open conversations about completely open AI, and how the mental models involved differ from those in conversations on platforms like Twitter.
- Name Coincidence Among Users: @xeophon. inquired whether @natolambert and @mike.lambert were related, given their similar last names; it was confirmed to be a coincidence.
- Anthropic Association Confirmation: @mike.lambert confirmed employment at Anthropic and took a stance on sharing information in the chat, indicating a preference to engage in discussions as themselves, not as a representative of their employer.
- The Quest for the LAMB Emoji: @natolambert humorously lamented the lack of an appropriate emoji for "LAMB," expressing frustration that the search results pointed to a steak emoji 🥩.
Links mentioned:
Demis Hassabis - Scaling, Superhuman AIs, AlphaZero atop LLMs, Rogue Nations Threat: "scaling is an artform"
LLM Perf Enthusiasts AI ▷ #gpt4 (2 messages):
- Inquiry About Benchmark Automation: @ampdot asked if a benchmark is available as an automated script, showing interest in trying out such a tool.
- Enthusiasm for Benchmark Automation: @dare.ai also expressed interest in the automated benchmark script and is looking forward to trying it out, tagging <@757392677280022549> for a potential response.
LLM Perf Enthusiasts AI ▷ #opensource (4 messages):
- Anticipated Spring Launch for Llama 3: @res6969 expected Llama 3 to be released in spring, suggesting that the current timeline is later than anticipated.
- Possible Last-Minute Improvements for Llama 3: @potrock expressed hope that the delay of Llama 3 might be due to a last-minute attention update, hinting at improvements that could be included in the release.
- Enthusiasm for Gemini Ring Attention: @potrock mentioned that incorporating Gemini-style ring attention would be a cool feature for Llama 3, indicating interest in this specific attention mechanism.
LLM Perf Enthusiasts AI ▷ #offtopic (1 messages):
- Time Crunch for LLM Testing: @jeffreyw128 expressed a desire to test new LLMs but emphasized the significant effort required to "get a good vibe check on each" given time constraints.
LLM Perf Enthusiasts AI ▷ #openai (3 messages):
- ChatGPT Search Update Rumors: @jeffreyw128 mentioned rumors that OpenAI might be updating their web search in ChatGPT this week, seeking confirmation from others.
- In Search of OpenAI Insights: @res6969 acknowledged not having heard such rumors and expressed a need to find better sources for OpenAI-related information.
- Looking for Code Interpreter Production Resources: @res6969 inquired if anyone had resources on using Code Interpreter in production environments, indicating an interest in practical applications.
DiscoResearch ▷ #general (6 messages):
- DiscoLM Template Clarification: @bjoernp pointed out the importance of using the DiscoLM template for chat context tokenization, referencing the Hugging Face documentation on chat templating.
- Issues with llamaindex Chunker for Code: @sebastian.bodza reported that the llamaindex chunker for code was significantly malfunctioning, producing one-liners and disregarding the chunk_lines option.
- Sanity Check on Training German RAG Models: @johannhartmann is creating a German dataset for Retrieval-Augmented Generation (RAG) tasks, utilizing Deutsche Telekom's Wikipedia content-question pairs, and sought feedback on the approach to improve the reliability of German-speaking Mistral 7B models.
- Goliath versus DiscoLM for German Language Tasks: @philipmay asked whether Goliath is the superior model for German language skills and shared a link to its model card on Hugging Face. The discussion evolved with @johannhartmann suggesting that DiscoResearch/DiscoLM-120b might perform better due to its training on German content.
- Advice on Generating Negative Samples for Datasets: @philipmay suggested a successful method for generating negative samples: directing a language model to alter given answers so they become factually incorrect, for the purpose of building a more effective dataset for RAG training.
Links mentioned:
- alpindale/goliath-120b · Hugging Face: no description found
- Templates for Chat Models: no description found
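On @bjoernp's templating point: DiscoLM models expect a ChatML-style prompt, and the safe route is the tokenizer's own apply_chat_template method from the Hugging Face docs linked above. As a sketch of roughly what that call renders (the exact template string is an assumption here; always read it off the model's tokenizer config rather than hand-rolling it):

```python
def apply_chatml_template(messages, add_generation_prompt=True):
    """Render a message list into the ChatML-style format used by
    DiscoLM-family chat models. Illustrative only: in practice use
    tokenizer.apply_chat_template so special tokens match the training data."""
    out = ""
    for msg in messages:
        out += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
    if add_generation_prompt:
        out += "<|im_start|>assistant\n"  # cue the model to answer next
    return out
```

Hand-concatenating "User: … Assistant: …" instead of matching the trained template is exactly the tokenization mismatch the channel is warning about.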
DiscoResearch ▷ #discolm_german (1 messages):
- German Prompts in EQ-Bench: @crispstrobe shared that EQ-Bench now supports German prompts, showing strong correlation with benchmarks like MMLU and Arena Elo. The GitHub pull request is linked here.
- GPT-4 Leads in Performance: According to a comparison shared by @crispstrobe, GPT-4-1106-preview scored 81.91 on the EQ-Bench German prompts, outperforming other models including GPT-3.5, various Mistral versions, and discolm-german-laser.
- Evaluating German Language Models: The message lists EQ-Bench scores for different models, noting that even a model like german-assistant-v7 scores 35.48, which could serve as a baseline for German language model performance.
- Translation Scripts Included: @crispstrobe also mentioned including translation scripts with the benchmarks, stating that these were set up quickly and could be improved further, for example by manual review by a student.
- Automatic Translation with GPT-4: The German prompts were automatically translated using ChatGPT-4-turbo, showing that sophisticated models can facilitate translation of test or training sets, a process that can be adapted to other translation services like the free Gemini.
Links mentioned:
Build software better, together: GitHub is where people build software. More than 100 million people use GitHub to discover, fork, and contribute to over 420 million projects.
Datasette - LLM (@SimonW) ▷ #ai (4 messages):
- Struggle Against Verbose JSON Responses: @dbreunig mentioned the frequent need to clean up noisy JSON responses but did not elaborate on the specific methods or functions used.
- Tackling Claude's Introductory Phrases: @justinpinkney shared a tip on avoiding intro sentences like "Sure, here's a…" from Claude by controlling the response's initial characters, referencing Anthropic's documentation. They suggested starting with <rewrite> or forcing the response to begin with {.
- Claude's Tenacious Explanations: @derekpwillis acknowledged trying various methods to make Claude deliver less verbose output, such as forcing the AI to start with {, yet Claude persists in providing explanations before the actual content.
Links mentioned:
Ask Claude for rewrites: If Claude gives a response that is close to, but not quite, what you're looking for, you can ask Claude to rewrite it. In Slack this can be as simple as telling Claude to "Try again" aft…
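When the prompt-side tricks above still leave prose wrapped around the payload, a response-side fallback is to scan for the first balanced JSON object and parse only that. A minimal sketch (a hypothetical helper, not part of Datasette or the LLM tool):

```python
import json

def extract_first_json(text: str):
    """Scan for the first balanced {...} span in a chatty LLM response and
    parse it; return None if no valid JSON object is found. Tracks string
    literals so braces inside quoted values don't confuse the depth count."""
    start = text.find("{")
    while start != -1:
        depth, in_str, esc = 0, False, False
        for i in range(start, len(text)):
            c = text[i]
            if esc:
                esc = False
            elif c == "\\":
                esc = True
            elif c == '"':
                in_str = not in_str
            elif not in_str:
                if c == "{":
                    depth += 1
                elif c == "}":
                    depth -= 1
                    if depth == 0:
                        try:
                            return json.loads(text[start:i + 1])
                        except json.JSONDecodeError:
                            break  # balanced but not valid; try the next "{"
        start = text.find("{", start + 1)
    return None
```

Pairing this with the prefill trick (forcing the response to begin with {) covers both ends: the model is nudged toward bare JSON, and any leftover explanation is stripped on receipt.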
Skunkworks AI ▷ #off-topic (1 messages):
pradeep1148: https://www.youtube.com/watch?v=ikIgy0qlif8&feature=youtu.be
Skunkworks AI ▷ #general (1 messages):
- Recruitment Inquiry in DMs: User .papahh reached out to @1117586410774470818 via direct message, hinting at a potential job opportunity and expressing interest in the recipient's participation.
Alignment Lab AI ▷ #looking-for-collabs (1 messages):
- Exploring the Roots of Cross-Species Values: @taodoggy is seeking collaborators for a project aiming to understand the biological and evolutionary origins of values shared across species, refine the definition of values, and analyze how these values are expressed in various cultures. They provided a brief overview via a Google Docs link.
Links mentioned:
Uncovering the Origins of Values: A Biology and Cognition-Based Approach for AI Alignment: no description found
AI Engineer Foundation ▷ #general (1 messages):
- AI Engineer Recruitment Advice Sought: @peterg0093 is looking to start recruiting AI engineers in the UK and requests examples of good job descriptions so as not to deviate from standard language in the field. He encourages users to reach out if they have useful references or resources.