- Altman said his top priority right now is launching the new model, likely to be called GPT-5.
- Surprisingly, Altman admitted that he “isn't sure on the exact status” of Sutskever's employment.
- Separately, Itamar from Codium coined the term "Flow Engineering" with AlphaCodium, which was picked up by Karpathy.
---
Table of Contents
[TOC]
TheBloke Discord Summary
- **Swiss Army AI Dreamed Up**: Engineers on the server discussed crafting a multi-specialty MoE model that combines seven distinct 7-billion-parameter models, each specializing in areas such as law, finance, and medicine, in response to @cos2722's proposal.
- **8-Bit Fine-Tuning Debate**: @netrve and @that_one_short_guy deliberated over the necessity of 8-bit optimizers for fine-tuning, with the latter suggesting ensuring bitsandbytes is installed with GPU support for optimal functioning.
- **Addressing Channel Spam Decisively**: An abusive spamming user was promptly banned from the community, reflecting swift moderation action.
- **Model Merging Dialogues**: The discussion revolved around merging models that share an architecture, with practical advice provided, such as using Mergekit or ensuring models follow the Alpaca format for broader compatibility.
- **AMD Optimization for AI Models**: There is interest in testing the performance of AI models on AMD systems, specifically using the AMD AOCL BLAS and LAPACK libraries with `llama.cpp`, to improve efficiency via AVX512 registers.
TheBloke Channel Summaries
▷ #general (1151 messages🔥🔥🔥):
- **AI for Command Line Tasks**: @stoop poops shared an experiment in which they gave an AI access to non-interactive bash shell commands to observe its actions. The AI, referred to as "tewi", was able to `cat /etc/shadow`, `nmap` the router, and even use `ssh-keygen`.
- **Mixtral's Coding Capabilities**: @rombodawg has been refining prompts for a Mixtral MoE model with the goal of making it surpass a 33B model in human evaluation, aiming for 13B-parameter speed with better coding ability than GPT-3.5.
- **LLM for Autonomous Tasks**: @selea asked whether anyone has successfully used a coding AI for tasks like writing website parsers or scripting game mob behaviors without human supervision, hinting at the possibility with enough examples and fine-tuning.
- **Embedded Systems and Vector Stores in LLMs**: @iukea discussed the potential of AI enhanced by vector stores and embeddings, comparing GPT-4's depth of knowledge to other models and the implications of using big models for practical applications.
- **Performance Comparisons Among LLMs**: Various users, including @giftedgummybee, @iukea, and @natepdx, compared LLMs like GPT-3.5, Mixtral, Gemini Pro, and GPT-4, discussing their strengths in depth of knowledge, problem-solving ability, response quality, and speed, especially for code-related tasks.
Links mentioned:
- Squidward Spongebob GIF - Squidward Spongebob Head Bang - Discover & Share GIFs: Click to view the GIF
- Release Smooth Sampling Test Build (koboldcpp) · kalomaze/koboldcpp: Dynamic Temperature sampling is a unique concept, but it always peeved me that: We basically are forced to use truncation strategies like Min P or Top K, as a dynamically chosen temperature by its…
- TheBloke/Nous-Hermes-2-Mixtral-8x7B-DPO-GGUF · Hugging Face: no description found
- RamAnanth1/lex-fridman-podcasts · Datasets at Hugging Face: no description found
- How vector search and semantic ranking improve your GPT prompts: Improve the information retrieval process, so you have the most optimal set of grounding data needed to generate useful AI responses. See how Azure Cognitive…
- GitHub - SteveJustin1963/tec-iDADmm: tec1 MINT running a digital to analog to digital repeating loop to speed calculations, e.g. matrix multiplication
- Releases · kalomaze/koboldcpp: A simple one-file way to run various GGML models with KoboldAI's UI - kalomaze/koboldcpp
- GitHub - itsme2417/PolyMind: A multimodal, function calling powered LLM webui.
- sade-adrien/redpajama_v2_sample_100M · Datasets at Hugging Face: no description found
▷ #characters-roleplay-stories (425 messages🔥🔥🔥):
- **Centralized Settings for LLMs**: @firepin123 proposed creating a centralized platform for open-source frontends' settings, similar to Hugging Face, with a voting system to streamline LLM use by standardizing settings, improving user experience, and aiding debugging and benchmarking.
- **Discussion on Fine-Tuning Techniques**: @c.gato and others discussed DPO and fine-tuning techniques for LLMs, especially regarding @c.gato's model, Thespis-13b. @jondurbin recommended using RMSprop instead of Adam for DPO and watching for signs of overly aggressive learning rates.
- **RP Character Cards by Model Creators**: @stoop poops and @c.gato discussed the potential benefits of model creators including default character cards, with the former expressing a preference for "normal-ish" content, excluding ERP cards due to content sensitivity.
- **Exploring LLMs for Roleplay**: @netrve shared positive experiences using Doctor's Nous-Capybara LimaRP based on Yi-32B, expressed curiosity about applying DPO to it, and lamented the high cost of fine-tuning models like WinterGoddess.
- **Settings Importance and Documentation**: Several users, including @theyallchoppable, @doctorshotgun, and @keyboardking, discussed the importance of correct settings for optimal model performance and the need for better documentation and community-driven recommendations.
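The RMSprop-over-Adam suggestion can be illustrated with minimal single-parameter update rules. This is a generic sketch of the two optimizers, not code from any project mentioned above:

```python
# Minimal, illustrative single-parameter update steps for RMSprop and Adam.
# Generic textbook forms only; hyperparameter defaults are arbitrary examples.

def rmsprop_step(param, grad, state, lr=1e-5, alpha=0.99, eps=1e-8):
    """One RMSprop update: scale the gradient by a running RMS of past gradients."""
    state["sq_avg"] = alpha * state["sq_avg"] + (1 - alpha) * grad ** 2
    return param - lr * grad / (state["sq_avg"] ** 0.5 + eps)

def adam_step(param, grad, state, lr=1e-5, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: RMS scaling plus first-moment momentum and bias correction."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return param - lr * m_hat / (v_hat ** 0.5 + eps)

p_rms = rmsprop_step(1.0, 0.5, {"sq_avg": 0.0})
p_adam = adam_step(1.0, 0.5, {"t": 0, "m": 0.0, "v": 0.0})
```

RMSprop carries no first-moment momentum term, which is one intuition for why it can behave less aggressively than Adam early in preference-tuning runs.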
Links mentioned:
- Doubt Press X GIF - Doubt Press X La Noire - Discover & Share GIFs: Click to view the GIF
- Kquant03/FrankenDPO-4x7B-GGUF · Hugging Face: no description found
- Kquant03/Prokaryote-8x7B-bf16 · Hugging Face: no description found
- Robert Downey GIF - Robert Downey Jr - Discover & Share GIFs: Click to view the GIF
- Reddit - Dive into anything: no description found
- cloudyu/Mixtral_34Bx2_MoE_60B · Hugging Face: no description found
- moreh/MoMo-70B-LoRA-V1.4 · Hugging Face: no description found
- Ayumi Benchmark ERPv4 Chat Logs: no description found
- bagel/bagel/tune/dpo.py at main · jondurbin/bagel: A bagel, with everything. Contribute to jondurbin/bagel development by creating an account on GitHub.
- c-gatomon: Weights & Biases, developer tools for machine learning
- c-gatomon: Weights & Biases, developer tools for machine learning
- medmcqa · Datasets at Hugging Face: no description found
- GBaker/MedQA-USMLE-4-options · Datasets at Hugging Face: no description found
- dataset (dataset): no description found
- GitHub - kbressem/medAlpaca: LLM finetuned for medical question answering. Contribute to kbressem/medAlpaca development by creating an account on GitHub.
▷ #training-and-fine-tuning (29 messages🔥):
- **The Vision of a Super-Swiss-AI-Knife**: @cos2722 proposed creating a multi-specialty MoE model that acts like a Swiss Army knife by combining the best 7-billion-parameter specialized models. The model would address varied complex requests by incorporating models such as DeepSeek7b, Open Chat 0106, Medicine Chat, Finance Chat, Law Chat, and three others of choice.
- **8-Bit Optimizers for Fine-Tuning**: @netrve received a warning about bitsandbytes being compiled without GPU support and sought clarification on the importance of 8-bit support for fine-tuning. @that_one_short_guy clarified that 8-bit is rarely used for fine-tuning and recommended installing bitsandbytes with GPU support.
- **Quick Ban Hammer Strikes**: @mrdragonfox swiftly banned an abusive user, as confirmed by @netrve, who noticed the user had spammed other channels as well.
- **The Fine-Tuning Dilemmas of Medically Minded MLX**: @cogbuji shared challenges with instruction fine-tuning on a medical dataset using MLX, which produced nonsensical outputs, and contemplated switching from supervised instruction fine-tuning to a self-supervised approach.
- **Bagel Model Training, Not So Delicious**: @jondurbin shared a loss chart of their Bagel-1.1B training, where a drop in evaluation loss was not mirrored by performance; judging the model "completely braindead", they advised against using TinyLlama. @sanjiwatsuki compared an experiment with a TinyMistral model that exhibited a higher loss.
Links mentioned:
jondurbin: Weights & Biases, developer tools for machine learning
▷ #model-merging (10 messages🔥):
- **Seeking Model Merging Tools**: @givan_002 asked for scripts or resources to merge a 13B model fine-tuned on open-source role-play datasets with other 13B models.
- **Advice on Merging Models with Same Architecture**: @kquant advised that a 13B model can generally be merged with any model sharing the same architecture, such as merging Mistral with Mistral and Llama with Llama.
- **Ensuring Compatible Format for Merging**: @kquant also mentioned the importance of ensuring that the models being merged follow the same format.
- **Mergekit As a Solution for Merging Models**: @sao10k suggested using Mergekit for model-merging needs.
- **Alpaca Format for Broad Compatibility**: @sao10k explained that the Alpaca format is a "safe universal format" and highlighted its popularity for merging 13B models, even when a model hasn't been trained in Alpaca format.
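Mergekit drives merges from a YAML config. A minimal SLERP sketch in the shape of the mergekit examples of this era follows; the model names are placeholders, and the exact schema may differ across mergekit versions, so check its README:

```yaml
# Illustrative mergekit config: SLERP-merge two 13B models that share the
# Llama architecture. Model names are placeholders, not real repositories.
slices:
  - sources:
      - model: some-org/roleplay-13b      # placeholder fine-tune
        layer_range: [0, 40]
      - model: another-org/base-13b       # placeholder merge partner
        layer_range: [0, 40]
merge_method: slerp
base_model: another-org/base-13b
parameters:
  t: 0.5          # interpolation factor between the two checkpoints
dtype: float16
# Run with: mergekit-yaml config.yml ./merged-13b
```

As @kquant noted above, this only works when the two checkpoints share an architecture and layer count.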
▷ #coding (2 messages):
- **AMD Enthusiasts Wanted**: @spottyluck is seeking individuals running models on GPU-less AMD systems with `llama.cpp` to test the AMD AOCL BLAS and LAPACK libraries, which could help leverage AVX512 registers and optimize performance.
- **In Search of Downloads**: @apcameron asked where to download the AMD AOCL BLAS and LAPACK libraries needed to conduct the tests @spottyluck mentioned.
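Before testing an AVX512-optimized BLAS, it is worth confirming the CPU actually exposes the feature. A small Linux-only sketch (it reads `/proc/cpuinfo` and simply returns False on other platforms):

```python
# Quick check for AVX512 foundation support before building an AVX512-tuned
# BLAS. Linux-only sketch: inspects CPU flags in /proc/cpuinfo.
from pathlib import Path

def has_avx512f() -> bool:
    """True if the kernel reports the avx512f CPU flag (Linux only)."""
    cpuinfo = Path("/proc/cpuinfo")
    if not cpuinfo.exists():
        return False  # non-Linux: would need platform-specific detection
    return "avx512f" in cpuinfo.read_text()

print(has_avx512f())
```

On Zen 4 and newer AMD parts this should report True; older CPUs lack AVX512 entirely, in which case an AOCL build would fall back to narrower vector paths.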
Nous Research AI Discord Summary
- **GPT-5 Rumors & Realism**: A tweet by Sully Omarr sparked a discussion of GPT-5, with predictions of its impact and skepticism about the novelty of multimodality. Users also debated the financial sustainability of SaaS startups running on venture capital with no subscription fees, citing a tweet questioning the business model.
- **Code Generation Innovations & AI Model Distribution**: The introduction of AlphaCodium generated buzz: an open-source code-generation tool said to surpass most human competitors in code contests, with its method and GitHub repository shared. Torrents were discussed as a potential model for AI model distribution, suggesting a decentralized dissemination method.
- **Fine-Tuning Techniques and Self-Rewarding Models**: New fine-tuning techniques like SymNoise were highlighted for their ability to improve LLM performance, along with research on models that generate their own rewards, potentially leading to superhuman agents and suggesting a self-sustaining future for AI training.
- **Meta and the AI Space**: Conversations about Meta's LLaMA 3 and comparisons to GPT-4 reflected anticipation of AGI advancements and strategies, including the use of GPUs and a nod to Zuckerberg's commitment to open source. The discussion touched on the acquisition of hardware resources and potential impacts on model-training capacity.
- **The Squircle Challenge & AI Aspirations**: A math-related call to action took place around constructing a squircle from bezier segments, with a Figma blog post detailing the intrigue. Additionally, a personal growth story shared in a tweet from Teknium served as inspiration for AI newcomers seeking to grow their expertise in the field.
Nous Research AI Channel Summaries
▷ #off-topic (22 messages🔥):
- **GPT-5 Anticipation Buzz**: @teknium shared a tweet suggesting GPT-5 is OpenAI's next big launch, while @max_paperclips predicted a cycle of initial hype followed by performance nerfs.
- **Skepticism About Multimodality Hype**: @teknium and @max_paperclips conveyed disinterest in the multimodality aspects thought to be central to the upcoming GPT-5, with @teknium calling it "meh" and @giftedgummybee hinting at expecting impressive performance given the available compute resources.
- **VC-Funded SaaS Startup Costs Query**: @0xevil shared a tweet questioning the sustainability of a SaaS offering with no subscription fees, leading @gabriel_syme to comment that the goal is creating "great gateways," not necessarily products.
- **Proposing Torrents as a Distribution Model for AI Models**: @everyoneisgross highlighted the potential of torrents, as exemplified by Mistral, for distributing models, data, and instructions for machine-learning applications.
- **Frustration Over Misunderstandings of Model Fine-Tuning**: In response to a tweet shared by @youngphlo claiming that fine-tuning cannot add new knowledge to LLMs, @teknium showed clear frustration, asserting that fine-tuning does indeed add knowledge, a reaction @youngphlo found justified.
Links mentioned:
- Tweet from Shahul Es (@Shahules786): The RAG vs finetuning work from Microsoft assumes that finetuning can infuse new factual/domain-specific knowledge into LLMs which is not true. Finetuning is not an alternative to RAG. As of now, onl…
- Plink Cat GIF - Plink cat Plink Cat - Discover & Share GIFs: Click to view the GIF
- Tweet from Kaizhao Liang (@KyleLiang5): @abacaj Buy 10K of those and start your llm saas company with zero server cost. since there is no subscription, how are they not going bankrupt soon?
- Latest AI Stuff Jan 18/2024: Latest developments on AI https://deepmind.google/discover/blog/alphageometry-an-olympiad-level-ai-system-for-geometry/ https://www.reddit.com/r/LocalLLaMA/com…
- Tweet from Sully (@SullyOmarr): Ok so it's somewhat confirmed: "Altman said his top priority is launching the new model, likely to be called gpt5" Expect to see an exponential leap in model capabilities with OpenAI's newest model
▷ #interesting-links (13 messages🔥):
- **AGI depicted as "Samantha" from "Her"**: @burnytech shared a tweet by @Schindler___ proposing an AGI architecture modeled after Samantha from the movie "Her," capable of dynamic speech, evolving personality traits, and external memory interaction.
- **Exploring AlphaCodium's Capabilities**: @metaldragon01 highlighted the introduction of AlphaCodium, an open-source code-generation tool said to surpass most human competitors in code contests, and @teknium asked whether it functions as a general coding model or as an applied layer over an existing model.
- **GitHub Project AlphaCodium Revealed**: @adjectiveallison found AlphaCodium on GitHub, a method that improves the accuracy of LLM code generation through a multi-stage, test-based iterative process, sparking a discussion on the use of iterative approaches in real-world applications.
- **Revolutionizing LLM Fine-tuning with SymNoise**: @euclaise and @teknium discussed a new fine-tuning technique involving symmetric noise in the embedding process, reportedly outperforming prior methods across various models and datasets.
- **Self-Rewarding Language Models for Superhuman Agents**: @metaldragon01 found research on Self-Rewarding Language Models, in which models generate their own rewards, improving instruction following and providing high-quality self-assessments during training.
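The general idea behind embedding-noise fine-tuning can be shown with a toy sketch. This illustrates zero-mean, symmetric perturbation of token embeddings in the spirit of the methods discussed; the paper's exact SymNoise construction differs, so treat this as a generic illustration only:

```python
# Toy illustration of noise-augmented embedding fine-tuning: add a bounded,
# symmetric (zero-mean) perturbation to each embedding component during
# training. NOT the paper's exact SymNoise method, just the family of idea.
import random

def symmetric_noise(dim: int, scale: float = 0.01) -> list[float]:
    """Draw noise from a symmetric uniform distribution on [-scale, scale]."""
    return [random.uniform(-scale, scale) for _ in range(dim)]

def perturb_embedding(embedding: list[float], scale: float = 0.01) -> list[float]:
    """Return the embedding with per-component symmetric noise added."""
    noise = symmetric_noise(len(embedding), scale)
    return [e + n for e, n in zip(embedding, noise)]

emb = [0.2, -0.5, 0.9]
noisy = perturb_embedding(emb)
# Each component moves by at most `scale` from the original value.
```

The regularizing intuition is that the model must produce the same target text from slightly jittered input embeddings, discouraging brittle memorization.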
Links mentioned:
- SymNoise: Advancing Language Model Fine-tuning with Symmetric Noise: In this paper, we introduce a novel fine-tuning technique for language models, which involves incorporating symmetric noise into the embedding process. This method aims to enhance the model's func…
- Self-Rewarding Language Models: We posit that to achieve superhuman agents, future models require superhuman feedback in order to provide an adequate training signal. Current approaches commonly train reward models from human prefer…
- Introducing ASPIRE for selective prediction in LLMs - Google Research Blog: no description found
- GitHub - Codium-ai/AlphaCodium: Contribute to Codium-ai/AlphaCodium development by creating an account on GitHub.
- Tweet from Itamar Friedman (@itamar_mar): Introducing AlphaCodium - A first-of-its-kind open-source code generation tool that surpasses most human competitors in code contests. Inspired by DeepMind's AlphaCode, but beats it (j…
- Tweet from Schindler (@Schindler___): (1/2) Proposition of an architecture for AGI. Samantha from the movie Her is here: An autonomous AI for conversations capable of freely thinking and speaking, continuously learning and evolving. Creat…
▷ #general (338 messages🔥🔥):
- **Social Media Bot Skepticism**: @gabriel_syme and others discussed concerns about the utility of AI in social media, suggesting that while it may work for low-quality text output, it isn't the best application of AI and the use cases lack imagination. Users joked that Twitter botting is the only valid use case.
- **AI Agents' Future Discussed**: The conversation switched to potential uses for agentic AI, including predictions of future customer-service systems (@leontello). @.benxh added that less human involvement in social-media management could benefit humanity overall, while expressing reservations about the effects on marketing professionals.
- **Envisioning Code-Oriented AI Models**: Chat participants discussed hopes for agentic AI in code testing and development (@_3sphere), along with possible integration with multimodal models. They also noted the problem of certain models stopping mid-code, highlighting a need for longer token sequences or step-by-step processing (@teknium).
- **Alignment Algorithms Compared**: @osanseviero shared links to articles comparing the DPO, IPO, and KTO alignment algorithms, concluding that DPO appears to be the best option overall while acknowledging that KTO is easier to scale thanks to its simpler data needs. Users discussed the correlation between various evaluations, with benchmarks and Elo scores mentioned.
- **Meta's LLaMA 3 and the Race for AGI**: Meta's training of LLaMA 3 sparked a discussion of potential advancements and how it might compare to OpenAI's GPT-4. The conversation touched on the strategic use of resources like GPUs and the notable position of Meta's CEO as a proponent of open-source development (@gezegen, @_3sphere, @teknium).
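The "simpler data needs" point about KTO comes down to record shape: DPO requires paired preference data, while KTO only needs independent examples with a thumbs-up/down label. An illustrative sketch (field names are made up for this example, not any library's schema):

```python
# Why KTO scales more easily than DPO, data-wise. DPO training records pair a
# chosen and a rejected completion for the same prompt; KTO records are
# independent completions with a binary desirability label, which is much
# cheaper to collect. Record shapes are illustrative only.

dpo_example = {
    "prompt": "Explain MoE routing.",
    "chosen": "A router network selects a sparse subset of experts per token...",
    "rejected": "MoE means the model has many emotions.",
}

kto_examples = [
    {"prompt": "Explain MoE routing.",
     "completion": "A router network selects a sparse subset of experts per token...",
     "label": True},   # desirable
    {"prompt": "Explain MoE routing.",
     "completion": "MoE means the model has many emotions.",
     "label": False},  # undesirable; no pairing with the example above required
]
```

Because KTO labels are per-example, existing thumbs-up/down feedback logs can be reused directly, whereas DPO requires deliberately collecting head-to-head comparisons.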
Links mentioned:
- Are you smarter than an LLM?: no description found
- Tweet from Edward Beeching (@edwardbeeching): In our latest blog post, we summarize our extensive evaluation of three state of the art alignment algorithms. DPO vs IPO vs KTO. The results demonstrate a complex interaction between key hyper-parame…
- Reports Index: no description found
- Chat with Open Large Language Models: no description found
- LMSys Chatbot Arena Leaderboard - a Hugging Face Space by lmsys: no description found
- Tweet from Omar Sanseviero (@osanseviero): "the assumption is that they have diverse training amongst apart from each other" That's not really the definition of experts (MoEs should really be named routed sparse models or somethin…
- Standing Cat Amazed Cat GIF - Standing Cat Amazed Cat Hypnotized - Discover & Share GIFs: Click to view the GIF
- Tweet from OpenLLMLeaders (@OpenLLMLeaders): New model added to the leaderboard! Model Name https://hf.co/intervitens/internlm2-base-20b-llama Overall rank: 800 Rank in 13B category: 130 Benchmarks Average: 62.69 ARC: 62.97 HellaSwag: 82.15 M…
- Mixture of Experts Explained: no description found
- Tweet from AK (@_akhaliq): Meta presents Self-Rewarding Language Models paper page: https://huggingface.co/papers/2401.10020 Fine-tuning Llama 2 70B on three iterations of our approach yields a model that outperforms many exi…
- Mark Zuckerberg on Instagram: "Some updates on our AI efforts. Our long term vision is to build general intelligence, open source it responsibly, and make it widely available so everyone can benefit. We're bringing our two major AI research efforts (FAIR and GenAI) closer together to support this. We're currently training our next-gen model Llama 3, and we're building massive compute infrastructure to support our future roadmap, including 350k H100s by the end of this year - and overall almost 600k H100s equivalents of compute if you include other GPUs. Also really excited about our progress building new AI-centric computing devices like Ray Ban Meta smart glasses. Lots more to come soon.": 74K likes, 5,594 comments - zuck on January 18, 2024: "Some updates on our AI efforts. Our long term vision is to build general intelligence, open sourc…"
- Tweet from OpenLLMLeaders (@OpenLLMLeaders): New model added to the leaderboard! Model Name https://hf.co/chargoddard/internlm2-20b-llama Overall rank: 305 Rank in 13B category: 63 Benchmarks Average: 70.61 ARC: 64.68 HellaSwag: 83.16 MMLU: 6…
- Tweet from Alex Volkov (Thursd/AI) (@altryne): Just in case you don't want to click over to the other sites, Big Zuck update - Open sourcing will continue - Currently training LLama 3 - AI + Metaverse - Will have 350,000 H100s and ~600 H100 e…
- Tweet from Alim (@almmaasoglu): @Teknium1 @ylecun My only question is how did they acquired so many lol
- Tweet from Archit Sharma (@archit_sharma97): @Teknium1 @huggingface Oh implementation wise it is fine, I haven't seen a model improve meaningfully from just unpaired data. I'd love to see some experiments!
- Sparse Universal Transformer: The Universal Transformer (UT) is a variant of the Transformer that shares parameters across its layers. Empirical evidence shows that UTs have better compositional generalization than Vanilla Transfo…
- k-quants by ikawrakow · Pull Request #1684 · ggerganov/llama.cpp: What This PR adds a series of 2-6 bit quantization methods, along with quantization mixes, as proposed in #1240 and #1256. Scalar, AVX2, ARM_NEON, and CUDA implementations are provided. Why This is…
▷ #ask-about-llms (44 messages🔥):
- **OCR for Embeddings? A Brave New World**: @_3sphere expressed a novel concept, suggesting the ability to OCR an embedding and pondering when a "neural JPG" could become reality, considering embeddings as a form of codec.
- **Math Collaboration to Solve Geometrical Challenges**: @bernaferrari seeks a math enthusiast to tackle the problem of representing a squircle using bezier segments, as explained in a Figma blog post. They believe a proper mathematical representation could earn fame on Hacker News and improve the field, since current approaches lack elegance.
- **LLMs in Geometry Generation**: @gabriel_syme recalled past successes generating geometry with LLMs, noting potential for iterative generation had the models been better at the time. Meanwhile, @mr.userbox020 discussed the depth of geometry and the applicability of LLMs to mathematical problems, suggesting a simple 2D vector approach could suffice.
- **The Squircle Quest**: @mr.userbox020 skeptically addressed the use of LLMs for @bernaferrari's squircle problem, urging a more traditional mathematical path over complex LLMs, given that the problem involves irrational numbers and infinite precision.
- **A Journey from Novice to Pro in AI**: A shared tweet from @Teknium1 celebrated a remarkable one-year transformation, inspiring @quilalove to ask where to begin their own journey into the AI field and how to collaborate with others on AI technical knowledge and implementation.
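For context on the squircle problem: a squircle is commonly modeled as a superellipse, |x/a|^n + |y/b|^n = 1, with n = 4 giving the classic shape. Sampling points on the curve is straightforward; fitting bezier segments to those samples is the hard part discussed in the Figma post, which this sketch does not attempt:

```python
# Sample points on a superellipse ("squircle" for n = 4). The parametric form
# below satisfies |x/a|^n + |y/b|^n = 1 exactly for every t.
import math

def squircle_point(t: float, a: float = 1.0, b: float = 1.0, n: float = 4.0):
    """Point on the superellipse at parameter angle t (radians)."""
    c, s = math.cos(t), math.sin(t)
    x = math.copysign(abs(c) ** (2.0 / n), c) * a
    y = math.copysign(abs(s) ** (2.0 / n), s) * b
    return x, y

# 64 evenly spaced parameter values around the unit squircle.
pts = [squircle_point(2 * math.pi * k / 64) for k in range(64)]
```

The skepticism in the channel stems from the fit itself: a bezier segment is polynomial, while |cos t|^(2/n) is not, so any bezier representation is necessarily an approximation whose error must be bounded numerically.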
Links mentioned:
- Tweet from Teknium (e/λ) (@Teknium1): Happy New Years Everybody! 🥳 One year ago today, I had: - Never trained any model - Did not know the first thing about AI - Never worked in Tech - Had 8 followers on twitter? (probably) One year la…
- Desperately seeking squircles | Figma Blog: In a famous 1972 interview, Charles Eames answered a short sequence of fundamental questions about the nature of design.
- LoneStriker/Nous-Capybara-34B-8.0bpw-h8-exl2 at main: no description found
LM Studio Discord Summary
- **GPU Tango: VRAM and Resource Management in Focus**: Engineers in the guild discussed GPU offload settings in LM Studio, noting that setting GPU offload to -1 utilizes all layers but may still show low GPU utilization. Nvidia P40 GPUs were recommended as a cost-effective performance option, and concerns were raised about potential VRAM allocation conflicts when running AI models alongside intensive applications like gaming.
- **LM Studio Beta V4 Debuts**: Beta V4 (the 0.2.11 release candidate) of LM Studio has been released, featuring a model search page with VRAM fit estimates and support for new 2-bit quants. Download links were provided, and it was stated that plans for open-sourcing or adding a plugin system are in development, with assurance that LM Studio will remain free for personal use.
- **Dispatches from the Hardware Front**: Relevant hardware discussions included power-supply considerations for dual RTX 3090 setups, where a 1200W+ PSU was advised. Creative solutions for fitting large GPUs into small cases were exchanged, emphasizing the ingenuity of the engineers optimizing their AI computing rigs.
- **CrewAI: Framework and Performance Insights**: The CrewAI multi-agent framework and its integration with the LM Studio API were highlighted, including leveraging specific agents for dedicated tasks like internet search. Benchmarks for multiple models using CrewAI were promised, along with sample code once the user's work is completed.
- **Model Performance and Usage**: Local models were reported to be operational for repeated function calls but not as impressive as the 3.5T model. The Skyrim ChatGPT mod's image recognition was spotlighted as a parallel task competing with other processes for GPU resources. LM Studio installation issues and an unspecified model error on a laptop with 24 GB of RAM also emerged, with the latter redirected to technical-support channels for further assistance.
LM Studio Channel Summaries
▷ #💬-general (200 messages🔥🔥):
- **GPU Offload and VRAM Utilization**: @heyitsyorkie explained that setting GPU offload to -1 in LM Studio assigns all layers to the GPU, though users like @4b0d3 reported seeing low GPU utilization. @senecalouck shared that the ROCm beta can offer significant speed improvements for AMD cards.
- **Running LM Studio on Various Systems**: @heyitsyorkie and @dagbs discussed running LM Studio on hardware like MacBook M1/M2/M3 chips and compared model performance between devices, noting that LM Studio was originally designed for macOS on M1/M2/M3.
- **Model Comparisons and Preferences**: Users like @dagbs and @4b0d3 compared various models, including Dolphin 2.6 DPO and Laserxtral, discussing preferences based on response quality and speed. @dagbs further noted that large models like Mixtral at Q6 can experience hallucinations at higher context sizes.
- **Remote Model Usage and Inference Server**: @dagbs clarified that while LM Studio is not headless, it does include an Inference Server for running models remotely. Users like @leamac51_62244 sought discussion of using models remotely due to high hardware requirements.
- **LM Studio Installation Issues**: @surrender encountered issues with LM Studio not launching post-installation. @dagbs suggested deleting the .cache/lm-studio folder and seeking more help in the appropriate support channel.
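The Inference Server mentioned above exposes an OpenAI-style HTTP endpoint on the local machine. A minimal request sketch follows; the default port (1234) and route are assumptions based on LM Studio builds of this era, so check the server tab in the app for the actual address:

```python
# Minimal sketch of talking to LM Studio's local Inference Server via its
# OpenAI-compatible chat endpoint. Port/route are assumptions; verify them
# in the app's server tab before use.
import json
import urllib.request

BASE_URL = "http://localhost:1234/v1"  # assumed LM Studio default

def build_chat_request(prompt: str, temperature: float = 0.7) -> dict:
    """Assemble an OpenAI-style chat-completion payload."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def send(payload: dict) -> dict:
    """POST the payload to the local server (requires the server running)."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request("Say hello in one word.")
```

Because the endpoint mimics the OpenAI API shape, existing OpenAI client code can often be pointed at the local server by swapping the base URL.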
Links mentioned:
- HuggingChat: no description found
- LM Studio Beta Releases: no description found
- How Linux Users Install A Web Browser GIF - How Linux Users Install A Web Browser Linux Linux Users - Discover & Share GIFs: Click to view the GIF
- TheBloke/WhiteRabbitNeo-33B-v1-GGUF · Not able to run this model?: no description found
- TheBloke/MegaDolphin-120b-GGUF · Hugging Face: no description found
- How To Install Uncensored Mixtral Locally For FREE! (EASY): In this video, I will give you the ultimate guide on How To Install Uncensored Mixtral locally! Mixtral 8x7B, a high-quality sparse mixture of expert models …
- GitHub - Significant-Gravitas/AutoGPT: AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
- Which devices are even supported? (HIP/ROCm) · Issue #1714 · ROCm/ROCm: I'm a long-time CUDA developer looking to explore ROCm and HIP development, but finding out which hardware even supports these tools is harder than it needs to be. Let's see… this repo's…
▷ #🤖-models-discussion-chat (11 messages🔥):
- **GPU Concerns with Skyrim ChatGPT Mod**: @gamerred asked whether LM Studio needs to run on a GPU, given that the Skyrim ChatGPT mod also performs image recognition. @fabguy believes both processes will compete for GPU resources.
- **LLMs to Cause Minor Gameplay Hiccups**: @dagbs explained that while compute and 3D rendering are separate, a language model may cause brief frame drops during its initial "thinking" stage but should not significantly affect general gameplay.
- **VRAM Allocation Might Be an Issue**: @fabguy pointed out that the real issue is the allocation of vRAM, though @dagbs thinks games tend to ask for more resources than necessary.
- **Watch Out for Recommended Graphics Settings**: @ben.com cautioned that games might not account for GPU VRAM already used by LLMs, so one should consider reducing texture sizes or other settings accordingly.
- **LLM Background Operation During Gaming**: @dagbs shared personal experience running games with medium graphics requirements while keeping an LLM idle in the background, and @_anarche_ reported maintaining high FPS in COD while running some 7B models, signaling a CPU bottleneck in their setup.
▷ #🧠-feedback (6 messages):
- **Query on Beta Release Status**: @logandark inquired about a potential delay of the new beta. No precise update on the release was given, but conversation suggested the work is still in progress.
- **Unspecified Model Error for User**: @aindy_niu reported an issue running LM Studio on a laptop with 24 GB of RAM, hitting an exit code and an unknown error; no solution was offered in the exchange.
- **Guidance Offered for Technical Support Channels**: When @aindy_niu sought help with the model error, @dagbs redirected them to specific Discord channels better suited for technical support.
▷ #🎛-hardware-discussion (79 messages🔥🔥):
- **Decent Performance on a Budget**: @dagbs and @heyitsyorkie chimed in on the economic viability of the Nvidia P40 for AI computing, calling it a recommended cheap option for those with a setup that can run them: decent performance, 24 GB of VRAM, and even "single digit tok/s with multiple P40s" on large models like the 120B Goliath.
- **Power Supply for Dual 3090s**: @rouw3n inquired about PSU requirements for a setup with a second RTX 3090, to which @heyitsyorkie and others recommended a 1200W+ power supply, with @.ben.com suggesting 1000W might be adequate with some tweaking.
- **Integrating GPUs in Tight Spaces**: Users including @dagbs, @.ben.com, and @pefortin shared their experiences fitting large GPUs into smaller cases by repurposing space and using PCI extenders or laying hardware against other components, highlighting creative solutions for compact yet powerful AI rigs.
- **Experimenting for Optimal GPU Load**: @ericericericericericericeric discussed experimenting with GPU offload layer settings for different model sizes, with @heyitsyorkie advising to play with layer counts while monitoring VRAM usage, indicating there is no one-size-fits-all setting.
- **Enhancing AI Performance in All-in-Ones**: @jilloschwortz seeks to boost AI performance on an all-in-one PC with an i7-13700 and 16 GB of RAM. @heyitsyorkie suggests saving up for a dedicated rig, while @dagbs floats external GPU connections as a workable solution.
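The "play with layer counts while watching VRAM" advice can be turned into a back-of-the-envelope estimate. This sketch is illustrative only, not LM Studio's actual VRAM-fit estimator: real usage also depends on context length, KV cache, and runtime overhead, which the `reserve_gb` fudge factor only crudely covers:

```python
# Rough, illustrative estimate of how many transformer layers fit in VRAM,
# assuming quantized weights dominate memory use. Not LM Studio's estimator.

def layer_vram_gb(total_params_b: float, n_layers: int, bits_per_weight: float) -> float:
    """Approximate VRAM per layer: total weight bytes spread evenly over layers."""
    bytes_total = total_params_b * 1e9 * bits_per_weight / 8
    return bytes_total / n_layers / 1e9

def max_offload_layers(vram_gb: float, total_params_b: float,
                       n_layers: int, bits_per_weight: float,
                       reserve_gb: float = 1.5) -> int:
    """Layers that fit in VRAM, keeping a reserve for KV cache and overhead."""
    per_layer = layer_vram_gb(total_params_b, n_layers, bits_per_weight)
    return max(0, min(n_layers, int((vram_gb - reserve_gb) / per_layer)))

# e.g. a 13B model at ~4.5 bits/weight across 40 layers on a 24 GB P40:
layers = max_offload_layers(24, 13, 40, 4.5)
```

On the 24 GB example the whole model fits (all 40 layers), which matches the P40 recommendation above; on an 8 GB card the same formula forces partial offload.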
▷ #🧪-beta-releases-chat (37 messages🔥):
-
Beta V4 Stepping Up:
@yagilb
announced that Beta V4 (0.2.11 release candidate) is out, featuring a new model search page with VRAM fit estimates, a bug fix for text pasting, and the latestllama.cpp
commit. Users are encouraged to provide feedback on the new search page, and download links are available here. -
2bit Quant Innovation: In a brief exchange,
@n8programs
asked and@yagilb
confirmed that Beta V4 supports new 2bit quants, showcasing excitement for the update. -
ROCm Remains Separate for Now:
@_anarche_
inquired about ROCm support, to which@yagilb
replied that it is not yet integrated and will continue to be shared separately until integration is simplified. -
Plugin Possibilities on the Horizon: When
@n8programs
queried about the prospect of open sourcing LM Studio for community contributions or adding a plugin system,@yagilb
hinted that plans for this are in development. -
Always Free for Personal Use: Amidst speculation about future pricing for LM Studio,
@yagilb
assured@n8programs
that it will remain free for personal use, maintaining the current model.
▷ #autogen (1 messages):
yagilb: https://discord.com/channels/1110598183144399058/1197707651438624849
▷ #langchain (1 messages):
- Local models performing well but not "great": User
@anarche_
remarked that they've had success with multiple local models in terms of handling function calls repeatedly. However, they noted that these models are not as impressive as GPT-3.5 Turbo.
▷ #avx-beta (1 messages):
- Error: Additional Properties Not Allowed: User
@_elchupacabras
encountered an error stating "Error: must NOT have additional properties. File contains unknown property: 'min_p'" and is seeking solutions.
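Errors like this typically come from schema validation rejecting a config or preset key that the running build does not know about. A minimal workaround sketch, assuming a simple flat preset file (the key names and layout here are illustrative, not LM Studio's actual schema):

```python
# Hypothetical workaround: strip properties an older build does not recognize
# (e.g. "min_p") from a preset dict before loading it.
# KNOWN_KEYS is an assumed whitelist, not the real application schema.
KNOWN_KEYS = {"temperature", "top_k", "top_p", "repeat_penalty"}

def strip_unknown(preset: dict) -> dict:
    """Return a copy of the preset with unrecognized properties removed."""
    return {k: v for k, v in preset.items() if k in KNOWN_KEYS}

preset = {"temperature": 0.7, "top_p": 0.9, "min_p": 0.05}
clean = strip_unknown(preset)
```

Alternatively, simply deleting the offending `min_p` line from the preset file by hand achieves the same thing.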
▷ #crew-ai (10 messages🔥):
- CrewAI Multi-Agent Frameworks in Action:
@senecalouck
discussed utilizing the LM Studio API with@<bot-id>
for internet search and summarization within CrewAI. They implemented a strategy of aligning specific agents with individual tools, like search, while the rest of the crew used only LLM access. - Benchmarking Multiple Models with CrewAI: User
@_anarche_
mentioned conducting benchmarks with CrewAI, testing several models, and promised to share results and sample code for the crew setup used once completed. - Question About Dolphin DPO Score:
@dagbs
inquired about the meaning of an asterisk (*) accompanying the Dolphin DPO score, mentioning a specific issue with their Dolphin setup: forgetting to install requirements. - Dolphin Model's Minor Setbacks: In response to
@dagbs
,@_anarche_
acknowledged that the Dolphin model "did the job but had a hiccup or two," hinting at some inconsistencies in performance.
Perplexity AI Discord Summary
-
Perplexity and Rabbit Unite: After a partnership with Rabbit OS, the first 100,000 Rabbit R1 purchases will include a complimentary year of Perplexity Pro, offering real-time, precise answers with the integration of PPLX online LLM API. Rabbit's tweet also emphasized natural language search enhancements for r1 users.
-
Clarity on AI Models in Use: Perplexity reassured users that Perplexity Pro indeed employs genuine models including GPT-4 and Claude 2.1, with technical specifics detailed in their Technical FAQ. In particular, Copilot uses GPT-4 for Pro users, supported by a fine-tuned version of GPT-3.5.
-
Exclusive Offers Stir Excitement: A partnership reveal has sparked excitement with a $200 free Perplexity Pro credit offered to Rabbit r1's first 100,000 buyers, confirmed in a tweet from Jesse Lyu, highlighting that Perplexity on Rabbit r1 will be free of subscription fees.
-
Free AI Tools Entice the Community: A shared YouTube video showcases "23 AI Tools You Won't Believe are Free," incentivizing viewers with a one-month free Skillshare trial, while another video backs Perplexity AI as the preferred choice over other tools like Google for content creation, viewable here.
-
Community Help and API Interaction: Positive community interaction is highlighted with a user expressing appreciation for having a payment method issue resolved efficiently. However, users were informed that certain specific information and features are currently not available nor planned in the development roadmap, emphasizing the need to manage expectations around current capabilities.
Perplexity AI Channel Summaries
▷ #announcements (1 messages):
- Perplexity Partners with Rabbit:
@ok.alex
announced a collaborative partnership that integrates PPLX online LLM API with Rabbit R1 for real-time, precise answers. The first 100,000 Rabbit R1 purchases come with a complimentary year of Perplexity Pro.
▷ #general (186 messages🔥🔥):
- Perplexity AI Model Clarifications: Users like
@charlesalan
sought confirmation on whether Perplexity Pro uses genuine models like GPT-4 and Claude 2.1.@icelavaman
provided assurance and a link to clarify these details, affirming the authenticity of the models used. - Details on Copilot Model: In response to queries from
@gpt_five
,@icelavaman
shared a Technical FAQ detailing that Copilot uses GPT-4 for Pro users and is routed by a fine-tuned version of GPT-3.5, emphasizing its capabilities for in-depth answers. - Exciting Partnership Announcement:
@otub
revealed a partnership that provides $200 of free Perplexity Pro credit to the first 100,000 buyers of Rabbit r1, a deal confirmed by various users including@glap
and@ok.alex
, who noted the credit would extend even current Pro subscriptions. - Clarifying R1 with Perplexity Pro:
@dan9070
cited a Twitter post from@jessechenglyu
that confirmed R1 will have Perplexity on rabbit r1 for free without any need for a subscription, a significant boon for early adopters of the device. - User Engagement and Support:
@lkshrc
and@yogable
inquired about acquiring Pro Discord access, which was promptly resolved by@icelavaman
, showcasing the community support and responsiveness within the Perplexity AI server.
Links mentioned:
- Chat with Open Large Language Models: no description found
- Tweet from Jesse Lyu (@jessechenglyu): key msg: 1. Perplexity on rabbit r1 is FREE. 2. Perplexity offers free $200 credit as a FREE GIFT to first 100K r1 orders. 3. rabbit r1 REMAINS free of subscription. conclusion: WHAT A DEAL! Quot…
- What is Perplexity Copilot?: Explore Perplexity's blog for articles, announcements, product updates, and tips to optimize your experience. Stay informed and make the most of Perplexity.
- Perplexity Blog: Explore Perplexity's blog for articles, announcements, product updates, and tips to optimize your experience. Stay informed and make the most of Perplexity.
- Chat Completions: no description found
- Perplexity - AI Companion: Ask anything while you browse
- What models does Copilot use?: Dive deep into Perplexity's technical details with our comprehensive FAQ page. From the nuances of AI models like GPT-4 and Claude 2 to token limits and AI profiles, get concise answers to optimize yo…
▷ #sharing (5 messages):
- Free AI Tools on YouTube:
@siddhj
shared a YouTube video titled "23 AI Tools You Won't Believe are Free," which showcases a variety of AI tools available at no cost. The video's description mentions a partnership with Skillshare for a one-month free trial. - Commendation for Riley Brown's Video:
@samangel7358
acknowledged the efforts of Riley Brown by applauding another informative AI-related video. - Rabbit Partners with Perplexity AI:
@br0k3r81
highlighted a new partnership between rabbit OS and Perplexity AI, shared via a tweet from @rabbit_hmi, aimed at improving the natural language search capabilities for r1 users. - Perplexity AI Service in Action:
@almost.engineering
posted a link demonstrating the search capabilities of Perplexity AI for specific content related to Gannon Makerspace. - Personal Preference for Perplexity AI:
@oneisall_
shared a YouTube video where the creator explains why they favor using Perplexity more than Google, ChatGPT, BARD, and Microsoft Copilots, particularly for content creation.
Links mentioned:
- I use Perplexity MORE than Google and ChatGPT: Main Takeaways From this Video: "I use Perplexity more than ChatGPT, BARD, and Microsoft Copilots for five main reasons, including its use in content creation…
- 23 AI Tools You Won't Believe are Free: Right now, the first 500 people to use my link will get a one month free trial of Skillshare: https://skl.sh/futurepedia11231 After 8 months of experimenting …
- Tweet from rabbit inc. (@rabbit_hmi): At rabbit, we're always on the hunt for top AI services and partners to help our users accomplish tasks quickly and accurately. So we're excited to announce our partnership with @perplexity_ai to en…
▷ #pplx-api (6 messages):
- Gratitude for Problem Resolution: User
@rxiiia
expresses appreciation towards@830126989687914527
for assistance with a payment method issue which was resolved without the need to recreate the method. - Encouraging Community Recognition:
@Dyno
suggests using the ⭐ emoji to react to helpful messages. Accumulating five stars sends the message to the ⭐-starred channel and earns the author the EXPLORER role. - Request for More Specific Instructions:
@dvrshil
asks for more specific details or instructions, expressing that the current help is inadequate. - Limitation on Information:
@icelavaman
responds to@dvrshil
with a straightforward refusal, claiming that providing the specific requested information or details is not possible. - Feature Not on the Roadmap: User
@icelavaman
informs@dvrshil
that the feature in question is not currently on the development roadmap.
OpenAccess AI Collective (axolotl) Discord Summary
-
Mistral Model Mayhem: Model performance and model training emerged as focal points, with discussions ranging from the best 7b Mistral models to use, like OpenPipe/mistral-ft-optimized-1227 and Bagel 7B, to the challenges of sample packing in models including LoRA/qLoRA and Axolotl. Users critically explored data quality and dataset effectiveness, proposing RedPajamaV2 and Dolma for model testing, and emphasized Meta's acquisition of 600,000 Nvidia H100 GPUs to illustrate the growing computational scale in AI like LLaMa 3.
-
Pack and Roll with Axolotl: In Axolotl developments, conversations focused on updating package requirements for
flash-attn
, the lack of direct configuration in DPOTrainer, and concerns over package dependency management. Users noted ColossalAI's ShardFormer as a potential step toward simplified tensor parallelism and questioned the veracity of Unsloth's claims regarding training speed and VRAM efficiency. -
Plotting with Qlora and LoRA: Inquiries were made about implementing Qlora to replicate specific research results, and there was a question about whether a previously reported bug affecting 8bit LoRA tuning in Mixtral had been resolved.
-
Dataset Utilization and Cleanup Convos: Users showed surprise over the underutilization of oasst1/2 datasets and shared effective data cleanup strategies using GPT-4 and mistral-medium. They discussed the strategic selection of training tokens, such as
<BAD>
vs<GOOD>
, emphasizing the impact of token choice on model training outcomes. -
RLHF Ruminations: Dialogues in rlhf deliberated the potential stability of an input + label + output training method over DPO, considering its utility for improving model stability, with specific mention of its use within FAANG companies.
-
Replicate Hosting and API Considerations: Queries in replicate-help touched on whether the platform supports hosting and pondered setting up API connections to models.
OpenAccess AI Collective (axolotl) Channel Summaries
▷ #general (140 messages🔥🔥):
- Model Merger's Labyrinth:
@le_mess
asked for recommendations on the best 7b Mistral model trained with chatml format amid confusion on the leaderboard.@dreamgen
and@bozoid.
discussed various merged models like OpenPipe/mistral-ft-optimized-1227 and the uniqueness of Bagel 7B, whilst expressing dissatisfaction with mixed prompt format training and data quality issues. - Sample Packing Conundrums:
@tiendung
enquired about the effectiveness of sample packing with different types of models, such as LoRA / qLoRA, while@dreamgen
discussed potential issues with Hugging Face's implementation, particularly with attention mask and positional encoding.@tiendung
and@nanobitz
explored whether Axolotl correctly implements sample packing compared to Hugging Face's approach. - Datasets Over Models:
@bozoid.
expressed a desire to see models tested against datasets like RedPajamaV2 and AllenAI's Dolma.@bozoid.
and@nruaif
conversed about the challenging nature of training on huge datasets and ambitions to downscale models without compromising performance. - The Might of Meta's Compute Arsenal:
@yamashi
,@noobmaster29
, and@casper_ai
discussed Meta's massive acquisition of 600,000 Nvidia H100 GPUs for training LLaMa 3, highlighting the intense scale of resources involved in state-of-the-art AI training endeavours. - DPO Training Trials and Tribulations:
@c.gato
and@dangfutures
encountered obstacles and shared experiences in applying DPO (Direct Preference Optimization). Their dialogue revealed uncertainties and learning moments while attempting to improve their models' training processes.
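The sample-packing concern raised above (attention mask and positional encoding when several samples share one row) can be sketched in a few lines. This is a simplified illustration, not Axolotl's or Hugging Face's actual implementation; token ids are toy values and a real pipeline would use a tokenizer:

```python
# Minimal sample-packing sketch: short samples are concatenated into one row,
# separated by EOS, and position ids restart at each segment so packed samples
# are not treated as one long sequence.
EOS = 0

def pack(samples, max_len):
    """Pack token-id lists into one row with per-segment position ids."""
    input_ids, position_ids, seq_ids = [], [], []
    for seq_id, sample in enumerate(samples):
        seg = sample + [EOS]
        if len(input_ids) + len(seg) > max_len:
            break
        input_ids.extend(seg)
        position_ids.extend(range(len(seg)))   # restart positions per sample
        seq_ids.extend([seq_id] * len(seg))    # identifies segment membership
    return input_ids, position_ids, seq_ids

ids, pos, seq = pack([[5, 6], [7, 8, 9]], max_len=10)
# A correct attention mask would let token i attend to token j only when
# seq[i] == seq[j] and j <= i (a block-diagonal causal mask), which is the
# "non-contaminated" behavior discussed in the linked trl PR.
```

The `seq_ids` list is exactly what a block-diagonal (4D) attention mask would be built from; skipping that step is what lets packed samples "leak" into each other.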
Links mentioned:
- Paper page - Self-Rewarding Language Models: no description found
- Inception Deeper GIF - Inception Deeper Go Deeper - Discover & Share GIFs: Click to view the GIF
- Supervised Fine-tuning Trainer: no description found
- jondurbin/bagel-7b-v0.1 · Hugging Face: no description found
- Reddit - Dive into anything: no description found
- OpenPipe/mistral-ft-optimized-1227 · Hugging Face: no description found
- teknium/OpenHermes-2.5-Mistral-7B · Hugging Face: no description found
- Non Contaminated Packing by nivibilla · Pull Request #1235 · huggingface/trl: As discussed in #1230, I've done a quick & dirty implementation. And also included a sample notebook (not tested). Will test when I can. Or if you have time pls feel free to test and also any …
- Packing in SFT · Issue #805 · huggingface/trl: I understand how packing is allowed in pretraining but I was looking for some clarification on how we are allowed to pack samples for SFT with ConstantLengthDataset. I see that an EOS token is put …
▷ #axolotl-dev (26 messages🔥):
- Requirement Update for
flash-attn
: User@louist4455
pointed out thatflash-attn==2.3.3
might be outdated, requiring a newer version for LLM FT.@caseus_
acknowledged that upgrading is a manual process due to the lack of automated testing for multi-GPU support. - Configuration Query in DPO Cleanup Branch:
@filippob82
asked why certain parameters likemax_length
andmax_prompt_length
are not directly configurable in theDPOTrainer
.@caseus_
indicated that for most architectures in use, these settings are not crucial, but was open to adjustments following an example from a GitHub script. - Axolotl Package Dependency Concerns:
@faldore
raised a question about preventing Axolotl from installingcuda
andtorch
, which they prefer to handle independently.@caseus_
noted the need to reconsider whybert-score
was added as a dependency, while@nanobitz
advised commenting out undesired installs from the requirements. - Interest in Tensor Parallelism with ShardFormer:
@caseus_
shared a link to ColossalAI's ShardFormer, potentially hinting at easier tensor parallelism integration, pointing to the GitHub page of the ShardFormer project. - Skepticism About Unsloth Speed Claims:
@nanobitz
shared a Reddit post about Unsloth's performance improvements and VRAM reduction for finetuning models.@caseus_
expressed skepticism on the marketing numbers, mentioning the ability to train in under an hour on a 3090 GPU and clarifying that transformers implemented 4d attention masks, not packing support.
Links mentioned:
- argilla/distilabeled-Hermes-2.5-Mistral-7B · Hugging Face: no description found
- Reddit - Dive into anything: no description found
- trl/examples/scripts/dpo.py at 928d14445e31b3586ce8b73ca70ecb02dc603369 · huggingface/trl: Train transformer language models with reinforcement learning. - huggingface/trl
- ColossalAI/colossalai/shardformer at main · hpcaitech/ColossalAI: Making large AI models cheaper, faster and more accessible - hpcaitech/ColossalAI
- axolotl/src/axolotl/core/trainer_builder.py at acfc4ef7ddd15bf85c2feed2142ab7331694dd35 · OpenAccess-AI-Collective/axolotl: Go ahead and axolotl questions. Contribute to OpenAccess-AI-Collective/axolotl development by creating an account on GitHub.
▷ #general-help (2 messages):
- Planning with Qlora: User
@jacques_10431
mentioned that their team is planning to utilize Qlora in an effort to replicate the results of a particular article. - Inquiring about 8bit LoRA Tuning Bug:
@jaredquek
asked for updates regarding a bug related to 8bit LoRA tuning within Mixtral as compared to qLoRA or fft, which was initially raised by user Caseus. They are curious if the issue has been successfully fixed.
▷ #datasets (6 messages):
- Surprising Underuse of OASST Datasets:
@dreamgen
expressed surprise over the lack of models utilizing the oasst1/2 datasets and mentioned their potential after some filtering. - Inquisitive on Deep Learning Fine-tuning: In a follow-up,
@dreamgen
asked for details about training with 20 samples, including the use of DPO with QLora, learning rates, and other specifics. - Advocating for GPT-4 Data Curation:
@dreamgen
recommended investing in GPT-4 for data cleanup, highlighting its importance in contrast to the costs of fine-tuning and inference. - Clarity Through Examples:
@dreamgen
asked for examples to better understand the data cleanup goals. - Experience in Data Cleanup & Augmentation: In a reflective note,
@dreamgen
shared that mistral-medium sometimes proves sufficient, or even surpasses GPT-4 Turbo, in certain data cleanup and augmentation tasks, while in others GPT-3.5 Turbo outperforms mistral-medium. - Acknowledgement of GPT-4 Efficiency:
.____init___
agreed with@dreamgen
on the sensibility of using GPT-4 for a one-time data cleanup process.
▷ #rlhf (5 messages):
- Comparison between DPO and Baseline Method:
@dreamgen
discussed potential benefits of using input + label + output training, contrasting it with Direct Preference Optimization (DPO) and suggesting it as a more stable approach, mentioning its usage at FAANG companies. - Labels for Model Training:
@dreamgen
explained the concept of using tokens such as<BAD>
vs<GOOD>
to distinguish between types of responses within training data, indicating that natural tokens might be more effective than synthetic tokens in practice.
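The control-token idea above can be sketched concretely. This is an illustrative sketch only; the token strings and data layout are assumptions, not a description of how any particular FAANG pipeline formats its data:

```python
# Sketch of input + label + output training: prefix each training example with
# a label token so the model learns to condition on response quality.
def label_example(prompt: str, response: str, is_good: bool,
                  good_token: str = "<GOOD>", bad_token: str = "<BAD>") -> str:
    tag = good_token if is_good else bad_token
    return f"{prompt}\n{tag} {response}"

pairs = [
    ("Summarize the report.", "The report covers Q3 revenue.", True),
    ("Summarize the report.", "I don't know.", False),
]
dataset = [label_example(p, r, ok) for p, r, ok in pairs]
# At inference time you would prepend the <GOOD> token to steer generation
# toward the preferred behavior.
```

The point raised in the channel is that "natural" tokens (existing words the model already understands) may work better in practice than freshly added synthetic tokens like these.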
▷ #replicate-help (2 messages):
- Hosting Inquiry: User
@dangfutures
asked if the platform is meant for hosting purposes. - API Setup Possibility:
@noobmaster29
believes that setting up an API to the models is possible and considers trying it out later.
OpenAI Discord Summary
-
AI Integration in Smart Ecosystems: Enthusiasm grows as
@.pythagoras
and others discuss AI integration in smartphone models, like the Samsung S24, anticipating similar AI features in future Pixel phones. Debates unfold around Appleās ecosystem versus Samsungās AI capabilities, with predictions of AI becoming a default feature in the tech landscape. -
Ethical AI Debates and Documents: The AI community engages in discussions about AI ethics, governance, and alignment, referenced through shared links to an arXiv paper and a WHO document on the governance of AI in health.
-
GPT-4 Community Contributions and Concerns:
@serenejay
reports verification issues with GPT Store and inquires about privacy options while@marcus_73
seeks feedback for their HopeGPT, and@russellsapalmer
warns against developers ripping off GPT apps. Suggestions include domain verification to protect privacy and a call to OpenAI to monitor such activities, alongside reminders of OpenAI Status for service updates. -
Prompt Engineering Strategies and Exchanges: Experiences vary on the use of custom instructions with AI models;
@realgavok
deems them inconsistent while@darthgustav.
suggests XML tagging to improve GPT-4's selection accuracy, sharing an XML tagging example for clearer model guidance. -
XML Tagging Takes Center Stage: In prompt engineering discussions,
@darthgustav.
advises@magiciansinc
on using XML tagging for better results over CSV or JSON. An example is provided, showcasing a way to optimize AI's performance in filtering lists by using well-structured criteria.
OpenAI Channel Summaries
▷ #ai-discussions (110 messages🔥🔥):
-
AI-Enhanced Smartphones Stir Excitement:
@.pythagoras
expresses interest in the AI tools integrated into new smartphone models like Samsung S24 and hopes Google will follow suit with similar features in Pixel phones. Others share their experiences and preferences, with the conversation turning into a general discussion on the merits of Samsung vs. Apple, and the anticipation of AI becoming a staple feature in smartphones. -
AI Fridge Fantasies Spark Imagination: Chatter
@.pythagoras
humorously foresees a future where all appliances will boast "AI capabilities," leading to a series of creative speculations on conversational refrigerators and multi-functional vending machine-like kitchen appliances from other users.
AI Ethics & Governance Discussion:
@clockrelativity2003
shares a link to an arXiv paper discussing AI and case law, and another link to a WHO document on the ethics and governance of AI in health, eliciting responses and discussion about AI alignment and its implications. -
Gemini Ultra Release Uncertain: In a discussion on the release of "Gemini Ultra,"
@la42099
humorously guesses it could be out in the next 30 days, with users expressing hopes for larger prompt limits and other advancements. -
The Tech Ecosystem Debate: A lively debate unfolds about Apple's ecosystem and continuity features, with
@muyfashionista
extolling the benefits of seamless integration across Apple devices. Users@mrcrack_
and@darkangel9365
chime in with opinions on Android and Samsung capabilities, citing options for customization and questioning Apple's policies on app approvals.
Links mentioned:
- Old Man GIF - Children - Discover & Share GIFs: Click to view the GIF
- Ethics and governance of artificial intelligence for health: guidance on large multi-modal models: no description found
▷ #gpt-4-discussions (38 messages🔥):
-
Verification Troubles for serenejay:
@serenejay
reported issues with not being able to complete the builder profile for GPT store publishing, despite trying different browsers and clearing cache. They managed success after subscribing via web with a card, but inquired about the possibility to not use their real name due to privacy concerns. -
Domain Verification as a Solution:
@rjkmelb
suggested that@serenejay
obtain a domain name to verify their OpenAI account after facing issues with Google Play verification.@7877
added that verifying with a domain can help hide one's real name, showing the domain instead. -
HopeGPT Wins a Competition:
@marcus_73
shared their GPT model, HopeGPT, which won a competition for instilling hope, and requested feedback for improvement; link to the model was provided and they were guided by@solbus
to share in a dedicated channel for visibility. -
Alert on Developers' GPT Ripping: User
@russellsapalmer
raised a serious concern about the developer account tapgpts allegedly copying the work of hundreds of developers, mimicking names, logos, descriptions, and sample prompts without credit, calling for OpenAI to monitor such activities. -
ChatGPT Downtime and Communication:
@c6565
questioned why outages of ChatGPT services are not publicly communicated, to which@7877
responded by providing a link to OpenAI's status page, where operational updates and past incidents are detailed.
Links mentioned:
OpenAI Status: no description found
▷ #prompt-engineering (16 messages🔥):
-
Custom Instructions Yield Mixed Results: User
@realgavok
observed that disabling custom instructions seems to enhance consistency. This sparked discussions, with@darthgustav.
suggesting that the effectiveness of custom instructions varies heavily based on their content and structure. -
XML Tagging Boosts GPT-4's Selection Accuracy: In a tip to
@magiciansinc
,@darthgustav.
recommended using XML tagging to improve GPT-4's performance when sorting lists based on criteria, like picking cities ideal for a tropical vacation. The technique is claimed to be superior to using CSV or JSON formats. -
Sample XML Tagging Provided by Darthgustav.: Further assisting
@magiciansinc
,@darthgustav.
provided an example of XML tagging, listing various cities and associated activities to demonstrate how tagging could be utilized to enhance GPT-4's output. -
Using Discord Links to Share XML Format: In an unconventional move,
@darthgustav.
directed@magiciansinc
to Discord links for examples, specifically through https://discord.com/channels/974519864045756446/1192857176562208798/1192857176562208798, which was part of the assistance provided. -
Continuing the XML Tagging Exploration:
@magiciansinc
expressed intent to test the XML tagging method and@darthgustav.
wished them luck, indicating a collaborative environment in the prompt-engineering channel.
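The XML-tagging approach discussed in this channel can be sketched as follows. The tag and attribute names are illustrative assumptions, not the exact format shared on Discord; the idea is simply to make each item's properties explicit so the model can filter on stated criteria rather than guess:

```python
# Sketch: annotate each list item with its properties as XML, then ask the
# model to filter on explicit criteria (e.g. "tropical beach vacation").
cities = [
    {"name": "Cancun", "climate": "tropical", "beach": True},
    {"name": "Oslo", "climate": "cold", "beach": False},
]

def to_xml(items):
    lines = ["<cities>"]
    for c in items:
        lines.append(
            f'  <city name="{c["name"]}" climate="{c["climate"]}" '
            f'beach="{str(c["beach"]).lower()}"/>'
        )
    lines.append("</cities>")
    return "\n".join(lines)

prompt = (
    "From the list below, pick cities suited to a tropical beach vacation.\n"
    + to_xml(cities)
)
```

Compared with CSV or JSON, the claim in the channel is that explicit attribute names next to each value give the model a clearer signal for selection tasks.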
▷ #api-discussions (16 messages🔥):
- Consistency in Custom Instructions:
@realgavok
raised a query about the effectiveness of custom instructions, noting that disabling them sometimes results in more consistency.@darthgustav.
responded, indicating that consistency varies based on the content and quality of the instructions. - Advocating for Custom GPTs:
@darthgustav.
shared their preference for exclusively using Custom Instructions or Custom GPTs, implying satisfaction with their performance. - Enhancing List Filtering with Criteria:
@magiciansinc
is looking for advice on using GPT-4 to filter lists (e.g., cities or products) based on specific criteria. They reported receiving poor suggestions and explanations from the model so far. - XML Tagging for Better Results:
@darthgustav.
advised@magiciansinc
to use XML tagging to note a city's general properties, which should improve GPT-4's performance. They also emphasized the importance of guiding the model properly. - Example of XML Tagging: When
@magiciansinc
asked for an XML tagging example,@darthgustav.
provided a detailed sample, suggesting it may perform better than CSV or JSON based on their testing. They also referenced an external source for generating such data.
Mistral Discord Summary
Self-hosting and API wonders with Mistral 7B: Discussions across channels showed interest in self-hosting Mistral 7B and utilizing it with Python applications, with various users offering assistance and tool suggestions. Concerns around commercial application data privacy and technical issues with quantization affecting performance were raised.
The Quandary of Long Texts: Users debated on processing long texts with Mistral and the 32K token limit. While documentation mentions this limit, the practical token cap varies based on model size and task-specific conditions.
Frustrations and Recommendations in Fine-Tuning: The community reported challenges when fine-tuning Mistral 7B, such as persistence of old prompt responses and GPU memory difficulties on an RTX 4090. Additionally, the correct implementation for Mistral in the HF trainer and finding a good GGUF format model were subjects of inquiry.
Hearty Discussions on Deployment and Tool Integration: Participants exchanged experiences with integrating tools such as Deep Chat, highlighting its simplicity over more complex setups like Open Copilot. Personal experiences related to open-source projects and international moves in the tech sector were also shared among members.
Guidance for Aspiring Coders and LLaMa Musings: Recommendations for beginner coders pointed towards Harvard's CS50 course and learning through hands-on experience. Curiosity was piqued by a Reddit discussion about Meta AI's LLaMa 3 being trained on an impressive array of 600,000 H100s.
Mistral Channel Summaries
▷ #general (31 messages🔥):
-
French Flair for Mistral 7B Self-Hosting Inquiry: User
@bot7_
asked if it's possible to self-host Mistral 7B and use it with a Python app.@kerunix
confirmed it's possible, while@tom_lrd
noted it depends on the user's OS and hardware, offering names of several relevant tools. -
Pondering Long Texts for Mistral Processing:
@lukasgutwinski
inquired about the best way to process long texts (up to 100 pages) with Mistral, and whether Mistral Medium and Small both have a 32K token window.@i_am_dom
suggested that Mixtral works effectively up to 16k tokens, but might not be stable beyond that threshold. -
Seeking Easy Chatting with Mixtral: User
@rod_____
wondered if there's a way to chat with Mixtral by simply inserting an API key, to which@jortega_17718
responded with a link to the Hugging Face endpoints. -
Langchain with Mistral API Clarification:
@western_01
shared a success in using Mistral API with CrewAI via langchain, correcting an earlier mistake by pointing out the default API endpoint works perfectly. -
32K Token Limit Confirmation and Caveats: Both
@jortega_17718
and@sublimatorniq
addressed the alleged 32K token limit for Mistral's generative endpoints, noting that while the documentation states this, practical limits often fall short, especially for smaller models or specific tasks.
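For the long-document question above (roughly 100 pages against a limited context window), the usual workaround is to split the text into chunks that fit the budget and process them one at a time. A naive sketch, assuming a rough 4-characters-per-token heuristic; the budget value is an illustrative assumption, not a Mistral-documented limit:

```python
# Naive paragraph-based chunking so each chunk stays under a token budget.
def chunk_text(text: str, token_budget: int = 8000, chars_per_token: int = 4):
    max_chars = token_budget * chars_per_token
    chunks, current, length = [], [], 0
    for para in text.split("\n\n"):
        # Flush the current chunk when adding this paragraph would overflow it.
        if length + len(para) > max_chars and current:
            chunks.append("\n\n".join(current))
            current, length = [], 0
        current.append(para)
        length += len(para)
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

In practice you would use the model's actual tokenizer for exact counts, and summarize or map-reduce over the chunks rather than concatenating raw outputs.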
Links mentioned:
mistralai/Mixtral-8x7B-Instruct-v0.1 · Hugging Face: no description found
▷ #models (22 messages🔥):
- Self-Hosting Mistral Challenges: User
@bot7_
inquired about how to self-host Mistral 7B and use it with a Python app, apologizing for their English as they are French. - Searching for Mistral 7B API:
@rohit3389
started using "Mistral-7b-openorca.q4_0.gguf" via the GPT4All Python library and wondered if there is an API they could use with Python. - Clarification on Mistral's Models:
@tom_lrd
responded that third-party servers would be needed to use specific finetunes like Openorca since Mistral's API only serves models such as mistral7b-instruct. - Understanding LLMs and Seeking API Solutions:
@rohit3389
seeks a faster API solution to avoid loading a heavy 4GB model and@tom_lrd
suggests using tiny, small, and medium models through Mistral's API, which won't be exact in style but should be comparable or better. - Suggestions for Off-Guardrails Models:
@dizzytornado
seeks a Mistral model suitable for writing scripts with realistic, conflicted characters rather than happy, harmonious scenarios;@vhariational
recommends the Dolphin-Mistral models with a link to their Hugging Face page and a Discord invite for further discussion.
Links mentioned:
cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser · Hugging Face: no description found
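For the recurring "is there an API I can use from Python?" question in this channel, a call to Mistral's hosted chat-completions endpoint can be sketched with the standard library. The URL and model name follow Mistral's public API docs, but treat them as assumptions and check the current documentation before relying on them:

```python
import json
import urllib.request

# Hedged sketch: build a request against Mistral's hosted chat API.
API_URL = "https://api.mistral.ai/v1/chat/completions"

def build_request(api_key: str, model: str, user_message: str):
    """Construct an authenticated POST request for one chat turn."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Network call commented out so the sketch runs offline:
# req = build_request("YOUR_KEY", "mistral-tiny", "Hello!")
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

This avoids downloading a multi-gigabyte GGUF locally, at the cost of the style differences from specific finetunes noted above.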
▷ #deployment (8 messages🔥):
-
Self-Hosting Mistral 7B Inquiry:
@bot7_
asked if it's possible to self-host Mistral 7B for use with a Python app, despite being unsure about their English proficiency.@akshay_1
confirmed that it is possible and reassured@bot7_
about their English. -
Offering a Helping Hand in Deployment:
@akshay_1
acknowledged the complexity of self-hosting Mistral 7B and offered expertise by asking@bot7_
to check their DMs for further assistance. -
Concerns About Data Privacy with Commercial Application:
@xxtinction
expressed concerns about the utilization of Mistral 7B in a commercial application with sensitive data, questioning if the data will remain private or used by Mistral for training. They requested documentation clarification due to confusion with Mistral's Privacy Policy. -
Technical Issue with Quantization in Mistral 7B:
@lauthu
mentioned encountering an accuracy drop using TensorRT-LLM W8A8 Smooth Quant on Mistral 7B and is inquiring if others have experienced similar issues.
▷ #finetuning (7 messages):
- 7B Mistral in GGUF Format Inquiry: @bot7_ is searching for a good 7B Mistral in GGUF format but did not receive any response to the query.
- Issue with Mistral in the HF Trainer: @bozoid. shared concerns about an incorrect implementation for Mistral in the HF trainer affecting performance during finetuning, which has yet to be officially addressed, according to @andrewwwwme.
- Persistent Old Prompt Responses in Mistral: @dizzytornado reported an issue where Mistral keeps returning words from an old prompt, but did not receive a solution.
- Finetuning Mistral 7B Challenges on RTX 4090: @kaizen0340 inquired about experiences finetuning Mistral 7B with LoRA on an RTX 4090, mentioning difficulties with GPU memory. @enzodeg40 responded by asking whether `CUDA_VISIBLE_DEVICES` was configured correctly.
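The `CUDA_VISIBLE_DEVICES` check mentioned above is easy to get wrong: the variable must be set before CUDA is initialized (in practice, before the first `torch.cuda` call), or the process will still see every GPU. A minimal sketch:

```python
import os

# Restrict the process to GPU 0 *before* torch initializes CUDA; set it in
# the shell (`CUDA_VISIBLE_DEVICES=0 python train.py`) or at the very top
# of the training script, before `import torch`.
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "0")

def visible_gpu_indices() -> list[int]:
    """Parse CUDA_VISIBLE_DEVICES into the list of physical GPU indices."""
    value = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    return [int(x) for x in value.split(",") if x.strip().isdigit()]

# With the variable set, torch.cuda.device_count() reports only the GPUs
# listed here, which is the first thing to verify when a LoRA finetune
# unexpectedly runs out of (or cannot find) GPU memory.
```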
▷ #showcase (104 messages🔥🔥):
- Deep Chat's Ease of Integration: @ovi8773 celebrated the simplicity of integrating Deep Chat: with only one line of code and no sign-up, it beats setting up full-stack alternatives like Open Copilot, which requires more extensive configuration.
- Open Copilot's Complex Setup: In contrast to Deep Chat's ease of use, @ovi8773 remarked that Open Copilot's setup process is cumbersome, despite being an open-source project with customizable options. They considered Deep Chat superior in terms of developer convenience and implementation.
- Accolades for Project Contribution: Deep Chat garnered admiration from @ethux, who appreciated the project enough to give it a star on GitHub, sharing enthusiasm for open-source contributions.
- Discussing Global Tech Hubs: The conversation extended into a discussion about the cost of living and tech hubs across the globe. @ovi8773 and @ethux exchanged insights on housing prices, the appeal of various countries, and tax incentives such as the 30% tax ruling in the Netherlands.
- Personal Journeys in Tech: @ovi8773 shared their personal experience of taking a career break from Software Engineering to focus on open-source projects, as well as contemplating a move to a different country. This sparked a discussion with @ethux about the pros and cons of relocating, especially in the context of the tech environment and living standards.
Links mentioned:
- no title found: no description found
- 30% tax ruling in the Netherlands | I amsterdam: Highly skilled migrants to the Netherlands may be eligible for the 30% tax ruling. Find out all about the benefits and the requirements.
- GitHub - openchatai/OpenCopilot: 🤖 🔥 Let your users chat with your product features and execute things by text - open source Shopify sidekick: 🤖 🔥 Let your users chat with your product features and execute things by text - open source Shopify sidekick - GitHub - openchatai/OpenCopilot: 🤖 🔥 Let your users chat with your product features a…
▷ #random (4 messages):
- Beginner Coding Advice Sought by @fufuespade: User @fufuespade inquired about how to start learning coding and which resources, such as forums or YouTube channels, would be recommended for beginners.
- Harvard Coding Course Recommended: @jakobdylanc suggested checking out CS50 on YouTube, a free Harvard course with comprehensive lecture videos, for beginner coders.
- Hands-On Learning Approach by @akshay_1: @akshay_1 advised @fufuespade to learn coding by directly implementing an idea and gaining practical experience.
- Conversation on Meta's LLaMa: @yamashi shared a Reddit link discussing Meta AI's large language model, LLaMa, and Mark Zuckerberg's comment on training LLaMa 3 on 600,000 H100s.
Links mentioned:
Reddit - Dive into anything: no description found
HuggingFace Discord Discord Summary
- DeciTech Drops Dual Model Delights: DeciTech released the DeciCoder-6B, supporting eight programming languages, and DeciDiffusion v2.0, an image generation model boasting 2.6x speed over Stable Diffusion v1.5. Explore DeciCoder-6B on Hugging Face and test them on Colab or Hugging Face Space.
- FABBLER.AI Calls for Creative Testers: FABBLER.AI is seeking beta testers for an innovative tool that crafts narrative stories, convertible to videos. Check out the demo on YouTube and explore the tool on Hugging Face Space for Proteus-V0.1.
- GPU Hosting for Heavy Models? EU Wants to Know!: A member is compiling a list of GPU hosting providers in the EU capable of supporting 13B to 70B models for tasks such as image-to-text and email triage. The request is for low latency and on-demand use, with no specific providers or solutions provided in the discussion.
- Phi-2 Model Weights, Beware the Exclamation Invasion: After an update to the Phi-2 model, a user experienced issues with FP16 inference leading to exclamation-mark outputs, resolved by switching to `device_map="auto"`. Details for developers facing similar issues can be found here.
- Struggle of the Syntax and Model Queries in Computer Vision: Some users encountered syntax errors while training models, resolved by community suggestions, while others sought advice on object tracking without a response documented. A beginner working with Indian food datasets received guidance from peers on how to move forward.
HuggingFace Discord Channel Summaries
▷ #announcements (1 messages):
- DeciTech Unveils DeciCoder-6B and DeciDiffusion v2.0: Two new models have been introduced: DeciCoder-6B, which supports eight programming languages and outperforms competitors in HumanEval benchmarks, and DeciDiffusion v2.0, an image generation model that is 2.6 times faster than Stable Diffusion v1.5. Examine the details on DeciCoder-6B and try them out in Colab and Hugging Face Space.
- Revving Up Vehicle Speed Estimation: @SkalskiP presents a tutorial on real-time vehicle speed estimation, involving vehicle detection using YOLOv8, tracking with ByteTrack, and the complexities of distance calculation. Catch the tutorial here.
- Fighting Hallucinations in Language Models: New research discusses detecting and editing hallucinations in language model outputs, introducing a retrieval-augmented model (FAVA) that outperforms ChatGPT and LLama2 Chat. Discover the taxonomy, model, and demo on the project website.
- Art and AI: A Creative Partnership: @fffiloni writes on the critical role of art and design in advancing AI capabilities, encouraging collaboration between artists, designers, and AI researchers. Read the full article on the Hugging Face Blog.
- Embracing French Text With Lyon NLP Group: lyon-nlp-group extends the Massive Text Embedding Benchmark (MTEB) to French, aiding the evaluation and comparison of text embedding methods in the French language. The detailed analysis is available in the blog post.
Links mentioned:
- [@harpreetsahota on Hugging Face: "Two new models dropped today… DeciCoder-6B…"](https://huggingface.co/posts/harpreetsahota/814290289723145): no description found
- [@SkalskiP on Hugging Face: "Real-Time Vehicle Speed Estimation Tutorial… TL;DR: Watch the…"](https://huggingface.co/posts/SkalskiP/421333989856413): no description found
- [@s3nh on Hugging Face: "GPU Poor POV: Building a RAG which solves specific task. Everyone loves…"](https://huggingface.co/posts/s3nh/683576905550627): no description found
- @gsarti on Hugging Face: "🔥 Today's pick in Interpretability & Analysis of LMs: Fine-grained…": no description found
- Breaking Barriers: The Critical Role of Art and Design in Advancing AI Capabilities: no description found
- Implementing Fractional GPUs in Kubernetes with Aliyun Scheduler: no description found
- Extending the Massive Text Embedding Benchmark to French: the datasets: no description found
- Unleashing the Power of Logprobs in Language Models: A Practical Guide: no description found
- E5 - a Hugging Face Space by Tonic: no description found
- Fast AI Image Upscaler 4x - a Hugging Face Space by FumesAI: no description found
- Andyrasika/VQA-Dataset · Datasets at Hugging Face: no description found
- H94 IP Adapter FaceID SDXL - a Hugging Face Space by r-neuschulz: no description found
- Proteus V0.1 - a Hugging Face Space by ehristoforu: no description found
▷ #general (79 messages🔥🔥):
- Phi-2 Model Weights Troubles: User @admin01234 described an issue with the Phi-2 model where, after updating files, only exclamation marks were being generated. A solution mentioned was to switch from `torch_dtype="auto"` to `device_map="auto"` in the model's configuration. The problem, along with a code snippet, was discussed in this forum post.
- BERT Model Token Limits: @redopan706 inquired about modifying the maximum token limit of the BERT model, to which @stroggoz suggested reading the documentation on Hugging Face, indicating that model configuration details can be found there. Another user, @vipitis, suggested looking for a different pre-trained model with a larger context size rather than attempting to retrain or interpolate.
- Fine-Tuning Model Bit Size Concerns: @samuelcorsan sought advice on converting a model from 4-bit to 8-bit quantization. The discussion with @doctorpangloss revealed that backpropagation with 8-bit might not be practical, and they suggested using LoRA training in bf16 or fp32 instead.
- AI Generated Portraits on macOS: @itscharliecrown expressed a desire to train an AI with personal images to generate portraits using the Stable Diffusion Web UI-UX. In response, @doctorpangloss noted the feasibility of training on macOS but warned about the significantly reduced speed compared to platforms that support CUDA, like Windows or Linux.
- Hugging Face System Outages: Users @theyruinedelise and @jo_pmt_79880 reported Hugging Face platform outages, experiencing 504 errors and website loading issues, humorously suggesting "hungry hamsters… nibbling on the wires" as a possible cause for the downtime.
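The Phi-2 fix above amounts to swapping one `from_pretrained` argument for another. A hedged sketch making the two configurations explicit (the `microsoft/phi-2` model id is from the thread; whether `trust_remote_code` is still needed, and whether `device_map="auto"` resolves the issue, depends on your transformers version):

```python
def phi2_load_kwargs(use_device_map: bool = True) -> dict:
    """Kwargs for AutoModelForCausalLM.from_pretrained("microsoft/phi-2").

    The failing setup passed torch_dtype="auto" for FP16 inference and
    produced only "!" tokens after the weight update; the reported
    workaround was to let Accelerate place the weights with
    device_map="auto" instead.
    """
    common = {"trust_remote_code": True}  # commonly required for phi-2 then
    if use_device_map:
        return {**common, "device_map": "auto"}
    return {**common, "torch_dtype": "auto"}

# working = AutoModelForCausalLM.from_pretrained("microsoft/phi-2",
#                                                **phi2_load_kwargs())
```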
Links mentioned:
- microsoft/phi-2 · New tokens generated with FP16 inference are only exclamation marks "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!": no description found
- GitHub - whitead/paper-qa: LLM Chain for answering questions from documents with citations: LLM Chain for answering questions from documents with citations - GitHub - whitead/paper-qa: LLM Chain for answering questions from documents with citations
▷ #today-im-learning (4 messages):
- Early Bedtime for a Knowledge Seeker: User @mastermindfill expressed appreciation and mentioned plans to save the provided links for future use before heading off to bed. No specific links or topics were discussed in these last messages.
▷ #cool-finds (4 messages):
- SDXL V2 Models Released: @_vargol stated that h94 has released version 2 models for SDXL, which show improvements but still require a bias towards photorealism.
- ZavyChromaXL Recommendations: @meatfucker mentioned having great results with the zavychromaxl models on the previous SDXL version, although they haven't tried the new one yet.
- Flexibility of Zavy Models: Continuing the discussion, @meatfucker noted their success in achieving both realistic and cartoony outputs using the zavychromaxl model.
▷ #i-made-this (7 messages):
- FABBLER.AI Seeks Beta Testers: @piotr_fabbler.ai is calling for beta testers to try out a new AI tool designed for creating narrative stories that can be exported as videos. Interested users can contact Piotr for a unique storytelling experience and provide feedback, with a brief showcase video available here.
- Proteus-V0.1 Launched on Hugging Face Spaces: @ehristoforu shared a link to the new Hugging Face Space Proteus-V0.1 that runs on zerogpu. @osanseviero commented, showing interest and inquiring about the zerogpu experience.
- Curiosity About Model Improvement: User @merve3234 inquired whether there has been an improvement in @ehristoforu's model compared to the previous version that uses 1.5, indicating interest in the model's development progress.
- Suggestion for Displaying Upscaled Images: @lunarflu complimented @ehristoforu on the simple and effective nature of their model and suggested an enhancement to display both original and upscaled images side by side for a better comparison.
- AI Playgrounds and Model Experiments on GitHub: @vishyouluck shared their GitHub repository vishalmysore/AI, which serves as a playground for AI examples using different models. They invited others to explore and share their thoughts on the repository's content.
Links mentioned:
- Proteus V0.1 - a Hugging Face Space by ehristoforu: no description found
- FABBLER.AI Feature Showcase: FABBLER.AI Feature Showcase
- GitHub - vishalmysore/AI: Explore the forefront of AI innovation with this dedicated repository, housing cutting-edge examples and implementations. Dive into the latest advancements, stay ahead with groundbreaking applications, and harness the power of state-of-the-art models and techniques. Elevate your understanding of artificial intelligence through hands-on work: Explore the forefront of AI innovation with this dedicated repository, housing cutting-edge examples and implementations. Dive into the latest advancements, stay ahead with groundbreaking applicati…
▷ #diffusion-discussions (3 messages):
- Seeking Help for Loading Animated Models: User @latentspace inquired about the possibility of loading animated models from a single `.ckpt` or `.safetensors` file for new Stable Diffusion versions. In response, @sayakpaul suggested opening a discussion on GitHub and promised to involve relevant experts in the query.
- Exploring GPU Hosting Options for Large Models: @johntdavies is seeking advice and a comprehensive list of GPU hosting providers that support hosting 13B to potentially 70B models, with use-cases spanning image-to-text in messaging and email triage, and a preference for EU-based services. He is currently gathering data to create a proposal.
▷ #computer-vision (27 messages🔥):
- Syntax Slip-Up: User @swetha98 encountered an error when trying to train the donut docvqa model and shared the traceback log. @gugaime pointed out that there might be a typo with an unnecessary backslash (`\`) in the code string, and suggested adding a space.
- Object Tracking in CV: @curiousbro inquired about a good Python computer vision model for tracking objects and collecting data, but did not receive a response in the provided message history.
- Journey Through Notebook Troubles: @xeus69 had issues running a notebook and installing `accelerate`, a detail highlighted by @meatfucker, who suggested reviewing error messages and ensuring the correct version is installed. The issue was eventually resolved after @xeus69 cleared the notebook cache.
- First-time Dabble in Machine Learning: Newcomer @xeus69 mentioned being a beginner and getting assistance from @meatfucker on initial forays into machine learning using Colab. Discussion indicated @xeus69 is working on something involving Indian food, deduced by @meatfucker from an output directory.
- Captioning Models Discussion: @merve3234 questioned @xeus69's choice of models for captioning over a more grounded model like KOSMOS-2, hinting at a need for accuracy in captions, relevant for document understanding tasks. There was no recorded response from @xeus69 on this query.
▷ #NLP (2 messages):
- Cache Configuration for Transformers in Docker: @asprtnl_50418 provided a snippet on how to change the cache directory for Transformers within a Dockerfile by setting the `TRANSFORMERS_CACHE` environment variable. They also included instructions on how to mount a volume to link the local cache to the container cache when starting a Docker container.
▷ #diffusion-discussions (3 messages):
- Inquiry about loading animated .ckpt models: @latentspace asked if it's possible to load animated models from a single `.ckpt` or `.safetensors` file, mentioning versions for SD v1.5 and SDXL for use with an AnimateDiff pipeline, but did not provide further details on their setup or context.
- GitHub Discussion Suggestion: @sayakpaul responded to @latentspace, suggesting opening a discussion on GitHub and providing some links so that they could tag relevant contributors to assist with the question.
- Looking for GPU Hosting Options: @johntdavies sought recommendations for a discussion group or thread regarding GPU hosting services, particularly in the EU, for running 13B and possibly 70B models, with needs varying from low latency for image-to-text in messaging to on-demand use for email triage and replies. They are also seeking a list of companies to prepare for a proposal.
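On the single-file question above, diffusers distinguishes between a `from_pretrained` directory layout and a single-file checkpoint, which goes through a `from_single_file` loader. A hedged dispatch sketch (whether `from_single_file` supports a given AnimateDiff pipeline version is the assumption to verify on the diffusers side):

```python
from pathlib import Path

def pick_loader(path: str) -> str:
    """Decide how a Stable Diffusion checkpoint should be loaded.

    Single-file .ckpt / .safetensors checkpoints go through the pipeline's
    `from_single_file(path)` constructor, while a repo id or directory uses
    the standard `from_pretrained` layout.
    """
    if Path(path).suffix in {".ckpt", ".safetensors"}:
        return "from_single_file"
    return "from_pretrained"
```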
LAION Discord Summary
- DougDoug's AI Comedy Unleashed: Discussion indicated that YouTuber DougDoug has created an AI character with NSFW elements using ChatGPT, alongside ElevenLabs for voice generation. The AI-centric comedy approach is facilitated by his open-sourced project on GitHub.
- AI Parody Law Panic: A controversial "No AI FRAUD" Act, seen as potentially unconstitutional, spurred discussion about its significant impact on parody and comedic AI content. An informative breakdown of the implications was provided in a Reason.com article.
- Language Rules Linguistic Rumble: The prescriptive versus descriptive roles of language in the construction of dictionaries were debated, concluding that dictionaries are considered historical recordings of language use rather than rule-enforcing entities.
- Upscaling Video with an AI Eye: Technical discussions arose on the need for temporally-aware models for video upscaling, touching on issues like inconsistent frame details, and referenced OpenModelDB as a resource.
- WhisperSpeech Flip for TTS: The inversion of OpenAI's Whisper model to create the open-source text-to-speech system WhisperSpeech was highlighted with a related GitHub repository. Additionally, a discussion on the evaluation of multilingual LLMs and a search for a paper on continuous token embedding methods indicate ongoing research queries and advancements.
- Unlocking Visual Vim through SSMs: A new arXiv paper introduces Vim, a vision backbone using bidirectional Mamba blocks, found here, while performance analysis of LLMs is explored in another detailed study available here.
LAION Channel Summaries
▷ #general (78 messages🔥🔥):
- DougDoug's AI Robot Stream - How It's Done: @ignizherz and others discussed how YouTuber DougDoug managed to create an AI character with NSFW elements, which he claims uses ChatGPT. It was mentioned that DougDoug uses the OpenAI API along with ElevenLabs for the voice, and he has open-sourced a similar project on GitHub.
- Trouble Brewing for Parody and AI: Several users, such as @thejonasbrothers, @chad_in_the_house, and @.undeleted, shared concerns and criticisms over the potentially unconstitutional "No AI FRAUD" Act that could seriously restrict parodies and comedic content based on First Amendment rights. A Reason.com article was shared discussing the risks associated with the proposed regulation.
- Language Rules Debated: A lengthy debate unfolded regarding prescriptive versus descriptive language rules, featuring users like @mkaic, @clock.work_, and @atlasunified. The conversation tackled the fluidity of language and the role of dictionaries, culminating in the recognition that dictionaries are descriptive records, not prescriptive laws.
- AI Video Upscaling Discussed: @realz inquired about the appropriate tool for upscaling videos without causing inconsistent frame details, which led to a discussion with @pseudoterminalx about the need for temporally-aware upscaling models and the technical aspects of video transcoding. Links and information about available upscaling models were shared, including temporal considerations.
- Training WhisperSpeech for New Languages: @__._astro_.__ asked about the requirements for training WhisperSpeech on a new language, pointing out issues with current support and a higher WER (Word Error Rate) compared to English. No specific details or estimates were provided in the channel regarding the hours of audio needed for such training.
Links mentioned:
- Tweet from Soumith Chintala (@soumithchintala): Can finally talk some GPU numbers publicly. By the end of the year, Meta will have 600k H100-equivalent GPUs. Feel free to guess what's already deployed and being used!
- Is "Irregardless" a Real Word?: LOL, the look on your face right now.
- Nerd Nerd Emoji GIF - Nerd Nerd Emoji Submarine - Discover & Share GIFs: Click to view the GIF
- AI fraud act could outlaw parodies, political cartoons, and more: The bill is broad enough to target a Saturday Night Live skit lampooning Trump, a comedic impression of Taylor Swift, or a weird ChatGPT-generated image of Ayn Rand.
- Sassy Justice Sassy Trump GIF - Sassy Justice Sassy Trump Reindeer Election - Discover & Share GIFs: Click to view the GIF
- Peggle Speedrun, but an Ai Robot threatens me with trivia: I am the smartest youtuber, maybe ever. Streaming live on Twitch! https://www.twitch.tv/dougdoug Full stream recording: https://www.youtube.com/watch?v=E8-qFR_…
- Sassy Justice with Fred Sassy (Full Episode) | Deep Fake and Deep Fake: The Movie: Brought to you by Deep Fake and Deep Fake: The Movie, Fred Sassy is an American Consumer Advocate and reporter for the Cheyenne news at 9, a local TV station…
- OpenModelDB: OpenModelDB is a community driven database of AI Upscaling models. We aim to provide a better way to find and compare models than existing sources.
- GitHub - DougDougGithub/Babagaboosh: App that lets you have a verbal conversation with OpenAi's GPT 4: App that lets you have a verbal conversation with OpenAi's GPT 4 - GitHub - DougDougGithub/Babagaboosh: App that lets you have a verbal conversation with OpenAi's GPT 4
▷ #research (5 messages):
- Whisper's Inversion for Text-to-Speech: @helium__ shared a GitHub repository named WhisperSpeech, an open-source text-to-speech system built by inverting Whisper.
- New Paper on Vision Backbone with SSMs: @thejonasbrothers provided a link to an arXiv paper discussing a new vision backbone called Vim, which uses bidirectional Mamba blocks for image sequence representation and achieves high performance on various tasks.
- Authors of LLM Performance Analysis Paper Identified: In another message, @thejonasbrothers shared an arXiv paper co-authored by several individuals, showcasing their work related to large language models (LLMs).
- Inquiry about Continuous Token Embedding Paper: @JH asked for help locating a paper that studies continuous token embedding as opposed to discrete token embedding in LLMs.
- Evaluation Methods for Multilingual LLMs: @alyosha11 raised a question regarding what evaluation methods make sense for multilingual LLMs in the absence of existing datasets.
Links mentioned:
- Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model: Recently the state space models (SSMs) with efficient hardware-aware designs, i.e., Mamba, have shown great potential for long sequence modeling. Building efficient and generic vision backbones purelyā¦
- DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference: The deployment and scaling of large language models (LLMs) have become critical as they permeate various applications, demanding high-throughput and low-latency serving systems. Existing frameworks st…
- GitHub - collabora/WhisperSpeech: An Open Source text-to-speech system built by inverting Whisper.: An Open Source text-to-speech system built by inverting Whisper. - GitHub - collabora/WhisperSpeech: An Open Source text-to-speech system built by inverting Whisper.
Eleuther Discord Summary
- Hypernets, the New Efficiency Frontier?: Discussions ensued about the potential of using hypernets to memorize weight matrices in mixture-of-experts (MoE) models to possibly reduce parameters and boost efficiency, though no conclusive results were shared.
- Amazon Fuels LLM Research with Resources: Amazon's call for proposals through the Amazon Research Awards was shared, offering grants and AWS credits to support LLM projects, without being an outright promotion.
- Evaluating LLMs Across Languages: Conversations highlighted tokenization issues in non-English languages and the lack of datasets for evaluating multilingual LLMs, with BLEU dominating as a metric. There was also mention of the Self-Rewarding Language Models paper and the self-rewarding approach, which pushes language models beyond current systems.
- HELM and Evaluation Harness Differences Explained: The distinction between HELM and the evaluation harness was clarified: the evaluation harness deals with the orchestration problem of running eval tasks, while HELM outlines a recommended methodology for evaluations. Furthermore, advice was sought on how to organize translated evaluation tasks within the eval-harness framework, which could be placed under a `tasks/translations/` directory.
- Pull Request Alerts for GPT-NeoX Devs: In the gpt-neox-dev channel, a pull request was highlighted fixing defaults in a Docker container and a unit test for the evaluate function. There are plans to update `apex` for better Python and PyTorch compatibility, although its build time would require optimization.
- Robotic Progress and Public Participation Encouraged: Updates on Robot Kyle 2a0a's training, now at 140 million steps, were shared, and the community was invited to participate by accessing the source code to train their own versions. Participants can view live training sessions of Kyle on YouTube.
Eleuther Channel Summaries
▷ #general (28 messages🔥):
- Exploring Hypernet Efficiency: Users @Hawk and @stellaathena engaged in a brief discussion about using hypernets for memorizing weight matrices in mixture-of-experts scenarios to potentially lower parameters and improve efficiency.
- Amazon Push for LLM Research: @desik_agi, from Amazon, shared a call for proposals through the Amazon Research Awards, offering grants and AWS promotional credits for LLM projects, and clarified it is not a promotion but an opportunity for those seeking compute resources.
- Triton Custom Backend Inquiry: @gabriel_syme is inquiring whether anyone has experience with setting up a custom backend server for Triton.
- LM Evaluation Harness Queries: @hamelh is seeking assistance on using the eval harness to determine which tasks require logprobs and provides a GitHub search link to aid in this understanding.
- Discussion on Multilingual LLM Evaluation: @alyosha11 and @catboy_slim_ are contemplating evaluation metrics and datasets for testing multilingual capabilities in LLMs, with BLEU identified as a standard but datasets being predominantly in English.
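BLEU, cited above as the standard multilingual metric, is just clipped n-gram precision combined with a brevity penalty. A toy single-reference sketch of the idea (for real evaluations use an implementation such as sacrebleu, which also standardizes tokenization):

```python
import math
from collections import Counter

def ngrams(tokens: list[str], n: int) -> Counter:
    """Count the n-grams of a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate: str, reference: str, max_n: int = 2) -> float:
    """Toy sentence-level BLEU against a single reference.

    Geometric mean of clipped n-gram precisions up to max_n, scaled by the
    brevity penalty exp(1 - ref_len/cand_len) when the candidate is short.
    """
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        overlap = sum((cand_counts & ref_counts).values())  # clipped matches
        precisions.append(overlap / max(sum(cand_counts.values()), 1))
    if min(precisions) == 0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(log_mean)
```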
Links mentioned:
Build software better, together: GitHub is where people build software. More than 100 million people use GitHub to discover, fork, and contribute to over 420 million projects.
▷ #research (26 messages🔥):
- Tokenization Troubles in Non-English Languages: @xylthixlm highlighted challenges with tokenization for non-English languages, with a particular focus on Chinese, Japanese, and Korean (CJK), which @stellaathena confirmed.
- Mystery of Time-Aware LLMs: @bluerune referenced an unidentified paper or study suggesting that LLMs might generate shorter token outputs when they "think" it is December versus May, based on a tweet showing statistically significant results.
- AlphaGeometry: AI Surpasses Human Mathematicians: @the_alt_man shared a DeepMind blog post about AlphaGeometry, an AI system that successfully solves difficult geometry problems at the level of a human Olympiad gold-medalist.
- Self-Rewarding Language Models: @pizza_joe introduced a paper on Self-Rewarding Language Models, outlining an approach where a language model uses LLM-as-a-Judge prompting to reward itself, resulting in performance surpassing many existing systems. The statement prompted a discussion, led by @xylthixlm and others, on the potential of LLMs having sufficient information to achieve higher performance with the right tuning algorithm.
- The Paradox of Instruction Tuning: @catboy_slim_ and @fern.bear debated the concept of information retention in LLMs during tuning, focusing on whether it is truly a loss of information or a failure to specifically direct the model's output. @catboy_slim_ mentioned LoRA weights as a technique that might mitigate information loss during fine-tuning.
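The LLM-as-a-Judge step discussed above boils down to prompting the model to score its own candidate responses and parsing the score back out as a preference signal. A hedged sketch (the prompt wording and 5-point scale only loosely follow the paper's additive-scoring idea, and the parsing regex is an assumption):

```python
import re
from typing import Optional

JUDGE_TEMPLATE = """Review the user's question and the response below.
Award up to 5 points in total for relevance, coverage, helpfulness,
clarity, and expert quality. Conclude with the line: Score: <total>

Question: {question}
Response: {response}"""

def build_judge_prompt(question: str, response: str) -> str:
    """Format a self-judging prompt the model scores its own output with."""
    return JUDGE_TEMPLATE.format(question=question, response=response)

def parse_score(judgment: str) -> Optional[int]:
    """Extract the final 'Score: N' line; in the paper's loop, scores like
    this rank candidate responses to build DPO-style preference pairs for
    the next training round."""
    m = re.search(r"Score:\s*(\d+)", judgment)
    return int(m.group(1)) if m else None
```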
Links mentioned:
- Self-Rewarding Language Models: We posit that to achieve superhuman agents, future models require superhuman feedback in order to provide an adequate training signal. Current approaches commonly train reward models from human prefer…
- AlphaGeometry: An Olympiad-level AI system for geometry: Our AI system surpasses the state-of-the-art approach for geometry problems, advancing AI reasoning in mathematics
- Tweet from Rob Lynch (@RobLynch99): @ChatGPTapp @OpenAI @tszzl @emollick @voooooogel Wild result. gpt-4-turbo over the API produces (statistically significant) shorter completions when it "thinks" it's December vs. when it thinks…
▷ #interpretability-general (1 messages):
jsai_51448: What is mech interp vs. concept interp vs. dev interp?
▷ #lm-thunderdome (9 messages🔥):
- In Search of Clarification on Harness vs. HELM: @aloo_kachalu sparked a conversation comparing the evaluation harness and HELM (Holistic Evaluation of Language Models), leading to a discussion of their functionalities and the philosophy behind them.
- HELM Confusion Unraveled: @stellaathena clarified that the evaluation harness focuses on the orchestration problem of running eval tasks on various models, whereas HELM promotes a recommended methodology for carrying out evaluations.
- Evaluating Models in Greek: @zoulr shared their journey of evaluating models using Greek tasks translated from English ones like ARC, and sought advice on the preferred directory format for language-specific tasks within the eval-harness repository.
- Organizing Translated Evaluation Tasks: @hailey_schoelkopf recommended that translated tasks could be organized under a specific directory for translations in the eval-harness tasks section, with proposals such as `tasks/translations/` or `arc_multilingual/`.
- Specific GitHub Pull Request Shared: @hailey_schoelkopf posted a link to a GitHub pull request pinning the `datasets` dependency at 2.15 in the eval-harness repository: Pin `datasets` dependency at 2.15.
Links mentioned:
- Stanford Center for Research on Foundation Models: Stanford Center for Research on Foundation Models has 17 repositories available. Follow their code on GitHub.
- GitHub - stanford-crfm/helm: Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in Holistic Evaluation of Text-to-Image Models (HEIM) (https://arxiv.org/abs/2311.04287).: Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image …
- Pin `datasets` dependency at 2.15 by haileyschoelkopf · Pull Request #1312 · EleutherAI/lm-evaluation-harness: It seems as though many users are receiving errors when upgrading to datasets versions 2.16 and above, and also because datasets on the HF hub are being replaced with Parquets in the background. We…
▷ #multimodal-general (1 messages):
- Robot Kyle Takes a Stroll: technosourceressextraordinaire shared an update on Robot Kyle 2a0a undergoing cooldown training runs on flat ground, which might improve its motion on slopes. They mentioned the run is 20 million steps, culminating in 140 million steps, and invited others to access the source code and train their own version of Kyle at NekoCatGame/RagdollTrainer.
- Training Spectators Welcome: A live training session for Robot Kyle 2a0a is available, showcasing how to train robotic walkers using Unity Machine Learning Agents, which can be viewed on YouTube at Live AI Robot Training.
Links mentioned:
- NekoCatGame/RagdollTrainer at main Ā· cat-game-research/NekoCatGame: A game about catifu. Contribute to cat-game-research/NekoCatGame development by creating an account on GitHub.
- Unity 2024 ML-Agents | Live AI Robot Training | Kyle 2a0a | PyTorch | Part 11: In this video, I will show you how to train a robotic walker to cooperate with other walkers in a hostile environment using Unity Machine Learning Agents Too…
▷ #gpt-neox-dev (2 messages):
- A Commitment to Fix: @catboy_slim_ acknowledged a needed fix and expressed an intent to address it soon, without specifying the issue.
- Minor Changes and Fixes in Pull Request: @catboy_slim_ highlighted a pull request that includes minor changes, such as the default output for the Docker container, as well as a fix for a unit test for the evaluate function.
- Attempts to Optimize Apex: @catboy_slim_ is looking to update the `apex` version to ensure compatibility with newer versions of Python and PyTorch. However, there is a challenge with the extended build time for `apex`, which @catboy_slim_ plans to address by stripping it down in a fork.
Links mentioned:
Minor changes by segyges · Pull Request #1125 · EleutherAI/gpt-neox: Changes default output for docker container Renames docker pythia config to indicate it is docker pythia config Fix unit test for evaluate function
LlamaIndex Discord Summary
- RAG Gets a Turbocharge in LlamaIndex Hackathon: LlamaIndex ignites competition around Retriever-Augmented Generation with an $8,000 prize pool for their RAG-A-THON Hackathon, urging participants to register for the event, hosted at DataStax HQ in Santa Clara, CA from February 2nd to 4th.
- New Course Alert! Vote for LlamaIndex Learning: LlamaIndex intends to craft an online course and is conducting a poll to identify the community's topic of interest. Community members can voice their preferences in the Twitter poll.
- Unlocking RAG's Full Potential with Advanced Queries: LlamaIndex proposes enhancing Retriever-Augmented Generation (RAG) with a query understanding layer; suggested techniques include HyDE and iterative reasoning. Details on improving RAG can be explored further in their Twitter thread.
- Community Engineers Tackle LlamaIndex's Technical Challenges: From effective methodologies for handling large PDFs and the intricacies of using metadata to fetch nodes, to technical advice on Azure Key/LlamaIndex integrations and strategies for summarizing lengthy documents, the engineers shared guidance on various topics. Notable contributions include `@whitefang_jr`'s metadata query approach and `@cheesyfishes`'s advice on integrating Azure keys with LlamaIndex, detailed in their documentation.
- Navigating Large Documents with AI Models: Seeking to work efficiently with extensive documents to create tables of contents and summaries while preserving privacy, `@takuma.fusioncloud.ai` asked the community for assistance. `@greyman_007` recommended exploring the Zephyr model, although specific resources were not provided.
LlamaIndex Discord Channel Summaries
▷ #blog (4 messages):
- Exploring Composable Retrieval: LlamaIndex discusses the concept of a composable hierarchy in advanced retrieval systems. The tweet explains linking smaller texts to bigger ones as part of the retrieval process.
- Interest Gauge in LlamaIndex Course: LlamaIndex is considering creating an online course and is polling for the most important topic users want to learn about. Participate in the poll or specify further in the replies.
- $8,000 RAG-A-THON Hackathon Announcement: LlamaIndex announces doubling the prize to $8,000 for their first in-person hackathon focused on Retriever-Augmented Generation technology. Register for the event and note that at least one team member must be present at DataStax HQ in Santa Clara, CA from February 2nd to 4th.
- Enhancing RAG with Advanced Query Transformations: LlamaIndex suggests improving Retriever-Augmented Generation (RAG) by incorporating a query understanding layer, mentioning techniques such as HyDE, sub-question decomposition, iterative reasoning, or routing. Learn more about improving RAG.
Links mentioned:
LlamaIndex RAG Hackathon (in-person only): Think Beyond Chatbots: Unleashing the Potential of AI Agents
▷ #general (34 messages🔥):
- PDF Page Source Tracking: `@whitefang_jr` advised `@alvarojauna` to locate page-number information in the metadata by printing `response.source_nodes`, to handle a large-PDF inquiry.
- Fetching Nodes by Metadata Query: `@whitefang_jr` responded to `@vozervn` by suggesting the use of the `docstore`. Later exchanges imply difficulty in retrieving specific nodes by metadata in PGVector, but `@whitefang_jr` eventually linked to a relevant section of the LlamaIndex GitHub repo for further guidance.
- Assistance with Advanced QA Tools Over LlamaIndex: `@risk_seeking` inquired about third-party tools for QA over the LlamaIndex documentation and sought recommendations from the community.
- Azure Key Integration with LlamaIndex: `@cheesyfishes` helped `@zubeen_` resolve an issue integrating Azure-provided OpenAI keys with LlamaIndex by referencing the documentation and suggesting the use of `AzureOpenAI` with a potentially custom httpx client for header management.
- Challenges with Summarizing Lengthy Documents: `@ben25635` sought guidance on summarizing a comprehensive 500-page report, to which `@nerdai` recommended a hierarchical approach: section-wise summarization before crafting a top-level summary.
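The hierarchical approach `@nerdai` recommended can be sketched in a few lines of plain Python. The `summarize` stub below is a hypothetical stand-in for any LLM summarization call, not a LlamaIndex API:

```python
def summarize(text: str, max_words: int = 25) -> str:
    """Hypothetical stand-in for an LLM summarization call:
    here it just keeps the first `max_words` words."""
    return " ".join(text.split()[:max_words])

def hierarchical_summary(sections: list[str]) -> str:
    """Summarize each section first, then summarize the
    concatenated section summaries into a top-level summary."""
    section_summaries = [summarize(s) for s in sections]
    return summarize("\n".join(section_summaries), max_words=100)

# Two mock "chapters" standing in for a 500-page report.
report = ["Chapter one discusses revenue growth. " * 10,
          "Chapter two covers operational risks. " * 10]
summary = hierarchical_summary(report)
```

The same shape applies with a real LLM: the per-section pass keeps each call within the context window, and the final pass sees only the compressed intermediate summaries.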
Links mentioned:
- Azure OpenAI - LlamaIndex 🦙 0.9.33: no description found
- LLM Prompt FORMATS make or break you LLM (RAG): LLM Prompt formatting essentially concerns the way in which input data or questions are structured and presented to LLMs or VLMs. The sensitivity of LLMs to …
- llama_index/llama_index/vector_stores/postgres.py at fcfab6486bc6a0eec31a983dd3056ef9cbe8ceb2 · run-llama/llama_index: LlamaIndex (formerly GPT Index) is a data framework for your LLM applications - run-llama/llama_index
▷ #ai-discussion (3 messages):
- Seeking Assistance with ChatGPT for Large Documents: `@takuma.fusioncloud.ai` is looking for help on how to utilize ChatGPT for working with large documents to create tables of contents and summaries, as well as maintaining privacy for a collection of 10-12 books.
- Zephyr Model Suggested for Large Document Handling: `@greyman_007` suggests using the Zephyr model with LlamaIndex on Google Colab to handle the task mentioned by `@takuma.fusioncloud.ai`, but no further details or links were provided.
Latent Space Discord Summary
- AlphaCodium Debuts on GitHub: AlphaCodium, an open-source code generation tool inspired by DeepMind's AlphaCode, has been announced and released on GitHub, with details on its flow engineering in a dedicated paper.
- Karpathy's Acknowledgment and YouTube Insights: Andrej Karpathy has reviewed and acknowledged AlphaCodium's capabilities, and further insights can be gleaned from an AI Explained YouTube video.
- Query on AlphaCodium's IDE Plugin: Discussions include a question regarding the open-source status of the AlphaCodium IDE Plugin, with the related PR-Agent noted to be Apache 2.0 licensed.
- Meta's Extensive GPU Deployment Plans: Meta has publicized its aim to deploy the equivalent of 600,000 H100 GPUs by the end of the current year; the conversation covered GPU availability and prompted a reminder to loop in a key participant via a Tweet link.
- Gradient Dissent Podcast Recommended for LLM Insight: For those interested in LLM training and deployment, `@swyxio` recommends a podcast episode of Gradient Dissent featuring Stella Biderman of EleutherAI.
Latent Space Channel Summaries
▷ #ai-general-chat (27 messages🔥):
- AlphaCodium Launches: `@itamar_mar` announced the official launch of AlphaCodium, an open-source code generation tool that competes in code contests, inspired by DeepMind's AlphaCode, and invited questions from users. The project has been published on GitHub.
- Paper Discussion and Inquiry: `@slono` engaged in a discussion on the paper related to AlphaCodium, probing about the extent of prompt engineering and the effort placed on refining agent steps; `@itamar_mar` responded that 85% of the effort went into flow design.
- Tech Community Spotlight: `@itamar_mar` shared the exciting news that Andrej Karpathy has reviewed their work on AlphaCodium, and `@swyxio` congratulated them, sharing a link to Karpathy's Twitter and a relevant AI Explained YouTube video.
- Tools From the Codebase: `@lightningralf` inquired about the open-source status of the AlphaCodium IDE Plugin, noting the PR-Agent is Apache 2.0 licensed.
- Meta's GPU Arsenal Revealed: `@guardiang` shared a tweet from `@soumithchintala` disclosing Meta's aim to deploy the equivalent of 600,000 H100 GPUs by year's end, prompting a discussion on GPU availability; `@swyxio` highlighted someone (`@194927177265840128`) to loop into the conversation.
Links mentioned:
- Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering: Code generation problems differ from common natural language problems - they require matching the exact syntax of the target language, identifying happy paths and edge cases, paying attention to numer…
- Tweet from Andrej Karpathy (@karpathy): Prompt engineering (or rather "Flow engineering") intensifies for code generation. Great reading and a reminder of how much alpha there is (pass@5 19% to 44%) in moving from a naive prompt:ans…
- Tweet from Soumith Chintala (@soumithchintala): Can finally talk some GPU numbers publicly. By the end of the year, Meta will have 600k H100-equivalent GPUs. Feel free to guess what's already deployed and being used!
- Alpha Everywhere: AlphaGeometry, AlphaCodium and the Future of LLMs: Is AlphaGeometry a key step toward AGI? Even Deepmind's leaders can't seem to make their minds up. In this video, I'll give you the rundown of what AlphaGeom…
- GitHub - Codium-ai/AlphaCodium: Contribute to Codium-ai/AlphaCodium development by creating an account on GitHub.
- Tweet from GitHub - FixTweet/FxTwitter: Fix broken Twitter/X embeds! Use multiple images, videos, polls, translations and more on Discord, Telegram and others.
- Tweet from Itamar Friedman (@itamar_mar): 🚀 Introducing AlphaCodium - A first-of-its-kind open-source code generation tool that surpasses most human competitors in code contests. Inspired by DeepMind's AlphaCode ❤️‍🔥, but beats it (j…
▷ #llm-paper-club (1 message):
- Prepping for Pythia Paper Discussion: `@swyxio` flagged an old but informative podcast episode from Gradient Dissent for an upcoming discussion on the Pythia paper, featuring an interview with Stella Biderman of EleutherAI circa 2022. Check it out for insights into LLM training and deployment: Gradient Dissent Podcast.
Links mentioned:
How EleutherAI Trains and Releases LLMs: Interview with Stella Biderman - Gradient Dissent: Exploring Machine Learning, AI, Deep Learning, Computer Vision - Overcast: no description found
LangChain AI Discord Summary
- LangChain Updates Derailed by Outdated Docs: `@daslav` flagged that the LangChain documentation is outdated, with issues surrounding the `from langchain import hub` code. This, along with `@Sovok` facing an unresolved error with their RAG system and `@Behlal` encountering issues with the quickstart tutorial retrieval chain on an NVIDIA 4090 GPU, suggests a need for documentation review and better error diagnostics.
- Nesting Knowledge for LangServe: `@veryboldbagel` discussed advanced usage of nested information within LangServe, advocating for `TypedDict` and `pydantic` for precise serialization, as seen in the `server.py` example. This advice aligns with their call to adopt the recently merged `astream_event` for streaming support in UIs, opening possibilities for enhanced interactive systems.
- API and Frontend Synchronization in the Spotlight: LangServe users, per insights from `@veryboldbagel`, should be mindful that the `openai_assistant` API requires more complex input than a simple prompt, and should look into leveraging the Server-Sent Events (SSE) web standard for streaming data on the frontend, with reference to Mozilla's SSE guide.
- Need for LCEL in MapReduce and Search for SQL Interface: Participants highlighted the absence of an LCEL version of MapReduce, raised by `@pramodhgopalan_80290`, signaling an impending upgrade in chain-language flexibility. Concurrently, `@meq__` was recommended a tool named vanna as an open-source natural-language-to-SQL query interface, a potential solution for intuitive data querying.
- Innovations and Queries in AI Design and Productivity: AI is shaping the design world with neThing.xyz, where Langsmith powers the intersection of CAD and generative AI, as `@rawwerks` seeks feedback. `@_anubix` initiated a conversation about tools that boost productivity, while `@esxr_` praised Ollama and LangChain for revolutionizing their workflows, inviting peers to their AI-centric blog (esxr.io).
LangChain AI Channel Summaries
▷ #general (14 messages🔥):
- No LCEL for MapReduce Yet: `@pramodhgopalan_80290` inquired about the lack of LCEL (LangChain Expression Language) versions for MapReduce and Stuff summarization, pointing to the documentation which only lists legacy chains. They found information about work in progress to create LCEL versions of all chains for easier modification and native support for streams.
- Image Retrieval from DuckDuckGo: `@solononforever3` asked whether it is possible to retrieve images using the DuckDuckGo tool, but did not receive a direct response.
- Embedding Markdown Data for Chatbots: `@xery.` is planning to embed over 400 markdown files for a YouTube-based repair-guide chatbot and is unsure about the optimal chunk size for embedding each markdown file separately.
- Searching for an Open-Source SQL Query Language Interface: `@meq__` sought an open-source natural-language-to-SQL query interface, recalling seeing one mentioned in the channel previously. `@roi_fosca` suggested the name vanna in relation to the query.
- Out-of-Date Documentation for LangChain: `@daslav` reported that the LangChain documentation appears outdated, citing code involving `from langchain import hub` that no longer exists.
- Repeating Answers Puzzle: `@seththunder` speculated that the repeated answers in a previous user's query might be due to streaming the response, though this was in the context of embedding data with a markdown text splitter.
- Looking for LangSmith Hosting and Enterprise Plans: `@muthu1823` requested contact information or advice on hosting their own LangSmith environment and asked about the availability of an enterprise version or pricing.
- RAG System Error Troubles: `@Sovok` experienced an unspecified error with their RAG (Retrieval-Augmented Generation) system and shared frustration about not understanding the cause, referencing an inability to open a header file.
- Quickstart Tutorial Retrieval Chain Issue: `@Behlal` reported an error while attempting to run the retrieval chain in the quickstart tutorial using Ollama and Llama2 on a system equipped with an NVIDIA 4090 GPU and Ubuntu.
Links mentioned:
Chains | 🦜️🔗 Langchain: Chains refer to sequences of calls - whether to an LLM, a tool, or a
▷ #langserve (7 messages):
- Nesting with `TypedDict` and `pydantic`: `@veryboldbagel` provided an example of nested-information usage in LangServe with `server.py`. They suggest using `TypedDict` for more precision and `pydantic` for object serialization, with guidance to inherit from Custom User Types.
- Detailed API Implementation Referenced: `@veryboldbagel` highlighted the `openai_assistant` API's requirement for additional information beyond a simple prompt, sharing links to specific implementation examples at base.py L98 and base.py L79.
- RemoteRunnable Client for Svelte Custom UIs: `@veryboldbagel` discussed the use of Langchain-js's remote runnable client, providing a link to the API, which facilitates the creation of custom UIs with Svelte.
- Configurable Runnables and Models: In a message by `@veryboldbagel`, the use of configurable runnables is explained as part of the LangChain Expression Language, with a prompt to discuss further in langserve for community benefit and better discoverability of solutions.
- Handling Streaming Data in the Frontend: `@veryboldbagel` responded to `@hiranga.g`'s query about streaming data to the frontend, suggesting starting with Server-Sent Events (SSE) as a web standard and looking into sample applications using SSE before diving into RemoteRunnable. They shared a Mozilla resource for reference.
- Streaming Support in Langchain-Core: `@veryboldbagel` pointed to a recently merged RFC on Langchain-core that introduces `astream_event` for better streaming support in UIs, promising to try adding it to langserve within a week. They provided a discussion link for further details.
Links mentioned:
- Streaming: RFC Adding astream_event to all Runnable objects to help with streaming use cases · langchain-ai/langchain · Discussion #16175: Hi everyone! We want to improve the streaming experience in LangChain. We're considering adding an astream_event method to the Runnable interface. The code below is from the following PR and has no…
- langchain/libs/langchain/langchain/agents/openai_assistant/base.py at ca014d5b04b1d73fd8f0fe224def98a82600c991 · langchain-ai/langchain: ⚡ Building applications with LLMs through composability ⚡ - langchain-ai/langchain
- langchain/libs/langchain/langchain/agents/openai_assistant/base.py at ca014d5b04b1d73fd8f0fe224def98a82600c991 · langchain-ai/langchain: ⚡ Building applications with LLMs through composability ⚡ - langchain-ai/langchain
- langserve/examples/passthrough_dict/server.py at main · langchain-ai/langserve: LangServe 🦜️🔗. Contribute to langchain-ai/langserve development by creating an account on GitHub.
- GitHub - langchain-ai/langserve: LangServe 🦜️🔗: LangServe 🦜️🔗. Contribute to langchain-ai/langserve development by creating an account on GitHub.
▷ #share-your-work (4 messages):
- neThing.xyz Takes Shape with Langsmith: `@rawwerks` is utilizing Langsmith to facilitate tracing and evaluation of neThing.xyz, a text-to-3D generative AI aimed at CAD & engineering applications. They welcome any feedback on the project, which promises a new way to interact with AI in the field of design.
- Tools That Amplify Productivity: `@_anubix` queried the community about tools that substantially increase daily productivity.
- Ollama and LangChain Revolutionize Daily Workflows: `@esxr_` shared that Ollama and LangChain have dramatically changed how they work, allowing them to build custom solutions. They've also customized the Ollama WebUI for their use, which significantly benefits their productivity.
- AI Enthusiast Blogs About AI Explorations: `@esxr_` mentioned their blog esxr.io, where they journal their AI findings and experiences, indicating a particular interest in the broader domains of AI and its applications.
Links mentioned:
- neThing.xyz - AI Text to 3D Model: AI powered text-to-3D models
- Pranav Dhoolia: I am an AI enthusiast, keen on exploring the vast yet interesting domain of Artificial Intelligence. I use this blog as a collaborative notepad for my findings
DiscoResearch Discord Summary
- New German-Language Model on the Block: DiscoLM German 7b, trained on 65 billion tokens and equipped for English, German, and translation tasks, supports RAG and function calling. Questions about performance metrics for DiscoLM German 7b were asked, but no specific benchmarking data was provided.
- Benchmarking Emotions vs. Instructions: In the benchmark_dev discussion, a potential addition of a complex-reasoning section was considered, to measure both emotional intelligence and complex instruction following. Surprise was expressed at the high ranking of 7b models, and a discussion ensued about benchmarking criteria that focus strictly on emotional intelligence.
- Longer Code Snippets Wanted: An observation in embedding_dev noted performance drops in code-documentation retrieval beyond a 512-token limit, suggesting a trial with jina encodings and extended chunk sizes.
- Axolotl Primes for Polish: Upcoming sharing of training code and configurations for the Axolotl model was discussed in discolm_german, with a note on possible training data/code sharing and RAG-focused collaboration. User-reported glitches on a demo page received prompt attention, underscoring active support and operational intent.
DiscoResearch Channel Summaries
▷ #general (4 messages):
- Introducing DiscoLM German 7b: `@_jp1_` announced the release of DiscoLM German 7b, a model trained on 65b tokens and designed for English, German, and translation purposes. The model uniquely supports RAG applications and experimental function-calling abilities.
- Check Out the Live Demo: A live demo of DiscoLM German 7b was shared by `@_jp1_`, available at demo.discoresearch.org for hands-on experience.
- The Model Gets Cheeky!: `@devnull0` humorously commented that asking the model "Was geht?" ("What's up?") might break it, suggesting that the model has been given playful or complex inputs during testing.
- Performance Benchmark Inquiry: `@cryptossssun` inquired about benchmarking data for DiscoLM German 7b, seeking insights into its performance metrics.
Links mentioned:
- DiscoResearch/DiscoLM_German_7b_v1 · Hugging Face: no description found
- DiscoLM German 7b Demo: no description found
▷ #benchmark_dev (4 messages):
- Mixtral's Improved Performance in New Version: `@.calytrix` responds to `@_jp1_`, highlighting that Mixtral performs more competently in the latest version than in the first.
- Surprise at 7b Models Ranking High: `@_jp1_` expresses surprise at the high ranking of 7b models like Beagle compared to Mixtral instruct and requests an example of Beagle's superior performance.
- Clarification on Benchmarking Criteria: `@.calytrix` clarifies to `@_jp1_` that while individual question analysis might not be entirely indicative of total performance, the critique section can be insightful. The benchmarks are tailored to assess emotional intelligence strictly, not complex instruction following.
- Potential Enhancement of Benchmarking Methodology: `@.calytrix` mentions to `@_jp1_` the possibility of adding a complex-reasoning section to the test, creating a combined score that measures both emotional intelligence and complex instruction following.
▷ #embedding_dev (1 message):
- Code Documentation Retrieval Performance Drop: `@sebastian.bodza` observed that performance in code-documentation retrieval declines after aggressive truncation linked to the 512-token limit. An experiment with jina encodings and longer chunk sizes is to be tried next.
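The truncation effect described above is easy to reproduce: anything past the encoder's token limit is simply dropped. A common workaround before switching to longer-context encoders is overlapping chunking. The sketch below approximates tokens with whitespace-separated words, which is an assumption, not the behavior of any actual tokenizer:

```python
def chunk(text: str, size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into overlapping word-windows so content past the
    first `size` "tokens" lands in later chunks instead of being
    truncated away by the encoder."""
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already covers the tail
    return chunks

doc = "tok " * 1200  # a document of roughly 1200 "tokens"
pieces = chunk(doc)  # each piece fits the 512-token budget
```

Each chunk is then embedded separately, so retrieval can surface content from any part of the document rather than only its first 512 tokens.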
▷ #discolm_german (13 messages🔥):
- Intrigued by Open Source Initiative: `@philipmay` expressed gratitude for the open-sourced work and showed interest in several aspects of the project, posing multiple questions.
- Axolotl's Training and Code Reveal on Horizon: `@_jp1_` confirmed plans to share both the training code/configuration for Axolotl and a repository with advanced usage examples; however, they noted the need for more time to present it cleanly.
- Training Data Sharing Potential: In response to `@philipmay`, `@_jp1_` revealed that sharing the training data and code, especially concerning RAG, is possible, highlighting involvement from `<@1048301853806448680>` and mentioning ongoing improvements and potential collaboration.
- Tackling AI's Rejection Responses: Addressing `@maxidl`'s experience of Axolotl emitting a rejection response, `@_jp1_` acknowledged efforts to filter these out and encouraged reporting them to improve future iterations.
- Demo Page Glitches and Recoveries: `@devnull0` complimented the demo page and later reported a Cloudflare Origin DNS error, but `@_jp1_` swiftly indicated the issue had been resolved, signaling the page is operational again.
Skunkworks AI Discord Summary
- Demand for a Functional Dataset: Users `@interstellarninja` and `@yikesawjeez` discussed the need for a refined function-calling dataset that aligns with OpenAI's conventions, emphasizing the compatibility requirement with the OpenAI API for an open-source function caller.
- Probing LLM Inference Cost Dynamics: While `@helium0120` sought data on LLM inference cost trends, `@nisten` provided a cautionary note on the complexities of cost calculations, flagging potential subsidies by API services as confounding factors.
- Scrutiny of Lookahead Decoding Method: `@nisten` provided a critical assessment of the lookahead decoding method, recognizing its limitations but noting its efficacy in specific scenarios like code editing. Contributions included a link to a detailed blog post exploring the method's use for accelerating LLM inference.
- Off-Topic Exchange: User pradeep1148 shared a non-technical YouTube video link unrelated to the technical and engineering discussions of the guild.
Skunkworks AI Channel Summaries
▷ #general (9 messages🔥):
- Function Calling Dataset Challenges: `@interstellarninja` acknowledged the limitations of existing datasets and expressed the need for a diverse function-calling dataset that aligns with OpenAI's function signatures and calling schema. This would facilitate compatibility with the OpenAI API, making the open-source function caller easily swappable.
- Function Caller Search Continues: `@yikesawjeez` realized the limitations of the existing dataset and expressed an intention to look for a more suitable one that matches OpenAI's conventions.
- Seeking LLM Inference Cost Trends: `@helium0120` inquired about available data on trends or forecasts for the decrease in LLM inference costs over time.
- Skepticism over LLM Inference Cost Reduction: `@nisten` commented that inference cost calculations are challenging because API services may be subsidizing those costs, casting doubt on straightforward cost-reduction trends.
- Lookahead Decoding Method Evaluated: `@nisten` critically evaluated the lookahead decoding method, finding it not as effective as claimed except in scenarios such as code editing, where re-outputting the entire code with small edits is required. A link to a blog post (Lookahead Decoding: Accelerating LLM Inference) was provided, offering a deep dive into the method's approach to accelerating LLM inference.
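For reference, the OpenAI-style function signature such a dataset would need to match is a JSON-Schema description of the callable. A minimal sketch, where the `get_weather` function and its fields are hypothetical examples rather than anything from the discussion:

```python
import json

# An OpenAI-convention function signature: a name, a description,
# and a JSON-Schema "parameters" object describing the arguments.
get_weather_signature = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}

# A matching model turn emits the function name plus JSON-encoded
# arguments, which the caller parses and dispatches.
model_call = {"name": "get_weather",
              "arguments": json.dumps({"city": "Berlin"})}
```

A dataset in this shape (signature, user request, emitted call) is what makes an open-source function caller drop-in compatible with clients written against the OpenAI API.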
Links mentioned:
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding | LMSYS Org:
TL;DR: We introduce lookahead decoding, a new, exact, and parallel decoding algorithm to accelerate LLM inference. Look…
▷ #off-topic (1 message):
pradeep1148: https://www.youtube.com/watch?v=POgLwYxDGYk
LLM Perf Enthusiasts AI Discord Summary
- Swapping Models for Smooth Sailing: `@jeffreyw128` mentioned that they switch between different GPTs to avoid issues, while `@thebaghdaddy` is considering running processes without advanced analytics as a workaround.
- Instruct Model Preferred by User: `thisisnotawill` indicated using the instruct model from Anyscale, without additional context.
- Call for Data Synthesis Insights: `@ayenem` is looking for resources on productionizing a data-synthesis model, but the community response was absent.
- Mulling Over an MLOps Channel: `@ayenem` suggested the creation of a #mlops channel, which `@pantsforbirds` found potentially useful despite referring to MLOps as the "bane of my existence." `@jeffreyw128` questioned the necessity of a separate MLOps channel.
- Azure Filter Toggle Trouble: `@thisisnotawill` sought help on disabling content filters in Azure, noting the feature seems restricted to internal use; no subsequent discussion or resolution was indicated.
LLM Perf Enthusiasts AI Channel Summaries
▷ #gpt4 (3 messages):
- Swapping GPTs to Avoid Annoyances: `@jeffreyw128` mentioned that to circumvent certain issues, they opt to use different GPTs.
- Analytic Workaround Strategy: In response to `@jeffreyw128`, `@thebaghdaddy` considered the suggestion and decided to run their processes without advanced analytics as a potential solution.
▷ #opensource (1 message):
thisisnotawill: yeah im using the instruct model from anyscale
▷ #resources (1 message):
- In Search of Data Synthesis Wisdom: `@ayenem` is seeking experiences or resources such as blogs, books, or tools for productionizing a data-synthesis model. No responses were provided in the message history.
▷ #feedback-meta (3 messages):
- MLOps Channel Proposal: `@ayenem` inquired whether others would be interested in a #mlops channel, suggesting there might be community demand for such a space.
- MLOps: A Likely Read: `@pantsforbirds` humorously referred to MLOps as the "bane of my existence" but expressed interest in reading helpful posts if a #mlops channel were created.
- Debating the MLOps Channel's Necessity: In response, `@jeffreyw128` asked what types of discussions would be had in a #mlops channel that wouldn't already fit in the current ones.
▷ #openai (1 message):
- Azure Content Filter Confusion: `@thisisnotawill` inquired about disabling content filters in Azure, mentioning that the option seemed restricted to internal use only. No solutions or follow-up discussions were provided in the given history.
Alignment Lab AI Discord Summary
Only 1 channel had activity, so no need to summarize…
imonenext: Does anyone have a Gemini Pro key?
Datasette - LLM (@SimonW) Discord Summary
Only 1 channel had activity, so no need to summarize…
- PR for Offline GPT4All Model Usage: `@cameron_y` opened a pull request to enable offline usage of gpt4all models, addressing an issue where the library would attempt to download a model even if it already exists locally. This fix is detailed in PR #18 on GitHub.
Links mentioned:
fix: allow local models to work without internet connection by hydrosquall · Pull Request #18 · simonw/llm-gpt4all: Motivation Currently, the library tries to download the model even if it already exists locally, which prevents offline use. Fixes #10, applying a code hint and investigation from @rotterb Changes…
The YAIG (a16z Infra) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.