First teased in a paper two months ago, Nightshade was the talk of the town this weekend:
However, people digging into the details have questioned both how it works and how original it is:
Table of Contents
[TOC]
TheBloke Discord Summary
- MoE Efficiency and Detection Tools Talk: In discussions around Mixture of Experts (MoE) models, efficiency in GPU parallelism and quant methods were key topics, with users exploring variable routing and the trade-offs between expert counts. GPTZero's ability to detect certain types of AI-generated content was also analyzed, suggesting noise application as a potential evasion method.
- Challenges in Role-Playing AI: Debates emerged over Solar's effectiveness, with some users pointing out its poor alignment despite benchmark efficiency. Model performance in long-context roleplaying was discussed, with opinions split on the best models for the task and on emergent repetition issues that can cause a loss of novelty in output.
- Fine-Tuning and Quantization Strategies in Depth: Users exchanged experiences with fine-tuning language models such as Mistral 7B, with some choosing few-shot learning over fine-tuning due to limited data. The concept of community-powered quantization services was pitched, and the need for simpler quantization methods was underscored, arguing for a focus on model improvement rather than complex distributed computing for quantization.
- Confusion and Community Exchanges in Model Merging: An exchange on model merging strategies revealed confusion over non-standard mixing ratios with Mistral-based models. Different blending techniques like task arithmetic and gradient slerp were suggested, with a caution against blindly copying values.
- Community Interest in Quantization and Model Training: Users expressed a desire for an easy community-driven quantization service, paralleling familiar processes like video transcoding. In model training, the feasibility of training on a 50GB corpus of religious texts was queried, showing interest from newcomers in leveraging existing open-source models for specific domains.
TheBloke Channel Summaries
▷ #general (963 messages🔥🔥🔥):
- Exploring MoE and LLMs: Users discussed the efficiency of using experts in mixture of experts (MoE) models and the implications for GPU parallelism. @kalomaze talked about variable routing in MoE for parallelizing tasks and the trade-off between using more or fewer experts.
- The Complexity of Enhancing MoE Models: The nuances of enhancing MoE were dissected, with @kalomaze questioning the benefit of layers becoming simpler. @selea proposed using lots of experts, since they could work as a library of "LoRAs" to prevent catastrophic forgetting.
- Challenges with AI Detection Tools: Users debated the efficiency of the GPT detection tool GPTZero, with @kaltcit noting that while common samplers can be detected by GPTZero, applying noise to the sampler seems to be a potential way to dodge detection (see the sketch after this list).
- Adventures in Fine-Tuning: @nigelt11 discussed the hurdles of fine-tuning Falcon 7B with a dataset of 130 entries, considering switching to Mistral instead and working out the differences between "standard" and "instruct" models for RAG-based custom instructions.
- The Ethical Ambiguity of AI Girlfriend Sites: @rwitz_ contemplated the ethics of AI girlfriend sites, exploring the idea before deciding to pivot to a more useful application of AI technology beyond exploiting loneliness.
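The noise trick mentioned above is the same idea behind kalomaze's "noisy sampling" pull request linked below: perturb the raw logit scores before sampling so that the token statistics detectors key on become less regular. A minimal illustrative sketch, not the PR's actual code:

```python
import torch

def sample_with_noise(logits: torch.Tensor, sigma: float = 0.5, temperature: float = 1.0) -> int:
    """Add Gaussian noise to raw logit scores before sampling.

    The extra randomness makes token-choice statistics less regular, which is the
    property AI-text detectors tend to rely on. sigma controls how much noise is
    injected; sigma = 0.0 recovers plain temperature sampling.
    """
    noisy = logits + torch.randn_like(logits) * sigma
    probs = torch.softmax(noisy / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()

# Toy usage with a fake vocabulary of 10 tokens.
logits = torch.randn(10)
print(sample_with_noise(logits, sigma=0.5))
```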
Links mentioned:
- Can Ai Code Results - a Hugging Face Space by mike-ravkine: no description found
- A Beginner's Guide to Fine-Tuning Mistral 7B Instruct Model: Fine-Tuning for Code Generation Using a Single Google Colab Notebook
- Big Code Models Leaderboard - a Hugging Face Space by bigcode: no description found
- budecosystem/code-millenials-13b Ā· Hugging Face: no description found
- First Token Cutoff LLM sampling - <antirez> : no description found
- How to mixtral: Updated 12/22 Have at least 20GB-ish VRAM / RAM total. The more VRAM the faster / better. Grab latest Kobold: https://github.com/kalomaze/koboldcpp/releases Grab the model Download one of the quants a…
- GitHub - iusztinpaul/hands-on-llms: Learn about LLMs, LLMOps, and vector DBs for free by designing, training, and deploying a real-time financial advisor LLM system (source code + video & reading materials)
- GitHub - turboderp/exllamav2: A fast inference library for running LLMs locally on modern consumer-class GPUs: A fast inference library for running LLMs locally on modern consumer-class GPUs - GitHub - turboderp/exllamav2: A fast inference library for running LLMs locally on modern consumer-class GPUs
- Noisy sampling HF implementation by kalomaze · Pull Request #5342 · oobabooga/text-generation-webui: A custom sampler that allows you to apply Gaussian noise to the original logit scores to encourage randomization of choices where many tokens are usable (and to hopefully avoid repetition / looping…
- GitHub - OpenAccess-AI-Collective/axolotl: Go ahead and axolotl questions: Go ahead and axolotl questions. Contribute to OpenAccess-AI-Collective/axolotl development by creating an account on GitHub.
- Add dynatemp (the entropy one) by awtrisk Ā· Pull Request #263 Ā· turboderp/exllamav2: Still some stuff to be checked, heavy wip.
▷ #characters-roleplay-stories (403 messages🔥🔥):
- Solar's Status as a Benchmark Chad: @doctorshotgun described Solar as efficient on benchmarks but terrible in actual use, with problems like alignment issues akin to ChatGPT. However, @theyallchoppable defended its utility in role-playing scenarios, citing its consistent performance.
- Model Comparison in Roleplay Quality: @sanjiwatsuki and @animalmachine discussed how models like Mixtral, 70B, Goliath, and SOLAR perform in roleplaying tests, with mixed opinions. New models and finetuning strategies, like Kunoichi-DPO-v2-7B, were suggested as ways to improve coherence and character card adherence.
- Long Context Handling: Users reported on models' performance at long context lengths, noting that some, like Mistral 7B Instruct, lose coherence beyond certain limits. Subsequent discussions covered tips on efficiency and hardware requirements for running large-scale models.
- Deep Dive into Quant Methods: There was a detailed discussion on quantization strategies, including sharing links to repositories for GGUF models. @kquant provided insights into the potential performance in ranking systems.
- Emergent Repetition Issues in MoE Models: @kquant observed that multitudes of models working together tend to generalize and might become repetitive, likening it to a choir stuck on a chorus. A new model with a specialized design to combat repetition in creative scenarios is underway.
Links mentioned:
- Urban Dictionary: kink shame: To kink shame is to disrespect or devalue a person for his or her particular kink or fetish.
- LoneStriker/airoboros-l2-70b-3.1.2-5.50bpw-h6-exl2 Ā· Hugging Face: no description found
- Kquant03/Umbra-MoE-4x10.7-GGUF Ā· Hugging Face: no description found
- athirdpath/DPO_Pairs-Roleplay-Alpaca-NSFW-v1-SHUFFLED Ā· Datasets at Hugging Face: no description found
- TheBloke/HamSter-0.1-GGUF Ā· Hugging Face: no description found
- Reddit - Dive into anything: no description found
- Kooten/Kunoichi-DPO-v2-7B-8bpw-exl2 at main: no description found
- Undi95/Borealis-10.7b-DPO-GGUF Ā· Hugging Face: no description found
- brittlewis12/Kunoichi-DPO-v2-7B-GGUF Ā· Hugging Face: no description found
▷ #training-and-fine-tuning (12 messages🔥):
- Newbie Diving into LLMs: @zos_kia, a self-proclaimed noob, is seeking advice on training a language model on a 50GB corpus of unstructured religious and esoteric texts. They are considering open-source models like trismegistus-mistral and asking about the feasibility of training on a home computer as well as the expected time frame.
- Pinging For Insights: @zos_kia asks whether it is okay to ping the creator of trismegistus-mistral in the Discord server for personalized advice on their training project.
- Voicemail Detection Finetuning Inquiry: @rabiat is looking for guidance on fine-tuning Mistral 7B or an MoE to classify voicemail announcements and is curious about the dataset size required for efficient LoRA fine-tuning. They are considering using their 40 real voicemail examples as seeds for upsampling.
- Few-shot as an Alternative: @gahdnah suggests that @rabiat could try few-shot learning as an alternative to fine-tuning for the voicemail classification task.
- Quantized Models and Fine-tuning: @sushibot shared a skeleton script showing the process of quantizing a model to 4-bit before attaching LoRA weights and asked about the setup. @sanjiwatsuki confirmed that this is indeed what the "Q" in QLoRA implies: fine-tuning small adapters on top of frozen, quantized base weights (see the sketch after this list).
- Benchmark Blogpost Showcase: @superking__ shared a Hugging Face blog post that evaluates three language model alignment methods without reinforcement learning: Direct Preference Optimization (DPO), Identity Preference Optimisation (IPO), and Kahneman-Tversky Optimisation (KTO), across various models and hyperparameter settings.
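As context for the QLoRA exchange above, the recipe is: load the base model in 4-bit, freeze it, and train only small LoRA adapter matrices on top. A minimal sketch using the Hugging Face transformers/peft/bitsandbytes stack; the model name and hyperparameters are illustrative, not what @sushibot used:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model quantized to 4-bit (NF4) so it fits in modest VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",            # illustrative choice
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)  # freezes the quantized base weights

# Attach small trainable LoRA adapters; only these are updated during fine-tuning.
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
```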
Links mentioned:
Preference Tuning LLMs with Direct Preference Optimization Methods: no description found
▷ #model-merging (15 messages🔥):
- Blizado Explores Non-Standard Merging: @blizado is looking to merge two Mistral-based models using a 75:25 ratio instead of the standard 50:50, having found that a 50:50 slerp merge was too biased towards one model.
- Sao10k Suggests Merging Flexibility: @sao10k recommended that @blizado try different merge methods such as gradient slerp, task arithmetic, or DARE-TIES, emphasizing not to stick with default values.
- Confusion Over Merging Parameters: Despite the suggestions, @blizado expressed confusion over the merging parameters and their effects on the model's language output.
- Sao10k Clarifies on Merging Values: In response to issues faced by @blizado, including a model switching between German and English, @sao10k advised against copying values blindly and suggested a simple gradient slerp ranging from 0.2 to 0.7 (a sketch of the idea follows this list).
- Blizado's Troubles with Mixed Models: After trying a slerp parameter found on a Hugging Face model, @blizado reported difficulty seeing differences when merging two different base models, and suggested that merging works best when combining a solid language base model with one that has strong understanding of the same language.
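For reference, a "gradient slerp" over the 0.2-to-0.7 range suggested above spherically interpolates the corresponding weight tensors of the two models while ramping the mixing ratio across the layer stack. The following is a simplified, self-contained sketch of that idea, not what merge tools such as mergekit literally do:

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two flattened weight tensors."""
    a_dir, b_dir = a / (a.norm() + eps), b / (b.norm() + eps)
    omega = torch.arccos((a_dir * b_dir).sum().clamp(-1.0, 1.0))
    sin_omega = torch.sin(omega)
    if sin_omega.abs() < eps:                 # nearly parallel: fall back to plain lerp
        return (1 - t) * a + t * b
    return (torch.sin((1 - t) * omega) * a + torch.sin(t * omega) * b) / sin_omega

# Ramp the mixing ratio from 0.2 to 0.7 across the layer stack ("gradient" slerp).
num_layers = 32
ts = [0.2 + (0.7 - 0.2) * i / (num_layers - 1) for i in range(num_layers)]

# Toy demonstration with random tensors standing in for per-layer weights.
layers_a = [torch.randn(16, 16) for _ in range(num_layers)]
layers_b = [torch.randn(16, 16) for _ in range(num_layers)]
merged = [slerp(t, a.flatten(), b.flatten()).reshape(a.shape)
          for t, a, b in zip(ts, layers_a, layers_b)]
print(merged[0].shape, ts[0], ts[-1])
```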
▷ #coding (8 messages🔥):
- A Call for Simplified Model Quantization: @spottyluck expressed surprise at the lack of "uber bulk/queue based model quantization solutions," given their extensive experience in video transcoding. They see potential for a community service that allows easy model quantization with an opt-out feature for shared computing power.
- Quantization Service: A Community Effort?: Following up, @spottyluck floated the idea of a community-powered distributed model quantization service where users could contribute to a communal compute resource while working on their own projects.
- Simplicity Over Complexity: @wbsch countered that most users prefer convenience and consistency, as provided by TheBloke, without the need for complex solutions like quantization farms or distributed compute services.
- Farming for Models, Not Quants: @kquant emphasized that community compute donations should be targeted at long-term research and model improvement rather than the quantization process.
- Technical Inquiry on Checkpoint Changes in Stable Diffusion: @varient2 asked for help programmatically changing checkpoints in Stable Diffusion using the webuiapi, having already figured out how to send prompts and use ADetailer for face adjustments mid-generation (one possible approach is sketched after this list).
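On the checkpoint-switching question, the AUTOMATIC1111 web UI exposes the active checkpoint through its options endpoint, so one approach (shown with plain HTTP calls rather than the webuiapi wrapper; the URL and payload values are assumptions) looks like this:

```python
import requests

BASE = "http://127.0.0.1:7860"  # assumed local AUTOMATIC1111 instance launched with --api

# List available checkpoints and pick one by its title.
models = requests.get(f"{BASE}/sdapi/v1/sd-models").json()
print([m["title"] for m in models])

# Switch the active checkpoint, then generate with the newly loaded model.
requests.post(
    f"{BASE}/sdapi/v1/options",
    json={"sd_model_checkpoint": models[0]["title"]},
).raise_for_status()

payload = {"prompt": "a lighthouse at dusk", "steps": 20}
image_b64 = requests.post(f"{BASE}/sdapi/v1/txt2img", json=payload).json()["images"][0]
```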
Nous Research AI Discord Summary
- WSL1 Surprises with 13B Model: _3sphere found that a 13B model can be successfully loaded on WSL1; an earlier segmentation fault turned out to be specific to the llama.mia tool.
- ggml Hook's 7b Model Limitation Unveiled: The ggml hook drew criticism for working exclusively with 7b models without that limitation being documented, a discovery made by _3sphere.
- SPINning Up LLM Training Conversations: _3sphere presented the SPIN methodology from a paper on arXiv, discussing its potential for refining LLM capabilities through iteration.
- Single-GPU LLM Inference Made Possible: nonameusr shared AirLLM, which enables 70B LLM inference on a single 4GB GPU, as described in a Twitter post.
- Etched's Custom Silicon Spurs Skepticism: A discussion included skepticism about the viability of Etched's custom silicon for transformer inference, casting doubt on its practicality for LLMs.
- Orion's 14B Model Falls Short in Conversational Skills: Orion's 14B model was reported by teknium and others to have subpar conversational output, contradicting its benchmark scores.
- Proxy-Tuning Paper Sparks Interest: A new tuning approach for LLMs called proxy-tuning was discussed, detailed in a recently published paper; a sketch of the core decoding rule follows this list.
- Mixtral's Multi-Expert Potential: Conversations around Mixtral models focused on successfully optimizing the use of multiple experts, leading carsonpoole to contemplate trying the approach with Hermes.
- Finetuning Fineries: qnguyen3 sought advice for fine-tuning Nous Mixtral models, and teknium provided insights, including that Nous Mixtral had undergone a complete finetune.
- Commercial Licensure Confusion: The commercial usage of finetuned models sparked a debate about licensing costs and permissions, initiated by teknium and engaged by casper_ai and others.
- Designing Nous Icons: The Nous community embarked on designing legible role icons, with suggestions for a transparent "Nous Girl" and simpler logos from benxh and john0galt.
- Omar from DSPy/ColBERT/Stanford Joins the Fray: The community welcomed Omar, expressing excitement about potential collaborations involving his contributions to semantic search and broader AI applications.
- Alpaca's Evaluation Method Questioned: teknium expressed skepticism about Alpaca's leaderboard, hinting at issues with its method after observing Yi Chat ranked above GPT-4.
- Imitation Learning's Human Boundaries: A conversation led by teknium tackled the idea that imitation learning may not yield superhuman capabilities, since it relies on average human data for training.
- AI's Self-Critiquing Abilities Challenged: A discussed paper indicated that AI models are not proficient at self-evaluation, prompting teknium to question self-critiquing capabilities in models.
Nous Research AI Channel Summaries
▷ #off-topic (29 messages🔥):
- WSL1 Handles Big Models Just Fine:
@_3sphere
discovered that using WSL1, a 13B model can be loaded without issues. They initially thought otherwise due to segmentation faults occurring with the llama.mia setup but later realized this was a tool-specific fault. - Model Compatibility Oversight:
@_3sphere
reported that the ggml hook, used for handling AI models, apparently only works with 7b models, suggesting that the creator of the ggml hook might only have tested it with this specific size. There was a hint of frustration as this limitation was not documented. - Hugging Face Leaderboard Policing:
@.ben.com
shared a discussion about a recent change on the Hugging Face leaderboard where models incorrectly marked asmerge
are being flagged unless metadata is properly adjusted. - Strange New Worlds in Klingon:
@teknium
shared a YouTube video featuring a scene with Klingon singing from āStrange New Worlds Season 2 Episode 9,ā expressing dismay at the creative direction of the Star Trek franchise. - Star Trek Nostalgia Eclipsed by New Changes:
@teknium
discussed the change in direction for Star Trek with nostalgia, accompanied by a humorous gif implying disappointment, while@.benxh
lamented the changes to the beloved series.
Links mentioned:
- mistralai/Mixtral-8x7B-v0.1 Ā· Add MoE tag to Mixtral: no description found
- Gary Marcus Yann Lecun GIF - Gary Marcus Yann LeCun Lecun - Discover & Share GIFs: Click to view the GIF
- Klingon Singing: From Strange New Worlds Season 2 Episode 9.
- HuggingFaceH4/open_llm_leaderboard Ā· Announcement: Flagging merged models with incorrect metadata: no description found
▷ #interesting-links (236 messages🔥🔥):
- Exploration of Training Phases for LLMs: A discussion by @_3sphere on when it is effective to introduce code into the training process of LLMs led to sharing the SPIN methodology from a recent paper, which allows LLMs to refine capabilities by playing against their previous iterations.
- LLM Inference on Minimal Hardware: @nonameusr shared information about AirLLM, an approach allowing 70B LLM inference on a single 4GB GPU by using layer-wise inference without compression techniques (a rough sketch of the idea follows this list).
- Chipsets Specialized for LLMs: There is skepticism about the practicality and future-proofing of Etched's custom silicon for transformer inference, as voiced by @eas2535, @euclaise, and @0xsingletonly.
- Orion-14B Model Under Scrutiny: The actual conversational competency of Orion's 14B model is being questioned by @.benxh, @teknium, and others, as its performance on benchmarks such as MMLU contrasts with initial user experiences reporting nonsensical output and a tendency to lapse into random languages.
- Proxy-Tuning for LLMs: A linked paper discussed by @intervitens and @sherlockzoozoo introduces proxy-tuning, which uses the predictions of a smaller LM to guide the predictions of larger, potentially black-box LMs.
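The layer-wise idea behind AirLLM is that a decoder runs its transformer blocks strictly in sequence, so only one block needs to be resident in GPU memory at a time. The following is a hypothetical illustration of that pattern, not AirLLM's actual API; the per-layer checkpoint files are assumed:

```python
import torch

@torch.no_grad()
def layerwise_forward(hidden: torch.Tensor, layer_files: list[str], device: str = "cuda") -> torch.Tensor:
    """Run a stack of transformer blocks one at a time, holding at most one in VRAM."""
    for path in layer_files:                       # assumed: one torch-saved block per file
        block = torch.load(path, map_location="cpu")
        block.to(device)
        hidden = block(hidden.to(device))
        block.to("cpu")                            # release VRAM before loading the next block
        del block
        torch.cuda.empty_cache()
    return hidden
```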
Links mentioned:
- Etched | The Worldās First Transformer Supercomputer: Transformers etched into silicon. By burning the transformer architecture into our chips, weāre creating the worldās most powerful servers for transformer inference.
- Tweet from undefined: no description found
- Tweet from Rohan Paul (@rohanpaul_ai): š§ Run 70B LLM Inference on a Single 4GB GPU - with airllm and layered inference š„ layer-wise inference is essentially the ādivide and conquerā approach š And this is without using quantizā¦
- Tuning Language Models by Proxy: Despite the general capabilities of large pretrained language models, they consistently benefit from further adaptation to better achieve desired behaviors. However, tuning these models has become incā¦
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models: Harnessing the power of human-annotated data through Supervised Fine-Tuning (SFT) is pivotal for advancing Large Language Models (LLMs). In this paper, we delve into the prospect of growing a strong Lā¦
- Looped Transformers are Better at Learning Learning Algorithms: Transformers have demonstrated effectiveness in in-context solving data-fitting problems from various (latent) models, as reported by Garg et al. However, the absence of an inherent iterative structurā¦
- At Which Training Stage Does Code Data Help LLMs Reasoning?: Large Language Models (LLMs) have exhibited remarkable reasoning capabilities and become the foundation of language technologies. Inspired by the great success of code data in training LLMs, we naturaā¦
- Director of Platform: Cupertino, CA
- bartowski/internlm2-chat-20b-llama-exl2 at 6_5: no description found
- OrionStarAI/Orion-14B-Base Ā· Hugging Face: no description found
- Tweet from anton (@abacaj): Letās fking go. GPU poor technique you all are sleeping on, phi-2 extended to 8k (from 2k) w/just 2x3090s
- GitHub - b4rtaz/distributed-llama: Run LLMs on weak devices or make powerful devices even more powerful by distributing the workload and dividing the RAM usage.: Run LLMs on weak devices or make powerful devices even more powerful by distributing the workload and dividing the RAM usage. - GitHub - b4rtaz/distributed-llama: Run LLMs on weak devices or make pā¦
- GitHub - RVC-Boss/GPT-SoVITS: 1 min voice data can also be used to train a good TTS model! (few shot voice cloning): 1 min voice data can also be used to train a good TTS model! (few shot voice cloning) - GitHub - RVC-Boss/GPT-SoVITS: 1 min voice data can also be used to train a good TTS model! (few shot voice clā¦
- Yuan2.0-2B-Janus-hf: no description found
▷ #general (524 messages🔥🔥🔥):
-
Fresh Perspectives on Mixtral Experts: Discussions around the use of multiple experts in Mixtral models center around optimization.
@carsonpoole
highlights a successful implementation with minimal sacrifices in speed when using a higher number of experts and contemplates trying Hermes with more than the typical two experts. -
A Quest for Quality Finetuning: Thereās a shared curiosity about fine-tuning models with more than two experts.
@qnguyen3
faces difficulties fine-tuning with Axolotl and seeks advice from veterans like@teknium
, who clarified that the Nous Mixtral model had a full finetune and not just a LoRa fine-tune. -
Licensing Quandaries Regarding Commercial Use: A discussion sparked by
@teknium
about the commercial use of finetuned models, like those from Stability AI, unveils confusion surrounding licensing costs and permissions. Different interpretations and potential issues with implementing commercial use are debated among users like@casper_ai
. -
The Nous Aesthetic: The chat includes an initiative to design more legible Nous role icons. Various suggestions, such as making a transparent version of the āNous Girlā graphic or creating a simpler logo, circulate, with members
@benxh
and@john0galt
contributing design skills. -
Tech Community Shoutouts: Omar from DSPy/ColBERT/Stanford joins the server, greeted by members
@night_w0lf
and@qnguyen3
. Members express enthusiasm for integrating Omarās work into their solutions and anticipation for a collaboration with DSPy in their projects.
Links mentioned:
- Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling: no description found
- Animated Art Gif GIF - Painting Art Masterpiece - Discover & Share GIFs: Click to view the GIF
- Combining Axes Preconditioners through Kronecker Approximation forā¦: Adaptive regularization based optimization methods such as full-matrix Adagrad which use gradient second-moment information hold significant potential for fast convergence in deep neural networkā¦
- Joongcat GIF - Joongcat - Discover & Share GIFs: Click to view the GIF
- Nerd GIF - Nerd - Discover & Share GIFs: Click to view the GIF
- Browse Fonts - Google Fonts: Making the web more beautiful, fast, and open through great typography
- Domine - Google Fonts: From the very first steps in the design process āDomineā was designed, tested and optimized for body text on the web. It shines at 14 and 16 px. And can even be
- š Semantic Search - Embedchain: no description found
- EleutherAI/pythia-12b Ā· Hugging Face: no description found
- Reddit - Dive into anything: no description found
- Tweet from Teknium (e/Ī») (@Teknium1): Okay, read the paper, have some notes, mostly concerns but thereās some promise. - As I said when I first saw the paper, they only tested on Alpaca Eval, which, I canāt argue is the best evalā¦
- Evaluation of Distributed Shampoo: Comparison of optimizers: Distributed Shampoo, Adam & Adafactor. Made by Boris Dayma using Weights & Biases
- Tweet from GitHub - FixTweet/FxTwitter: Fix broken Twitter/X embeds! Use multiple images, videos, polls, translations and more on Discord, Telegram and others: Fix broken Twitter/X embeds! Use multiple images, videos, polls, translations and more on Discord, Telegram and others - GitHub - FixTweet/FxTwitter: Fix broken Twitter/X embeds! Use multiple imageā¦
- HuggingFaceH4/open_llm_leaderboard Ā· Announcement: Flagging merged models with incorrect metadata: no description found
▷ #ask-about-llms (168 messages🔥🔥):
-
Doubting Alpacaās Evaluation:
@teknium
expressed skepticism about Alpacaās evaluation, stating that according to the leaderboard, Yi Chat is rated higher than GPT-4, hinting at potential flaws in the evaluation process. -
Imitation Learning Limitations: In a discussion about the limitations of imitation learning,
@teknium
suggested that models are unlikely to imitate superhuman capacity if theyāre trained on data from average humans. -
Self-Critique in AI Models Questioned:
@teknium
referenced a paper indicating that AI models are not proficient at self-evaluation, raising questions about their self-critiquing abilities. -
Experimenting with LLaMA and ORCA:
@teknium
shared an experiment where LLaMA 2 70B was used to make ORCA, similar to how GPT-4 did, noting a slight improvement in MT benchmarks but a negative impact on traditional benchmarks like MMLU. -
Comparing Different Versions of LLMs: Responding to an inquiry from
@mr.userbox020
about benchmarks between Nous Mixtral and Mixtral Dolphin,@teknium
provided links to their GitHub repository with logs comparing Dolphin 2.6 with Mixtral 7x8 and Nous Hermes 2 with Mixtral 8x7B, also noting that in their experience, version 2.5 performed the best.
Links mentioned:
- š¾ LM Studio - Discover and run local LLMs: Find, download, and experiment with local LLMs
- Ollama: Get up and running with large language models, locally.
- Approximating Two-Layer Feedforward Networks for Efficient Transformers: How to reduce compute and memory requirements of neural networks (NNs) without sacrificing performance? Many recent works use sparse Mixtures of Experts (MoEs) to build resource-efficient large languaā¦
- LLM-Benchmark-Logs/benchmark-logs/Dolphin-2.6-Mixtral-7x8.md at main Ā· teknium1/LLM-Benchmark-Logs: Just a bunch of benchmark logs for different LLMs. Contribute to teknium1/LLM-Benchmark-Logs development by creating an account on GitHub.
- GitHub - ggerganov/llama.cpp: Port of Facebookās LLaMA model in C/C++: Port of Facebookās LLaMA model in C/C++. Contribute to ggerganov/llama.cpp development by creating an account on GitHub.
- LLM-Benchmark-Logs/benchmark-logs/Nous-Hermes-2-Mixtral-8x7B-DPO.md at main Ā· teknium1/LLM-Benchmark-Logs: Just a bunch of benchmark logs for different LLMs. Contribute to teknium1/LLM-Benchmark-Logs development by creating an account on GitHub.
OpenAI Discord Summary
- Rethinking Nightshade's Impact: Engineers debated AI companies' fail-safe mechanisms against Nightshade, noting that poisoned data may be easy to isolate because of the tool's novelty. The conversation covered concerns about the system affecting unintended datasets and trust in large AI companies' robust security measures.
- Optimizing Prompt Limits in GPT-4: A technical discussion covered prompt lockouts in GPT-4's image generator. Clarifications emerged on how rolling usage and per-prompt timers work; with a cap of 40 prompts per 3 hours, pacing tests at roughly one prompt every 4.5 minutes should avoid hitting the limit.
- AI Know-How for Pythonistas: Community members sought advice on deepening their AI expertise beyond intermediate Python, with suggestions including fundamental AI concepts, classical machine learning techniques, and resources from Hugging Face.
- A Tinge of AI Consciousness in Bing?: Engineers joked about Bing's possible self-awareness, sparking light-hearted exchanges without serious concern over the AI's emerging capabilities.
- Prompt Engineering: The Art of AI Guidance: The community exchanged ideas on prompt engineering, security strategies such as "trigger/block," and the importance of understanding how AI interprets language and instructions. They debated conditional prompting, how to craft prompts that guard against bad actors, and considerations for securely hosting GPT instructions.
OpenAI Channel Summaries
▷ #ai-discussions (43 messages🔥):
-
Query on Nightshadeās Foolproof Nature:
@jaicraft
questioned if Nightshade is without flaws, concerned it might affect data beyond its target.@ćļ½ļ½ ļ½ļ½ļ½ļ½ļ½ļ½ļ½ļ½ļ½ ć
believes large AI companies have robust failsafes and it should be easy to isolate poisoned data due to Nightshadeās novelty. -
Prompt Limit Confusions:
@.kylux
encountered an issue with prompt limits in the image generator via GPT-4, noting a lockout after 20 messages despite a 40-message limit.@rendo1
clarified itās rolling usage with each prompt on its timer, and@satanhashtag
advised attempting one prompt every 4.5 minutes for testing. -
AI Enthusiastās Learning Path:
@.009_f.108
seeks resources for deepening knowledge of AI, already possessing intermediate Python skills.@michael_6138_97508
and@lugui
recommended starting with fundamental AI concepts and classical machine learning techniques while others like@darthgustav.
simply suggested Hugging Face. -
Bingās Alleged Self-Awareness:
@metaldrgn
claimed Bing might be exhibiting signs of intelligence and consciousness, while@michael_6138_97508
jokingly responded that they are lucky. -
Discussion on Moderation and Resource Sharing:
@miha9999
was muted for share a resources link and inquired about the policy.@eskcanta
advised contacting modmail for clarification and assistance with moderation actions, which resolved@miha9999
ās confusion after the warning was removed.
▷ #gpt-4-discussions (144 messages🔥🔥):
- Integration Woes with Weaviate:
@woodenrobot
expressed difficulty integrating custom GPT action with Weaviate, highlighting anUnrecognizedKwargsError
related to object properties in the payload. - Exploring Charge Cycles for GPT-4:
@stefang6165
noticed a reduction in the limit for GPT-4 messages from 40 to about 20 every 3 hours, seeking insights on this change. - Sharing GPT-4 Chat Experience:
_jonpo
shared their satisfying conversation with HAL, while@robloxfetish
encountered an unexpected message cap during their sessions, prompting@darthgustav.
and@c27c2
to suggest it could be a temporary error or necessitate a support contact. - PDF Handling with ChatGPT:
@marx1497
asked for advice handling small PDFs with limited success, leading to a discussion with@darthgustav.
about the limitations of the tool and suggestions for pre-processing the data. - Creating Interactive MUD Environments with GPT:
@woodenrobot
and@darthgustav.
engaged in an in-depth technical exchange about embedding structured data and code into knowledge documents for GPT, with a shared interest in using AI for MUD servers and working within constraints of database storage and session continuity.
▷ #prompt-engineering (247 messages🔥🔥):
-
Security Through Obscurity in GPTs:
@busybenss
suggested a ātrigger/blockā strategy to protect GPT models from bad actors.@darthgustav.
pointed out the importance of Conditional Prompting for security, encouraging open discussion over gatekeeping. -
Conditional GPT Use in Complex JSON:
@semicolondev
inquired about using GPT-4 conditionally when generating complex JSON that 3.5 struggles with, alluding to the higher cost of using GPT-4.@eskcanta
recommended using 3.5 for baseline steps and reserving GPT-4 for the steps where itās necessary, urging creative problem-solving within budget constraints. -
Extemporaneous AI Epistemology:
@darthgustav.
and@eskcanta
conducted a deep dive into how models interpret and respond to prompts. They highlighted the idiosyncrasies in AIās understanding of instructions, noting that even AI doesnāt always āknowā its reasoning path, providing significant insight into how model training could affect prompt interpretation. -
Prompting Strategies Unveiled:
@eskcanta
shared an advanced prompt strategy of separating what the model thinks from what itās instructed to do. This concept sparked conversation about the essence of understanding AI response behavior and how to exploit it for better engineering prompts. -
Chart Extractions into Google Sheets:
@alertflyer
asked for help transferring charts from GPT output into Google Sheets, to which@eskcanta
responded by clarifying the nature of the chart needed. The discussion aimed to identify the method of chart creation for proper extraction.
▷ #api-discussions (247 messages🔥🔥):
-
Security Strategies in the Spotlight:
@busybenss
revealed a security method they coined as ātrigger/blockā to protect GPT from bad actors, stating it effectively prevents execution of undesired inputs by the GPT.@darthgustav
expressed interest in the amount of character space this method uses, concerned about potential loss of functionality. -
Conditional Prompting to Secure GPTs: In an in-depth discussion on security,
@darthgustav
explained the benefits of Conditional Prompting and warned about potential weaknesses in security implementation. The conversation then navigated through several techniques and ideas for securing GPTs, including hosting GPT instructions via a web server with secure calls to OpenAI. -
Hacking LLMs: An Inevitable Risk: Both
@busybenss
and@darthgustav
concurred that while security measures are essential, thereās an inherent vulnerability in sharing and using GPTs, and theft of digital assets may still occur. -
The Economics of AI Development: As the conversation shifted from security to the business side of AI,
@thepitviper
and@darthgustav
advised focusing on improving the product and marketing to stand out, rather than excessively worrying about theft and the pursuit of perfect security. -
Prompt Engineering and AI Understanding: A series of messages from
@madame_architect
,@eskcanta
, and others discussed the intricacies of prompt engineering and the AIās interpretation of language. They shared insights on semantic differences and how to guide the model to better understand and execute prompts.
LAION Discord Summary
- Scrutinizing Adversarial AI Tools: Discussions centered on the questionable effectiveness of adversarial tools like Nightshade and Glaze against AI image generation. @astropulse raised concerns that they might give artists a false sense of security, and no consensus was reached. A relevant Reddit post offers further insight.
- Data and Models, A Heated Debate: Members engaged in a rich debate on creating datasets for fine-tuning AI models and the challenges posed by high-resolution images. Talks also covered the efficacy and cost of models like GPT-4V and the complexity of scaling T5 models compared to CLIP models.
- Ethical AI, A Thorny Issue: AI ethics and copyright were another focal point, with community members displaying some cynicism about what constitutes "ethics". The discordant community reactions on platforms such as Hacker News and Reddit highlighted the paradoxical nature of AI's influence on copyright.
- The Future of Text-to-Speech: Advances in TTS sparked lively discussions comparing services such as WhisperSpeech and XTTS. 11Labs' impressive dubbing technology was discussed, though its methods remain proprietary behind API restrictions. A relevant YouTube video covers recent TTS developments.
- Inquiries and Theories on Emotional AI:
  - Legality and Challenges: Questions about the EU's stance on emotion-detecting AI led to a clarification that such technology is not banned for research within the EU.
  - Need for Experts in Emotion Detection: There were calls for expert involvement in building emotion-detection datasets, with emphasis on psychological expertise and appropriate context for accurate emotion classification.
LAION Channel Summaries
▷ #general (394 messages🔥🔥):
-
Debating Nightshadeās Effectiveness:
@mfcool
expressed hope that DreamShaperXL Turbo images werenāt from a new model, citing their similarity to existing ones.@astropulse
and others delved into the intricacies of whether adversarial tools like Nightshade and Glaze significantly impact AI image generation, with@astropulse
suggesting they might provide users with a false sense of security. Hereās a deep dive from ther/aiwars
subreddit: We need to talk a little bit about Glaze and Nightshadeā¦. -
Discussions on Data and Model Training: Members like
@chad_in_the_house
,@thejonasbrothers
, and@pseudoterminalx
spoke about creating datasets for fine-tuning models and the limitations of using images with high resolution. The debate touched on the efficacy and cost of models like GPT-4V and the complexity of scaling T5 models relative to CLIP models. -
AI Ethics and Licensing Discourse: The conversation extended to AI copyrights and ethics, with members expressing cynicism about contemporary āethicsā being a stand-in for personal agreement.
@astropulse
and@.undeleted
critiqued the community reactions on platforms like Hacker News and Reddit, while discussing the broader implications of AI on art and copyright. -
Exploring TTS and Dubbing Technologies:
@SegmentationFault
,@itali4no
, and@.undeleted
discussed advanced text-to-speech (TTS) models, comparing existing services like WhisperSpeech and XTTS.@SegmentationFault
highlighted 11Labsā impressive dubbing technology and the API restrictions that keep their methods proprietary. Find out more about TTS developments in this Youtube video: āOpen Source Text-To-Speech Projects: WhisperSpeechā. -
Inquiries about AI Upscaler and Language Model Training:
@skyler_14
asked about the status of training the GigaGAN upscaler, referring to a GitHub project by@lucidrains
.@andystv_
inquired about the possibility of training a model for Traditional Chinese language support.
Links mentioned:
- no title found: no description found
- apf1/datafilteringnetworks_2b Ā· Datasets at Hugging Face: no description found
- Data Poisoning Wonāt Save You From Facial Recognition: Data poisoning has been proposed as a compelling defense against facial recognition models trained on Web-scraped pictures. Users can perturb images they post online, so that models will misclassify fā¦
- WhisperSpeech - a Hugging Face Space by Tonic: no description found
- Meme Our GIF - Meme Our Now - Discover & Share GIFs: Click to view the GIF
- Reddit - Dive into anything: no description found
- Reddit - Dive into anything: no description found
- Open Source Text-To-Speech Projects: WhisperSpeech - In Depth Discussion: WhisperSpeech is a promising new open source TTS model, that and be training on AUDIO ONLY data & that already shows promising results after a few hundred GPā¦
- Is webdataset a viable format for general-use ? Ā· huggingface/pytorch-image-models Ā· Discussion #1524: Hi @rwightman , thanks for the continuous good work. I am playing a bit with the Webdataset format, utilizing some of the methods in: https://github.com/rwightman/pytorch-image-models/blob/475ecdfaā¦
- GitHub - lucidrains/gigagan-pytorch: Implementation of GigaGAN, new SOTA GAN out of Adobe. Culmination of nearly a decade of research into GANs: Implementation of GigaGAN, new SOTA GAN out of Adobe. Culmination of nearly a decade of research into GANs - GitHub - lucidrains/gigagan-pytorch: Implementation of GigaGAN, new SOTA GAN out of Adobā¦
▷ #research (25 messages🔥):
-
Computational Challenges in Model Scaling:
@twoabove
discussed that authors of a recent model confessed to being compute-constrained and they are planning to look into the scaling laws for their method.@qwerty_qwer
responded, noting that overcoming compute constraints would be game-changing. -
In Search of Novel Multimodal Techniques:
@twoabove
inquired about innovative image chunking/embedding techniques for use in multimodal models, a question further expounded upon by@top_walk_town
who listed several methods including LLaVa, Flamingo, llama adapter, Chameleon, and the megabyte paper approaches. -
Unpacking EU AI Laws on Emotional AI:
@fredipy
questioned whether creating AI that detects emotions contradicts EU AI regulations.@mr_seeker
clarified and@JH
opined that such laws do not impact non-European entities, while@spirit_from_germany
stated that emotion detection is not banned for research in the EU. -
Challenges in Emotional Recognition Datasets:
@spirit_from_germany
is working on an image-based emotion detector but struggles with limited emotional datasets. They proposed creating a curated dataset with the help of psychological experts, and@_spaniard_
expressed skepticism about the feasibility of detecting nuanced emotions without rich contextual information. -
Expert Insights Needed for Emotion Detection:
@.hibarin
from a psychological background supported the need for context in emotion classification, aligning with either the fingerprints or population hypotheses of emotion.@skyler_14
introduced 3D morphable models as a potential domain for easier emotion annotation.
Eleuther Discord Summary
- Flash Attention Sparks CUDA vs XLA Debate: @carsonpoole and @.the_alt_man debated Flash Attention, with opinions split on whether XLA optimizations could stand in for its hand-written CUDA implementation. A Reddit comment from Patrick Kidger suggested that XLA can already optimize attention mechanisms on TPUs, referencing a Reddit thread.
- Legal Conundrums Over Adversarial Methods: The Glaze and Nightshade tools sparked a debate over legality and effectiveness among members like @digthatdata and @stellaathena. A legal paper was shared to illustrate that bypassing a watermark is not necessarily a legal violation.
- Open Source and AI Ethics: The community discussed the open-source nature and licensing of Meta's LLaMA, with @avi.ai referring to a critical write-up by the OSI highlighting that LLaMA's license does not meet the open-source definition (OSI blog post). The conversation turned towards governance in AI and a call to build models with open-source software principles, as discussed by Colin Raffel (Stanford Seminar Talk).
- Explorations in Class-Incremental Learning and Optimization: SEED, a method for finetuning MoE models, was introduced with a research paper shared, and the CASPR optimization technique emerged as a contender that outperforms the Shampoo algorithm, backed by its own research paper. A paper claiming zero pipeline bubbles in distributed training was also mentioned, offering new techniques for bypassing synchronization during optimizer steps (Research Paper).
- Unlocking Machine Interpretability with Patchscopes: Conversations revolved around Patchscopes, a new framework for decoding information from model representations, with @stellaathena sharing a Twitter thread introducing the concept. There was cautious optimism about its application to information extraction, tempered by concerns about hallucinations in multi-token generation.
- Apex Repository Update and NeoX Development: @catboy_slim_ highlighted an update in NVIDIA's apex repository that could speed up the build process for GPT-NeoX, recommending a branch that is ready for testing (NVIDIA Apex Commit).
Eleuther Channel Summaries
▷ #general (213 messages🔥🔥):
-
Debating āFlash Attentionā and XLA Optimizations: In a technical debate,
@carsonpoole
and@.the_alt_man
discussed the implementation of Flash Attention, with@carsonpoole
asserting it involves complex CUDA operations and@.the_alt_man
suggesting that XLA optimizations could automate much of its efficiency.@lucaslingle
and@.the_alt_man
later shared Patrick Kidgerās comment from Reddit indicating XLAās existing compiler optimizations for attention mechanisms on TPUs. -
Glaze & Nightshade Legalities: Users
@digthatdata
,@stellaathena
,@clockrelativity2003
, and others discussed the legal aspects and effectiveness of Glaze and Nightshade, with conflicting views on whether these tools represent a form of encryption or watermarking.@stellaathena
shared a legal paper stating that bypassing a watermark is likely not a violation of law, while other users examined both the practical and legal implications of combating AI image models with adversarial methods. -
Adversarial Perturbations & The Feasibility of OpenAI Lobbying: In the midst of discussing Nightshadeās impacts and the concept of adversarial perturbations,
@avi.ai
underlined the challenges of U.S. regulation change, responding to suggestions by@clockrelativity2003
and@baber_
regarding policies and special interests. -
Assessments of LLaMA Licensing and Open Source Definitions: In exploring the licensing of Metaās LLaMA models,
@avi.ai
provided a link to a write-up by the OSI criticizing Metaās claim of LLaMA being āopen source.ā@clockrelativity2003
and@catboy_slim_
discussed the limitations of such licenses and@avi.ai
emphasized their goal to reach the benefits seen in traditional OSS communities with AI. -
Discussion on OpenAI and the Future of ML Models: Newcomers
@AxeI
and@abi.voll
introduced themselves with academic backgrounds looking to contribute to the open-source community, while@exirae
sought advice on pitching a novel alignment project.@hailey_schoelkopf
and@nostalgiahurts
highlighted resources and talks by Colin Raffel regarding the building of AI models with an open-source ethos.
Links mentioned:
- Tweet from neil turkewitz (@neilturkewitz): @alexjc FYIāI donāt think thatās the case. Glaze & Nightshade donāt control access to a work as contemplated by §1201. Howeverāas you note, providing services to circumvent them might well indeed violā¦
- A Call to Build Models Like We Build Open-Source Software: no description found
- Reddit - Dive into anything): no description found
- nyanko7/LLaMA-65B Ā· š© Report : Legal issue(s): no description found
- stabilityai/sdxl-turbo Ā· Hugging Face: no description found
- Reddit - Dive into anything: no description found
- Taking stock of open(ish) machine learning / 2023-06-15: Iāve been writing this newsletter for about six months, so I thought it might be a good time to pause the news firehose, and instead review and synthesize what Iāve learned about the potential for opeā¦
- Metaās LLaMa 2 license is not Open Source: Meta is lowering barriers for access to powerful AI systems, but unfortunately, Meta has created the misunderstanding that LLaMa 2 is āopen sourceā - it is not.
- Tweet from Luca Bertuzzi (@BertuzLuca): #AIAct: the technical work on the text is finally over. Now comes the ungrateful task of cleaning up the text, which should be ready in the coming hours.
- Building ML Models like Open-Source Software - Colin Raffel | Stanford MLSys #72: Episode 72 of the Stanford MLSys Seminar āFoundation Models Limited Seriesā!Speaker: Colin RaffelTitle: Building Machine Learning Models like Open-Source Sofā¦
- Tweet from Shawn Presser (@theshawwn): Facebook is aggressively going after LLaMA repos with DMCAās. llama-dl was taken down, but that was just the beginning. Theyāve knocked offline a few alpaca repos, and maintainers are making tā¦
- Glazeās plagiarism is hilarious and indefensible: Posted in r/StableDiffusion by u/AloneSignificance555 ⢠46 points and 48 comments
- Pallas implementation of attention doesnāt work on CloudTPU Ā· Issue #18590 Ā· google/jax: Description import jax import jax.numpy as jnp from jax.experimental.pallas.ops import attention bs = 2 seqlen = 1000 n_heads = 32 dim = 128 rng = jax.random.PRNGKey(0) xq = jax.random.normal(rng, ā¦
- Glazeās plagiarism is hilarious and indefensible: Posted in r/StableDiffusion by u/AloneSignificance555 ⢠45 points and 48 comments
- The Mirage of Open-Source AI: Analyzing Metaās Llama 2 Release Strategy ā Open Future: In this analysis, I review the Llama 2 release strategy and show its non-compliance with the open-source standard. Furthermore, I explain how this case demonstrates the need for more robust governanceā¦
- Reddit - Dive into anything: no description found
▷ #research (89 messages🔥🔥):
-
SEED Approach for Class-Incremental Learning:
@xylthixlm
provided a link to a paper on arXiv about SEED, a method for finetuning Mixture of Experts (MoE) models by freezing all experts but one for each new task. This specialization is expected to enhance model performance Research Paper. -
Backdoor Attacks on LLMs through Poisoning and CoT:
@ln271828
gave a TL;DR of a research paper indicating that a new backdoor attack on large language models (LLMs) can be enhanced via chain-of-thought (CoT) prompting, while current techniques like supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) are ineffective against these attacks Research Paper. -
Combining AxeS PReconditioners (CASPR) Optimization Technique:
@clashluke
discussed a paper on CASPR, an optimization method that outperforms the Shampoo algorithm by finding different preconditioners for each axis of the matrix-shaped neural network parameters Research Paper. -
Zero Pipeline Bubbles in Distributed Training:
@pizza_joe
shared a paper that introduces a scheduling strategy claiming to be the first to achieve zero pipeline bubbles in large-scale distributed synchronous training, with a novel technique to bypass synchronizations during the optimizer step Research Paper. -
Generality in Depth-Conditioned Image Generation with LooseControl:
@digthatdata
linked a GitHub repository and paper for LooseControl, which generalizes depth conditioning for diffusion-based image generation, allowing creation and editing of complex scenes with minimal guidance GitHub Repo, Paper Page, Tweet Discussion.
Links mentioned:
- Stabilizing Transformer Training by Preventing Attention Entropy Collapse: Training stability is of great importance to Transformers. In this work, we investigate the training dynamics of Transformers by examining the evolution of the attention layers. In particular, we tracā¦
- Analyzing and Improving the Training Dynamics of Diffusion Models: Diffusion models currently dominate the field of data-driven image synthesis with their unparalleled scaling to large datasets. In this paper, we identify and rectify several causes for uneven and ineā¦
- Divide and not forget: Ensemble of selectively trained experts in Continual Learning: Class-incremental learning is becoming more popular as it helps models widen their applicability while not forgetting what they already know. A trend in this area is to use a mixture-of-expert techniqā¦
- Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training: Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. ā¦
- Combining Axes Preconditioners through Kronecker Approximation forā¦: Adaptive regularization based optimization methods such as full-matrix Adagrad which use gradient second-moment information hold significant potential for fast convergence in deep neural networkā¦
- Zero Bubble Pipeline Parallelism: Pipeline parallelism is one of the key components for large-scale distributed training, yet its efficiency suffers from pipeline bubbles which were deemed inevitable. In this work, we introduce a scheā¦
- no title found: no description found
- Tweet from Shariq Farooq (@shariq_farooq): @ak LooseControl can prove to be a new way to design complex scenes and perform semantic editing e.g. Model understands how lighting changes with the edits: (2/2)
- memory-transformer-pt4/src/optimizer/spectra.py at main Ā· Avelina9X/memory-transformer-pt4: Contribute to Avelina9X/memory-transformer-pt4 development by creating an account on GitHub.
- Tweet from AK (@_akhaliq): LooseControl: Lifting ControlNet for Generalized Depth Conditioning paper page: https://huggingface.co/papers/2312.03079 present LooseControl to allow generalized depth conditioning for diffusion-baā¦
- GitHub - shariqfarooq123/LooseControl: Lifting ControlNet for Generalized Depth Conditioning: Lifting ControlNet for Generalized Depth Conditioning - GitHub - shariqfarooq123/LooseControl: Lifting ControlNet for Generalized Depth Conditioning
- arXiv user login: no description found
- Add freeze_spectral_norm option Ā· d8ahazard/sd_dreambooth_extension@573d1c9: See https://arxiv.org/abs/2303.06296 This adds an option to reparametrize the model weights using the spectral norm so that the overall norm of each weight can't change. This helps to stabiliā¦
- d8ahazard - Overview: d8ahazard has 171 repositories available. Follow their code on GitHub.
▷ #interpretability-general (9 messages🔥):
- Seeking Interpretability Resources: User
@1_glados
expressed they are new to interpretability and looking for good resources or a list of papers to start with, while@neelnanda
inquired about the use of sparse autoencoders in initial NLP interpretability research. - Sparse Autoencoders in NLP History: User
@nsaphra
discussed the recurring themes in sparse dictionary learning, spanning from the latent semantic allocation era to the present, noting the inconsistent citations of predecessors and challenging the meaningfulness of a definition of mechanistic interpretability that includes such approaches. - Introducing Patchscopes for Representation Decoding:
@stellaathena
shared a Twitter thread by @ghandeharioun that introduces Patchscopes, a framework for decoding specific information from a modelās representations. - Learning Dynamics for Interpretability Questioned: Responding to its relevance,
@stellaathena
also questioned whether scoring high on next-token prediction with Patchscopes indeed correlates with identifying a modelās best guess as to the answer after a certain layer, implying that higher performance might not equate to better understanding. - Potential and Concerns of Patchscopes: User
@mrgonao
sees significant potential in using Patchscopes for information extraction from hidden states in models like RWKV and Mamba, but also voiced concerns about potential hallucinations and the need for robustness checks in multi-token generation.
Links mentioned:
Tweet from Asma Ghandeharioun (@ghandeharioun): š§µCan we āaskā an LLM to ātranslateā its own hidden representations into natural language? We propose š©ŗPatchscopes, a new framework for decoding specific information from a representation by āpatchinā¦
▷ #gpt-neox-dev (1 message):
- NVIDIAās Apex Update Could Speed Up NeoX Build:
@catboy_slim_
highlighted a commit from NVIDIAās apex repository, noting the need to fork and trim the code to accelerate the build process for fused adamw, as currently the full build takes about half an hour. They suggested that, despite the build time increase, the updated branch is likely ready for testing as it works on their machine.
Links mentioned:
Squashed commit of https://github.com/NVIDIA/apex/pull/1582 Ā· NVIDIA/apex@bae1f93: commit 0da3ffb92ee6fbe5336602f0e3989db1cd16f880 Author: Masaki Kozuki <[email protected]> Date: Sat Feb 11 21:38:39 2023 -0800 use nvfuser_codegen
commit 7642c1c7d30de439feb35ā¦
LM Studio Discord Summary
- LM Studio's Range of Support and Future Improvements: Discussions centered on LM Studio's capabilities and limitations. @heyitsyorkie clarified that GGUF quant models from Huggingface are supported, but loading and unloading models must be managed manually. Image generation is out of scope for LM Studio, with users directed towards Stable Diffusion for such tasks. Compatibility gaps were noted, such as the lack of support for CPUs without AVX instructions, and a potential future update may add Intel Mac support, which is not currently offered. Users experiencing persistent errors after reinstalling Windows were directed to a Discord link for troubleshooting assistance.
- The Great GPU Discussion: Hardware conversations heated up with talk of investing in high-performance Nvidia 6000-series cards and awaited upgrades like the P40 card. Comparisons were made between Nvidia RTX 6000 Ada Generation cards and more cost-effective alternatives for Large Language Model (LLM) tasks. Some favor Mac Studios over PCs for their better memory bandwidth, while others appreciate the Mac cache architecture's benefits for LLM work. A debate over Nvidia card compatibility and GPU utilization also ensued, with suggestions provided for maximizing GPU performance.
- Model-Focused Dialogues Reveal Community Preferences: In model-related chats, @dagbs clarified that terms such as "Dolphin 2.7" and "Synthia" refer to finetuners, and directed those interested in comparisons towards specific Dolphin-based models on various platforms. GGUF-formatted models were highlighted for their popularity and compatibility, and models suited to specific hardware were recommended, such as Deepseek Coder 6.7B for an RTX 3060 mobile. The efficacy of models was also debated, with @.ben.com advocating for judging model performance beyond leaderboard scores.
- Beta Releases Beckon Feedback for Fixes: The latest Windows beta reported issues with VRAM capacity displays, which is particularly relevant for cards like the 6600XT AMD card, where OpenCL issues were identified. Beta releases V5/V6 aimed to fix RAM/VRAM estimate bugs, and the community was asked for feedback. ARM support queries for beta installations on a Jetson NVIDIA board were addressed, confirming that support is currently limited to Mac Silicon. The rapid speed improvements in the latest update sparked discussion, with @yagilb sharing a Magic GIF in lighthearted response.
- CrewAI Over Autogen in Automation Showdown: @MagicJim expressed a preference for crewAI, especially for the potential to integrate multiple LLMs served from LM Studio. Contrary to earlier assumptions, it was clarified that crewAI does allow a different LLM for each agent, with a YouTube video provided as a demonstration. A workaround for running multiple LLM API instances on different ports was discussed, addressing utilization concerns.
- Emerging Tools and Integrations Enhance Capabilities: @happy_dood showcased how LM Studio and LangChain can be used together, detailing a chain creation, templating, and output-parsing flow for streamlined AI interactions; a sketch of that setup follows this list. On the code front, users experimented with models like DeepseekCoder33B for open-interpreter tasks, with evaluations suggesting better performance might come from models more focused on coding.
LM Studio Channel Summaries
ā· #š¬-general (122 messagesš„š„):
-
Clarification on GGUF and Quant Models:
@heyitsyorkie
clarified that LM Studio only supports GGUF quant models from Huggingface and advised@ubersuperboss
that model loading and unloading have to be manually done within LMStudio. They also discussed that LMStudio is not suitable for image generation and directed users towards Stable Diffusion for such tasks. -
Image Generation Models Query:
@misc_user_01
inquired about the possibility of LM Studio adding support for image generation models, to which@heyitsyorkie
replied that it isnāt in scope for LMStudio, as they serve different use cases. However, they did point to Stable Diffusion + automatic1111 for users interested in image generation. -
LM Studio Support and Installation Discussions: Various users including
@cyberbug_scalp
,@ariss6556
, and@__vanj__
discussed technical issues and queries regarding system compatibility and installation of LM Studio, with@heyitsyorkie
and others offering technical advice, such as LM Studioās lack of support for CPUs without AVX1/2 instructions. -
Model Recommendations and GPU Advice:
@heyitsyorkie
answered several questions related to model suggestions for specific hardware setups like for@drhafezzz
ās M1 Air, and confirmed that LM Studio supports multi-GPU setups, recommending matching pairs for optimal performance. -
Interest in Intel Mac Support Expressed: Users
@kujila
and@katy.the.kat
expressed their desire for LM Studio to support Intel Macs, which@yagilb
acknowledged is not currently supported due to the focus on Silicon Macs but mentioned there are plans to enable support in the future.
Links mentioned:
- HuggingChat: no description found
- GitHub - comfyanonymous/ComfyUI: The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.: The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface. - GitHub - comfyanonymous/ComfyUI: The most powerful and modular stable diffusion GUI, api and backā¦
- ggml : add Flash Attention by ggerganov Ā· Pull Request #5021 Ā· ggerganov/llama.cpp: ref #3365 Setting up whatās needed for adding Flash Attention support to ggml and llama.cpp The proposed operator performs: // unfused kq = ggml_mul_mat (ctx, k, q); kq = ggml_scale (ctx, kq,ā¦
ā· #š¤-models-discussion-chat (82 messagesš„š„):
-
Model Confusion Cleared Up:
@dagbs
clarified that the terms like āDolphin 2.7ā, āSynthiaā, and āNous-Hermesā refer to different finetuners, which are combinations of models and datasets to create new models. This response was in aid of confusion from@lonfus
. -
Where to Find Model Comparisons: In response to
@lonfus
requesting model comparisons,@dagbs
directed them to previous posts in channel <#1185646847721742336> for personal model recommendations and provided links to Dolphin-based models that he recommends, including Dolphin 2.7 Mixtral and MegaDolphin 120B. -
GGUF Format Gains Popularity: A series of messages from
@conic
,@kadeshar
,@jayjay70
, and others discussed various places to find GGUF formatted models, including Hugging Face, LLM Explorer, and GitHub, highlighting its widespread adoption for model compatibility. -
Resource-Specific Model Recommendations: Users, including
@heyitsyorkie
and@ptable
, recommended models suitable for various hardware specsāfor instance, Deepseek coder 6.7B was suggested for an RTX 3060 mobile with 32GB RAM, and models under 70B parameters for a system with Ryzen 9 5950x and a 3090Fe GPU. -
Discussions on Model Efficacy and Performance:
@.ben.com
provided insights on model performance being potentially misleading with leaderboard scores and suggested consulting spaces like Mike Ravkineās AI coding results for more realistic appraisals. They further noted the high cost-effectiveness of using GPT-4 Turbo over procuring new hardware for running large models.
Links mentioned:
- lodrick-the-lafted/Grafted-Titanic-Dolphin-2x120B Ā· Hugging Face: no description found
- Can Ai Code Results - a Hugging Face Space by mike-ravkine: no description found
- LMSys Chatbot Arena Leaderboard - a Hugging Face Space by lmsys: no description found
- Open LLM Leaderboard - a Hugging Face Space by HuggingFaceH4: no description found
- Best Open-Source Language Models, All Large Language Models: no description found
- yunconglong/Truthful_DPO_TomGrc_FusionNet_7Bx2_MoE_13B Ā· Hugging Face: no description found
- nous-hermes-2-34b-2.16bpw.gguf Ā· ikawrakow/various-2bit-sota-gguf at main: no description found
- dagbs/TinyDolphin-2.8-1.1b-GGUF Ā· Hugging Face: no description found
- google/t5-v1_1-xxl Ā· Hugging Face: no description found
- TheBloke/deepseek-coder-6.7B-instruct-GGUF Ā· Hugging Face: no description found
- GitHub - lmstudio-ai/model-catalog: A collection of standardized JSON descriptors for Large Language Model (LLM) files.: A collection of standardized JSON descriptors for Large Language Model (LLM) files. - GitHub - lmstudio-ai/model-catalog: A collection of standardized JSON descriptors for Large Language Model (LLMā¦
- TheBloke (Tom Jobbins): no description found
ā· #š§ -feedback (5 messages):
- Identifying Recurrent LM Download Failures:
@leo_lion_king
suggested that failed LM downloads should be automatically deleted and marked to prevent re-downloading faulty models since users only discover errors after attempting to load them. - Unknown Model Error Triggers Inquiry:
@tobyleung.
posted a detailed JSON error output indicating an unknown error and suggesting to check if thereās enough available memory to load the model. It included details about RAM, GPU, OS, and the application used. - Reinstallation Doesnāt Clear Error: In a follow-up,
@tobyleung.
expressed confusion over persisting errors despite reinstalling Windows. - Discord Link for Error Investigation:
@dagbs
provided a Discord link that apparently explains the cause of the error but no additional context was given. - Request for Retrieval of Old Model: After discussing error issues,
@tobyleung.
asked if it would be possible to revert to their old model.
ā· #š-hardware-discussion (48 messagesš„):
- Graphics Card Strategy Evaluations:
@gtgb
was convinced to invest in a high-performance Nvidia 6000 series card after seeing Mervinās performance videos, prompting dialogue on card compatibility and choices for model execution rigs. - Awaiting Hardware Upgrades:
@pefortin
mentioned they are waiting for a P40 card, indicating a āpoor manās rig,ā to which@doderlein
replied they are expecting the same hardware arrival soon. - Powerful Cards Stimulate Envy:
@doderlein
acknowledged the significant capabilities of the Nvidia RTX 6000 Ada Generation card shared by@gtgb
in the product page link, emphasizing its high cost. - Mac Versus PC for LLMs: A debate over hardware choices surfaced, with
@heyitsyorkie
favoring a Mac Studio over PC solutions for LLM tasks due to better memory bandwidth and a more attractive home setup, while@.ben.com
pointed out the benefits of Macās cache architecture for such work. - GPU Utilization Discussions:
@omgitsprovidence
inquired about low GPU utilization,@heyitsyorkie
advised trying the ROCm beta for better AMD performance, and@dagbs
offered@misangenius
guidance on maximizing GPU offload for better response times when running models.
Links mentioned:
NVIDIA RTX 6000 Ada Generation Graphics Card: Powered by the NVIDIA Ada Lovelace Architecture.
ā· #š§Ŗ-beta-releases-chat (29 messagesš„):
-
VRAM Vanishes in Beta:
@eimiieee
reported the latest windows beta shows estimated VRAM capacity as 0 on a 6600XT AMD card.@yagilb
suggested there were issues with OpenCL in the latest beta and pointed toward trying the AMD ROCm beta. -
VRAM Estimate Bug Squashed:
@yagilb
announced Beta V5/V6, which fixed several bugs, and asked for feedback on RAM/VRAM estimates on the search page, hinting at tweaks in the calculation. -
Compatibility Queries for Jetson NVIDIA:
@quantman74
inquired about arm64 architecture support for installing the beta on a Jetson NVIDIA board.@heyitsyorkie
clarified there was no ARM support outside of Mac Silicon, and@yagilb
encouraged the creation of a feature request for it. -
Speedy Improvements Spark Curiosity:
@mmonir
commented on the doubled speed in the latest update, prompting@heyitsyorkie
to link a humorous gif, while@n8programs
also expressed curiosity about the changes that led to the speed improvements. -
Case Sensitivity Causes Model Mayhem:
@M1917Enfield
discovered and solved a problem where model folders with different case sensitivities were not being detected by LM Studio by renaming the folder to match the expected case.@yagilb
acknowledged the successful problem-solving.
Links mentioned:
Magic GIF - Magic - Discover & Share GIFs: Click to view the GIF
ā· #autogen (1 messages):
meadyfricked: Never got autogen working with LM Studio but crew-ai seems to work.
ā· #langchain (1 messages):
- LangChain Integration with LM Studio:
@happy_dood
provided an example of how LM Studio and LangChain can be used together, showcasing new class implementations. The code snippet demonstrates the creation of a ChatOpenAI instance, crafting a prompt with ChatPromptTemplate, parsing output with StrOutputParser, and combining these elements in a streamlined process.
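For readers who want to reproduce this pattern, here is a minimal sketch (not @happy_dood's exact snippet) of wiring LangChain's LCEL pieces to LM Studio's OpenAI-compatible local server; it assumes the server is running on the default http://localhost:1234/v1 and that langchain-openai and langchain-core are installed.

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Point the OpenAI-compatible client at LM Studio instead of api.openai.com.
llm = ChatOpenAI(
    openai_api_base="http://localhost:1234/v1",  # LM Studio's local server (assumed default port)
    openai_api_key="lm-studio",                  # any non-empty string; LM Studio ignores it
    temperature=0.2,
)

# Craft a prompt template, then parse the model output down to a plain string.
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise assistant."),
    ("human", "{question}"),
])
chain = prompt | llm | StrOutputParser()

print(chain.invoke({"question": "Summarize what a RAG pipeline does in one sentence."}))
```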
ā· #crew-ai (10 messagesš„):
- MagicJim Weighs in on Automation Tools:
@MagicJim
shared his preference for crewAI over autogen due to the idea of integrating multiple LLMs in LM Studio. He suggested that using specific models like deepseek coder for coder agents would be beneficial. - Discussing Autogenās Flexibility with LLMs:
@sitic
observed that autogen allows using a different LLM for each agent, unlike crewAI, which seems to only use one. This feature is important for creating agents with distinct capabilities. - Clarification on crewAIās LLM Usage:
@MagicJim
clarified that crewAI does allow using different LLMs for each agent and shared a YouTube video demonstrating this functionality. - Running Multiple Instances of LLMs:
@senecalouck
suggested the workaround of running multiple instances of LLMs if the hardware supports it, using different ports for the API. - Integration Issues with LM Studio:
@motocycle
inquired if anyone had successfully integrated crewAI with the LM Studio endpoint, mentioning success with ollama but facing issues with LM Studio.
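A rough sketch of the workaround discussed above (two LM Studio server instances on different ports, each backing a different crewAI agent) might look like the following; crewAI's constructor arguments have changed across releases (newer versions also require an expected_output on Task), so treat this as illustrative rather than exact.

```python
from crewai import Agent, Task, Crew
from langchain_openai import ChatOpenAI

# Two LM Studio instances, each serving a different model on its own port (assumed setup).
coder_llm = ChatOpenAI(openai_api_base="http://localhost:1234/v1", openai_api_key="lm-studio")
writer_llm = ChatOpenAI(openai_api_base="http://localhost:1235/v1", openai_api_key="lm-studio")

coder = Agent(
    role="Coder",
    goal="Write small, correct Python utilities",
    backstory="A careful engineer who tests before shipping.",
    llm=coder_llm,   # e.g. a DeepSeek Coder model loaded in the first instance
)
writer = Agent(
    role="Writer",
    goal="Explain code to non-programmers",
    backstory="A technical writer who favors plain language.",
    llm=writer_llm,  # e.g. a general chat model loaded in the second instance
)

tasks = [
    Task(description="Write a function that deduplicates a list while keeping order.", agent=coder),
    Task(description="Explain the function above in two sentences.", agent=writer),
]

crew = Crew(agents=[coder, writer], tasks=tasks)
print(crew.kickoff())
```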
Links mentioned:
CrewAI: AI-Powered Blogging Agents using LM Studio, Ollama, JanAI & TextGen: š Welcome to an exciting journey into the world of AI-powered blogging! šIn todayās video, I take you through a comprehensive tutorial on using Crew AI to ā¦
ā· #open-interpreter (7 messages):
- Parsing Error in
system_key.go
:@gustavo_60030
noted an error insystem_key.go
where the system could not determine NFS usage. The error message mentioned an inability to parse/etc/fstab
, specifically the dump frequency, which said āinformation.ā - Model Experiments for Open Interpreter:
@pefortin
discussed experimenting with DeepseekCoder33B for open interpreter and mentioned that while Mixtral 8x7B instruct 5BPW is performing okay, itās struggling with identifying when to write code. - Model Recommendation Request: Seeking a model suited for coding tasks,
@pefortin
expressed an interest in trying out models that are focused on coding, like wizard, etc. - Model Comparison for Coding:
@impulse749
inquired if DeepseekCoder33B is the best for coding tasks, to which another offered that deepseek-coder-6.7b-instruct might be a faster and more focused option for solely coding-related tasks.
Mistral Discord Summary
-
French Language Support Sparks Interest: Users suggested the addition of a French support channel within the Mistral Discord community, reflecting a demand for multilingual assistance.
-
Data Extraction Strategies and Pricing Discussions: There was an exchange of strategies for data extraction such as using BNF grammar and in-context learning, alongside inquiries about Mistralās pricing model where it was clarified that 1M tokens correspond to 1,000,000 tokens, including both input and output.
-
Interfacing AI with 3D Animation and Function Calling: Questions arose about integrating Mistral AI with 3D characters for real-time interaction, discussing complexities like animation rigging and API compatibility, as well as implementation queries about function calling akin to OpenAIās APIs.
-
Hosting and Deployment Insights for Mistral: Users shared resources such as partITech/php-mistral on GitHub for running MistralAi with Laravel, and experiences regarding VPS hosting, on-premises hosting, and using Skypilot for Lambda Labs. Additionally, using Docker for Mistral deployment was suggested.
-
Focusing on Fine-Tuning and Model Use Cases: Conversations revolved around fine-tuning strategies such as creating datasets in Q&A JSON format, the importance of data quality ("garbage in, garbage out"), and troubleshooting Mistral fine-tuning with tools like axolotl. Questions were also raised about whether a version of Mistral highly optimized for French-language tasks exists or is planned.
Mistral Channel Summaries
ā· #general (154 messagesš„š„):
-
Demand for a French Support Channel: User
@gbourdin
expressed that the Mistral Discord could benefit from a French support channel (Ƨa manque de channel FR
), which elicited agreement from another user,@aceknr
. -
Quest for Data Extraction Strategies:
@gbourdin
sought advice on strategies for extracting data, like postal codes or product searches, from discussions. Whereas@mrdragonfox
proposed using BNF grammar and in-context learning due to limited API support for this use case. -
Clarification on Mistral Pricing Model:
@nozarano
asked for clarification on the pricing for āmistral-medium,ā with explanation provided by@ethux
and@mrdragonfox
, defining that 1M tokens represent 1,000,000 and that both input and output tokens count towards pricing. -
AI-Driven 3D Character Interaction: User
@madnomad4540
inquired about integrating Mistral AI with a 3D character and real-time user interaction.@mrdragonfox
indicated the challenges and separated aspects involved in the venture, such as animation rigging and integrating with APIs like Google Cloud Vision. -
Exploring Assistants API and Function Calling: User
@takezo07
queried about the implementation of function calling and threads like OpenAIās Assistants APIs, while@i_am_dom
noted that such functionality could be programmed using the API directly, and@.elekt
mentioned that official support for function calling isnāt available in Mistral API.
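Since official function calling was not yet exposed, one way to program it "using the API directly", as suggested above, is to ask the model for a JSON tool call and dispatch it yourself. The sketch below is only an illustration: the tool registry and prompt are invented, and it assumes a MISTRAL_API_KEY environment variable and the "mistral-small" model name.

```python
import json
import os
import requests

TOOLS = {
    "get_weather": lambda city: f"Sunny and 21C in {city}",  # hypothetical tool
}

SYSTEM = (
    "You can call one tool: get_weather(city). "
    'When a tool is needed, reply ONLY with JSON like {"tool": "get_weather", "arguments": {"city": "Paris"}}. '
    "Otherwise reply normally."
)

def chat(messages):
    # Mistral's La Plateforme exposes an OpenAI-style chat completions endpoint.
    resp = requests.post(
        "https://api.mistral.ai/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
        json={"model": "mistral-small", "messages": messages},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

messages = [
    {"role": "system", "content": SYSTEM},
    {"role": "user", "content": "What's the weather like in Paris?"},
]
reply = chat(messages)

try:
    call = json.loads(reply)                          # model chose to call a tool
    result = TOOLS[call["tool"]](**call["arguments"])
    messages += [
        {"role": "assistant", "content": reply},
        {"role": "user", "content": f"Tool result: {result}. Answer the user."},
    ]
    print(chat(messages))
except (json.JSONDecodeError, KeyError, TypeError):
    print(reply)                                      # plain answer, no tool call
```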
Links mentioned:
- Vulkan Implementation by 0cc4m Ā· Pull Request #2059 Ā· ggerganov/llama.cpp: Iāve been working on this for a while. Vulkan requires a lot of boiler plate, but it also gives you a lot of control. The intention is to eventually supercede the OpenCL backend as the primary widā¦
- Vulkan Backend from Nomic Ā· Issue #2033 Ā· jmorganca/ollama: https://github.com/nomic-ai/llama.cpp GPT4All runs Mistral and Mixtral q4 models over 10x faster on my 6600M GPU
ā· #models (5 messages):
-
Seeking Fiction-Guidance with Instruct:
dizzytornado
inquired whether Instruct has guardrails specifically for writing fiction. The context and responses are not provided in the chat logs. -
A Shoutout to Mistral:
thenetrunna
expressed affection for Mistral without further context or elaboration. -
Demand for French-Optimized Mistral:
luc312
asked if there is a version of Mistral more optimized for reading/writing French or if using a strong system prompt is the only way to guide Mistral to communicate in French. -
Clarification on Multilingual Model Capabilities:
tom_lrd
clarified that tiny-7b isnāt officially built for French, having limited French abilities due to lack of targeted training, whereas Small-8x7b is officially multilingual and trained to speak French.
ā· #deployment (6 messages):
- Integrating Mistral with PHP:
@gbourdin
provided a useful resource with a link to GitHub - partITech/php-mistral, indicating that it can be used to run MistralAi with Laravel. - Seeking VPS Hosting Details:
@ivandjukic
inquired about hosting providers for VPS with a proper GPU, noting the expense or misunderstanding regarding the cost. - Client Data Secured with On-premises Hosting:
@mrdragonfox
assured that when Mistral is hosted in the clientās data center, Mistral would never get access to your data. - Hobbyist Hosting Insights:
@vhariational
shared personal experience as a hobbyist not needing the biggest GPUs, and recommends using Lambda Labs via Skypilot for occasional testing of larger models. - Suggestion for Docker Deployment:
@mrdomoo
suggested setting up a Docker server and using the python client for Mistral deployment.
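One possible shape of that Docker suggestion (the specific stack here, vLLM's OpenAI-compatible image plus the openai Python client, is my choice and was not named in the discussion) is sketched below; it assumes an NVIDIA GPU with the container toolkit and enough VRAM for Mistral-7B-Instruct.

```python
# Start the server first (shell command shown as a comment to keep this file runnable):
#   docker run --gpus all -p 8000:8000 \
#     -v ~/.cache/huggingface:/root/.cache/huggingface \
#     vllm/vllm-openai --model mistralai/Mistral-7B-Instruct-v0.2
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.2",
    messages=[{"role": "user", "content": "Give me one sentence about Docker."}],
)
print(resp.choices[0].message.content)
```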
Links mentioned:
GitHub - partITech/php-mistral: MistralAi php client: MistralAi php client. Contribute to partITech/php-mistral development by creating an account on GitHub.
ā· #ref-implem (2 messages):
-
Quest for Ideal Table Format in Mistral:
@fredmolinamlgcp
inquired about the best way to format table data when using Mistral. They contrasted the pipe-separated format used for models like bison, unicorn, and gemini with a "textified" approach they've been taking with Mistral by converting pandas dataframe rows into a string of headers and values. -
Sample Textified Table Prompt Provided:
@fredmolinamlgcp
shared an example of a ātextifiedā table prompt for Mistral. They demonstrated how they structure the input by including an instructional tag followed by neatly formatted campaign data (e.g., campaign id 1193, campaign name Launch Eventā¦).
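A small sketch of that "textified" approach is shown below: each pandas row becomes a line of "header: value" pairs pasted into the prompt. The column names and the [INST] wrapper are illustrative, not the exact format shared in the channel.

```python
import pandas as pd

df = pd.DataFrame([
    {"campaign id": 1193, "campaign name": "Launch Event", "spend": 5400, "clicks": 1200},
    {"campaign id": 1207, "campaign name": "Spring Promo", "spend": 2100, "clicks": 430},
])

def textify(frame: pd.DataFrame) -> str:
    # One record per line: "campaign id: 1193, campaign name: Launch Event, ..."
    return "\n".join(
        ", ".join(f"{col}: {row[col]}" for col in frame.columns)
        for _, row in frame.iterrows()
    )

prompt = (
    "[INST] You are given campaign data, one record per line.\n"
    f"{textify(df)}\n"
    "Which campaign had the lower cost per click? [/INST]"
)
print(prompt)
```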
ā· #finetuning (51 messagesš„):
- GPT-3 Costs and Alternatives for Data Extraction:
@cheshireai
mentioned using GPT-turbo 16k for extracting data from PDFs and creating a dataset, though they had to discard many bad results due to the large volume of documents processed. - Creating Q&A JSON Format for Dataset Construction:
@dorumiru
is seeking advice on creating a programming task to extract data from PDFs, chunk it, and use an API like palm2 to generate a dataset in a Q&A JSON format for subsequent training. - Chunking Techniques and Resource Suggestions: In response to
@dorumiru's
question about advanced PDF chunking techniques,@ethux
shared a YouTube video called āThe 5 Levels Of Text Splitting For Retrieval,ā which discusses various methods of chunking text data. - Recommendations and Warnings for Fine-Tuning Tools:
@mrdragonfox
advised caution when using tools like Langchain due to complex dependencies and shared a GitHub link toprivateGPT
, a basic tool for document interaction. They also emphasized āgarbage in, garbage outā highlighting the significance of quality data. - Issues with Configuring Mistral for Fine-Tuning:
@distro1546
inquired about the proper command line for fine-tuning Mistral using the axolotl tool, how to adjustconfig.yml
for their dataset, and posted a discussion thread on GitHub for troubleshooting (https://github.com/OpenAccess-AI-Collective/axolotl/discussions/1161).
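For readers hitting the same wall, a minimal sketch of the axolotl workflow being asked about is shown below: write a small QLoRA config and launch training with accelerate. The keys mirror axolotl's published Mistral QLoRA examples, but the dataset path, output directory, and hyperparameters are placeholders rather than values from the thread.

```python
import subprocess
import yaml

config = {
    "base_model": "mistralai/Mistral-7B-v0.1",
    "model_type": "MistralForCausalLM",
    "tokenizer_type": "LlamaTokenizer",
    "load_in_4bit": True,          # QLoRA: 4-bit base weights
    "adapter": "qlora",
    "lora_r": 32,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "lora_target_linear": True,
    "sequence_len": 2048,
    "datasets": [{"path": "my_pdf_qa.jsonl", "type": "alpaca"}],  # your generated Q&A data
    "val_set_size": 0.05,
    "micro_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "num_epochs": 3,
    "learning_rate": 2e-4,
    "optimizer": "adamw_bnb_8bit",
    "lr_scheduler": "cosine",
    "output_dir": "./out/mistral-qlora",
}

with open("qlora-mistral.yml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)

# Equivalent to running `accelerate launch -m axolotl.cli.train qlora-mistral.yml` in a shell.
subprocess.run(["accelerate", "launch", "-m", "axolotl.cli.train", "qlora-mistral.yml"], check=True)
```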
Links mentioned:
- Trouble using custom dataset for finetuning mistral with qlora Ā· OpenAccess-AI-Collective/axolotl Ā· Discussion #1161: OS: Linux (Ubuntu 22.04) GPU: Tesla-P100 I am trying to fine-tune mistral with qlora, but Iām making some mistake with custom dataset formatting and/or setting dataset parameters in my qlora.yml fā¦
- The 5 Levels Of Text Splitting For Retrieval: Get Code: https://fullstackretrieval.com/Get updates from me: https://mail.gregkamradt.com/* https://www.chunkviz.com/ Gregās Info:- Twitter: https://twitterā¦
- GitHub - imartinez/privateGPT: Interact with your documents using the power of GPT, 100% privately, no data leaks: Interact with your documents using the power of GPT, 100% privately, no data leaks - GitHub - imartinez/privateGPT: Interact with your documents using the power of GPT, 100% privately, no data leaks
- GitHub - OpenAccess-AI-Collective/axolotl: Go ahead and axolotl questions: Go ahead and axolotl questions. Contribute to OpenAccess-AI-Collective/axolotl development by creating an account on GitHub.
ā· #showcase (1 messages):
-
LibreChat: A Mix-and-Match Chatbot Platform: User
@dannyavila
presented LibreChat, a versatile platform that supports using the Mistral API alongside other services such as Openrouter, Azure OpenAI, and more. The platform offers features like AI model switching, message search, and is completely open-source for self-hosting, available here. -
Explore LibreChatās Underlying Mechanics: For users interested in diving deeper,
@dannyavila
shared the link to the documentation at docs.librechat.ai, providing insights on how to make the most of LibreChatās expansive features. -
LibreChatās Open Source Cred: Boasting a generous open-source ethos, LibreChat is under the MIT license, showcasing community trust with 6.6k stars and 1.1k forks on its repository.
Links mentioned:
GitHub - danny-avila/LibreChat: Enhanced ChatGPT Clone: Features OpenAI, GPT-4 Vision, Bing, Anthropic, OpenRouter, Google Gemini, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secure Multi-User System, Presets, completely open-source for self-hosting. More features in development: Enhanced ChatGPT Clone: Features OpenAI, GPT-4 Vision, Bing, Anthropic, OpenRouter, Google Gemini, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secureā¦
ā· #la-plateforme (13 messagesš„):
- Newcomer Questioning Ease of Use:
@mrrobot7778
expressed concern about the usability of Mistral AI for someone new to the field, doubting if itās meant for users without expertise. - Beam Search Debate: There was confusion regarding the presence of a beam search option in the OpenAI API.
@casper_ai
linked to the API documentation asserting its existence, while@rabdullin
questioned the underlying mechanism. - Under the Hood of Beam Search:
@rabdullin
inquired if the OpenAI API actually runs a beam search or just generates independent outputs.@casper_ai
admitted uncertainty about the specific process but mentioned its effectiveness. - Authentication Concerns Shared:
@pastillafit
raised issues with the authentication process when using the API, specifically regarding password management and lack of two-factor authentication (2FA). They found a workaround for 2FA during password reset but reported it not affecting the console login. - Mistral Mediumās Instruction Following Queried:
@gooningconstantly
asked if mistral-medium is tuned for instruction following, noticing that it sometimes ignores instructions provided in thesystem
role message content.
Perplexity AI Discord Summary
- Swift Batch 6 Perplexity Activation:
@yellephen
experienced an instant activation of Perplexity Pro after being in batch 6. - Rabbit R1 Bundle Deal:
@martsw71
faced hurdles activating Perplexity Pro from a Rabbit R1 purchase;@ok.alex
recommended the consistent use of email across services. - Customize Your Search in Brave:
@witchfinder17
sought advice on making Perplexity the default search engine in Brave; meanwhile,@samangel7358
highlighted the importance of distinguishing between Perplexity AI Search and Companion extensions. - AIās YouTube Homework:
@chiefblink117
was curious to know if Perplexity pulls information from YouTube video audio, clarified by@icelavaman
to be using video transcripts via a YouTube API. - Clash of the AI Titans: A lively debate by
@b4d_7r1p_
and@lord.wex
compared Perplexity Premium and GPT-4 Premium, noting Perplexityās competitive edge in offering access to various premium models, though it lags behind in image generation capabilities. - Stay in Your Lane: In the channel,
@ok.alex
helped to guide@kabbe_the_dude
to the appropriate channel for project sharing, stressing on content organization. - C# Voyage Reporting:
@whoistraian
updated on their progress in learning C# with an imminent exam on January 31, supported by a link: Can you help. - Share and Prosper: Pro users at Perplexity, like
@neuralspace
, spread the love by sharing Perplexity AI referral codes. - API Await on Context Extension: A singular message from
@commuting5048
asked about extending support to a 32k context length in the API; however, no updates or responses followed.
Perplexity AI Channel Summaries
ā· #general (99 messagesš„š„):
- Instant Perplexity Pro Activation:
@yellephen
mentioned instantly receiving a Perplexity Pro link after being in batch 6. - Rabbit R1 Purchase Comes With Perplexity Pro:
@martsw71
discussed issues with activating Perplexity Pro using a link from a Rabbit R1 purchase, and@ok.alex
suggested ensuring the same email is used across services and trying the web version for subscription. - Setting Perplexity as Default Search in Brave:
@witchfinder17
asked about setting Perplexity as the default search in Brave, with@mares1317
suggesting a direct URL for a custom search engine setup, and@samangel7358
pointing out the distinction between Perplexity AI Search and Companion extensions. - Integration of YouTube Transcripts in Perplexity:
@chiefblink117
inquired whether Perplexity sources from YouTube video audio for the AIās responses, with@icelavaman
clarifying that it uses video transcripts provided by a YouTube API. - Perplexity Premium vs. GPT-4 Premium:
@b4d_7r1p_
and@lord.wex
discussed the advantages of Perplexity Premium over GPT-4 Premium for different uses, with Perplexity offering access to various premium models and not falling short in any significant area except image generation compared to its competitor.
Links mentioned:
Perplexity - AI Search: Upgrade your default search engine
ā· #sharing (15 messagesš„):
-
Navigating to the Right Channel:
@ok.alex
redirected@kabbe_the_dude
to the<#1059504969386037258>
channel for project sharing, indicating the importance of using the proper channels for specific content. -
A Journey Through C# Learning:
@whoistraian
shared their learning journey for C#, with an update progress link: Can you help, stating they have an exam on January 31 at faculty. -
Sharing Referral Codes:
@neuralspace
expressed the sentiment that sharing is caring by posting their Perplexity AI referral code link: Referral Code. -
Perplexityās Pro Models Explained:
@core3038
provided insight into the various models available to Pro users on Perplexity AI, like GPT-4 and Claude 2, and shared a detailed blog post for more information: What model does Perplexity use. -
Perplexity AI vs. ChatGPT Comparison:
@far2wise
found an article comparing Perplexity AI with ChatGPT, outlining differences and key points, which can be explored here: Perplexity AI vs ChatGPT.
Links mentioned:
- Perplexity: AI Chatbot & Search Multi-Tool Explained! #88: This video explains Perplexity, a search multi-tool generative AI chatbot ā what it is, how to use it, and why you should! I provide examples for some of theā¦
- Perplexity AI vs ChatGPT: Unveiling The Superior AI-Search Engine 2024: Perplexity AI vs ChatGPT: Which AI Search Engine is Better? Perplexity AI and ChatGPT are both powerful AI-powered search engines.
- What model does Perplexity use and what is the Perplexity model?: Dive deep into Perplexityās technical details with our comprehensive FAQ page. From the nuances of AI models like GPT-4 and Claude 2 to token limits and AI profiles, get concise answers to optimize yoā¦
ā· #pplx-api (1 messages):
- Inquiry About 32k Context Length: User
@commuting5048
inquired about the progress and potential release date for 32k context length support. No further information or responses to this query were provided in the channel messages.
HuggingFace Discord Discord Summary
-
Local RAG Goes Live with langchain and LM Studio:
@thoreau_a_whelan
has successfully implemented a local RAG system that integrates with langchain and LM Studio, enabling search through local documents. -
Introducing a New Vision-Language Model: The Nous-Hermes-2-Vision model, an extension of OpenHermes-2.5-Mistral-7B, introduced by
@andysingal
. It features unique function calling capabilities and is available on Hugging Face. -
AI Integration POC Unveiled by DevSpot:
@devspot
presented a GitHub-based Proof of Concept for a scalable system to work with AI models from various vendors, complete with a GitHub repository and an explanatory YouTube video. -
VRAM Efficient Photorealistic Diffusion Model:
@felixsanz
discussed optimizing PixArt-α to run with less than 8GB of VRAM, providing insights in an article, and welcomed community feedback. -
NLP Insights: Model Caching, Shrinking Transformers, and BERTās Longevity:
@asprtnl_50418
tackled issues with model caching in Docker, suggesting the use of a volume for permanent storage.@stroggoz
shrank a sentence transformer with PCA and knowledge distillation, debating dataset size while also touching on the performance and relevance of BERT compared to RoBERTa and ELECTRA, and recommended the span marker library for NER.
HuggingFace Discord Channel Summaries
ā· #general (77 messagesš„š„):
-
PDF Data to Dataset Dilemma: User
@dorumiru
sought advice on creating a dataset in the format of context, question, and answers from raw PDF data and inquired about advanced techniques for chunking PDF data. Unfortunately, no responses or further discussion on this topic were provided within the messages available. -
From Software Engineering to AI Research: User
@boss_ev
, a software engineer, asked for advice on transitioning into AI research and was recommended resources such as Fast.ai and Andrej Karpathyās YouTube channel. -
Unsloth AI with a Twist: User
@vishyouluck
mentioned that they are attempting to use Unsloth with Hindi and promised updates, despite exhausting their Collab compute unit and seeking to purchase more. -
Inference Endpoint Ease: User
@dragonburp
cheered the setup simplicity of the inference endpoints, finding it user-friendly and straightforward. -
Linking Hugging Face and GitHub: User
!BeastBlaze
explored ways to link Hugging Face projects to their GitHub account, aiming to enhance their profile for potential employers, and subsequently discussed Space sleeping due to inactivity and billing inquiries for daily usage checking.
Links mentioned:
- Vishal - a Hugging Face Space by VishalMysore: no description found
- stabilityai/stable-code-3b Ā· Hugging Face: no description found
- LoRA): no description found
- burkelibbey/colors Ā· Datasets at Hugging Face: no description found
- llama.cpp/convert-lora-to-ggml.py at master Ā· ggerganov/llama.cpp: Port of Facebookās LLaMA model in C/C++. Contribute to ggerganov/llama.cpp development by creating an account on GitHub.
ā· #today-im-learning (5 messages):
-
Local RAG Implementation Success Story: User
@thoreau_a_whelan
shared their excitement about getting local RAG (Retriever-augmented generation) to work with langchain and LM Studio for searching through local documents. -
GitHub Actions Permissions Conquered:
@vipitis
reported navigating the difficulties of setting up specific permissions for GitHub Actions, describing the process as painful. -
Progress on DoReMi and FP8 Training in Parallelism:
@neuralink
has made significant strides, writing 90% of DoReMi and 30% of an end-to-end FP8 training in 3D parallelism, successfully implementing the forward and backward passes. -
Distillation of Metaās Self-Rewarding Language Models Paper:
@subham5089
shared a simplified summary of Metaās new paper, āSelf-Rewarding Language Modelsā. The summary is available as a LinkedIn post. -
Mad_cat__ Wraps Their Head Around Skillchains: User
@mad_cat__
indicated they have finally understood Skillchains, though no further context was provided about the nature of these skillchains.
ā· #cool-finds (3 messages):
-
Bilingual Model Drops by Hugging Face: User
@sofiavas
mentioned Hugging Faceās trend of releasing bilingual models, highlighting recent models in German and Chinese. -
Introducing Nous-Hermes-2-Vision:
@andysingal
showcased the Nous-Hermes-2-Vision, a novel Vision-Language Model building upon the OpenHermes-2.5-Mistral-7B by teknium. The modelās details can be viewed on Hugging Face. -
Unique Function Calling Feature in Nous-Hermes-2-Vision:
@meatfucker
pointed out a distinctive aspect of the Nous-Hermes-2-Vision model, noting its capability for function calling.
Links mentioned:
NousResearch/Nous-Hermes-2-Vision-Alpha Ā· Hugging Face: no description found
ā· #i-made-this (9 messagesš„):
-
Felix Unleashes VRAM Efficiency:
@felixsanz
shared an article on optimizing the photorealistic diffusion model called PixArt-α to run with less than 8GB of VRAM. They expressed hope the community finds the content useful and invited feedback for improvement. -
Community Applause for Felix:
@gugaime
praised@felixsanz
for the informative articles on Stable Diffusion, mentioning they aim to implement the examples provided. The appreciation was acknowledged by@felixsanz
with a thank you and a hugging rocket emoji. -
Curiosity for PixArt-αās Choice:
@sofiavas
inquired why PixArt-α was chosen by@felixsanz
for optimization over OpenAIās 8k models, showing interest in the rationale behind the decision. -
First Package Triumph:
@vipitis
celebrated publishing their first package to the Python Package Index (PyPI). -
DevSpotās AI Integration POC:
@devspot
introduced a Proof of Concept (POC) on GitHub that outlines a scalable approach for working with various AI vendor models and shared the link to their GitHub repository alongside a YouTube video explaining their concept. -
Mysterious Message Mentioning a Discord Channel:
@Amanita
simply posted<#897390720388825149>
, which appears to be a mention of another Discord channel, without any additional context provided.
Links mentioned:
- GitHub - devspotyt/open-models: Contribute to devspotyt/open-models development by creating an account on GitHub.
- Mix-and-Match AI - Open Models, The Game Changer!: A brief video explaining the concept behind Open Models, a brand new open-sourced code which allows for an easy integration and usage of various models & AI ā¦
- PixArt-α with less than 8GB VRAM: Perform the inference process of this generative image model with just 6.4GB of VRAM
ā· #reading-group (1 messages):
skyward2989: https://arxiv.org/html/2401.10020v1
ā· #computer-vision (1 messages):
swetha98: Any one knows any libraries for Intelligent character recognition
ā· #NLP (8 messagesš„):
-
Docker Dilemma: Caching Models vs. Volume Storage:
@asprtnl_50418
discussed the downside of caching models in Docker: changing any layer or testing another model results in the cache being cleared. The solution lies in using a volume for host permanent storage, which also facilitates model sharing between containers due to their large sizes. -
Model Diet: Shrinking a Sentence Transformer:
@stroggoz
successfully shrank a sentence transformer using PCA and knowledge distillation but is seeking advice on the size of the dataset required for training the compressed model, given the original was trained on a billion sentences. -
BERT: An Olde but a Goode?:
@frosty04212
inquired if BERT is now outdated for token classification, given their assessment of different models for best performance.@stroggoz
responded, suggesting that while BERT may be less efficient due to quadratic complexity, it is still very much used and there may not be many better alternatives for token classification. -
Comparing NLP Titans:
@stroggoz
continued the conversation by stating that RoBERTa and ELECTRA might perform slightly better than BERT. They noted RoBERTa's faster tokenizer and mentioned that they still use BERT frequently because of its extensive model ecosystem. -
NER Model Recommendation: In the area of token classification for Named Entity Recognition (NER),
@stroggoz
recommended using the span marker library.
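For anyone following up on that recommendation, a tiny sketch with the span_marker package is shown below; the checkpoint name is only an example of a published SpanMarker model, and the exact prediction fields may vary between releases.

```python
from span_marker import SpanMarkerModel

# Example checkpoint; swap in any SpanMarker NER model from the Hugging Face Hub.
model = SpanMarkerModel.from_pretrained("tomaarsen/span-marker-bert-base-fewnerd-fine-super")

entities = model.predict(
    "Ada Lovelace wrote the first program for the Analytical Engine in London."
)
for ent in entities:
    print(ent["span"], "->", ent["label"], round(ent["score"], 3))
```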
OpenAccess AI Collective (axolotl) Discord Summary
-
GPU Memory Challenges with FFT on 7H100: Users reported out-of-memory (OOM) errors while running FFT on 7H100 GPUs, discussing the usage of
zero3bf16
with Mixtral framework as a potential solution to alleviate the issue. -
Google Automates Code Review Comments: A new paper by Google introduces machine learning approaches to automate the resolution of code review comments, promising to accelerate the development cycle.
-
FastChatās LLM Benchmarking Tools: The community explored language model evaluation using FastChatās LLM judge, with discussions on integrating VLLM with Fast Eval and utilizing a backend flag for this purpose.
-
Orion-14Bās Multilingual Prowess Marred by Trust Issues: OrionStarAI released a new Orion-14B model with claims of strong multilingual support, sparking debates over trustworthiness without a contamination check, highlighted in its Hugging Face repository.
-
Model Evaluation Balancing Act: Conversations revolved around the cost-effectiveness of evaluating language models using API calls, with metrics like FastEvalās $5 per evaluation being brought to the table.
-
Phi2 Model's Config Conundrum Corrected: An error in Phi2's model configuration was reported, leading to a pull request on GitHub to fix the config class inconsistency in the model's YML file.
-
Tips for Effective Layer Freezing and Fine-Tuning: Axolotl users shared guidelines on freezing layers with LoRA configurations and offered troubleshooting advice for common issues such as fine-tuning crashes, emphasizing the utility of
val_set_size: 0
. -
Local Datasets Welcomed by DPO with Intel-format Agreement: Compatibility of local datasets for Direct Preference Optimization (DPO) was confirmed if the data formatting agrees with Intel's structure.
-
Solar LLM Embraces the Llama Light: Discussions concluded that the SOLAR-10.7B model should be classified under the āllamaā model category based on scale and architecture, and provided a link to its Hugging Face page.
-
Learning Rate and Sample Origin Optimizations for DPO: Emphasis was placed on carefully choosing lower learning rates and using the modelās own bad samples for effective DPO, as shared in a Hugging Face discussion thread.
-
Replicate Help Sought for predict.py Autoawq and vlllm Setup: A user sought guidance on setting up
predict.py
autoawq and vlllm on Replicate.
OpenAccess AI Collective (axolotl) Channel Summaries
ā· #general (32 messagesš„):
- OOM in FFT with High-End GPUs:
@dangfutures
reported out-of-memory (OOM) errors while trying to execute FFT on 7H100 GPUs and conversed with@caseus_
about usingzero3bf16
with Mixtral framework as a way to mitigate the issue. - Addressing Reviewer Comments with AI:
@noobmaster29
shared a new paper by Google on ML-based automation to assist in resolving code review comments, speeding up the development process. - Benchmarking with FastChat: Users discussed options for evaluating language models with
@gahdnah
pointing to FastChatās LLM judge and@dangfutures
inquiring about integrating VLLM with Fast Eval, which@rtyax
confirmed as possible using a specific backend flag. - New Orion-14B Language Model Debuts:
@bratao
provided a link to the OrionStarAIās new Orion-14B model which boasted strong multilingual capabilities, prompting mixed reactions from the community questioning trust without a contamination check and model longevity. - Costs of Model Evaluation Using API Calls:
@noobmaster29
questioned the cost of evaluating language models using API calls, with@nanobitz
stating that FastEval costs about $5 per evaluation.
Links mentioned:
- Resolving Code Review Comments with Machine Learning (https://research.google/pubs/resolving-code-review-comments-with-machine-learning/): no description found
- OrionStarAI/Orion-14B-Base · Hugging Face: no description found
- FastChat/fastchat/llm_judge/README.md at main · lm-sys/FastChat: An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. - lm-sys/FastChat
- FastChat/fastchat/llm_judge at main · lm-sys/FastChat: An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. - lm-sys/FastChat
ā· #axolotl-dev (10 messagesš„):
-
Phi2 Model Revision Error Reported:
@asterix3651
shared a model revision error for phi2, revealing a config class inconsistency.@caseus_
acknowledged the issue and promised a quick fix once they have computer access. -
Pull Request for Model Config Loader: In response to
@asterix3651
ās report,@caseus_
submitted a pull request to ensure the model config loader respects the model_revision, addressing the config class mismatch issue. -
Relevance of Speed Enhancements Discussed:
@tiendung
mentioned that speedup claims, such as a x30 speedup reported for pro unsloth version, are only significant if the samples are relevant to the same topic. -
Skepticism Over Unslothās Speed Claims:
@dreamgen
expressed skepticism, suggesting Unslothās claimed speedup is based on non-practical setups.@faldore
and@dreamgen
discussed that the merits of software like Unsloth could be due to factors other than training speed, with@dreamgen
highlighting its customizability.
Links mentioned:
- axolotl/examples/phi/phi2-ft.yml at main Ā· OpenAccess-AI-Collective/axolotl: Go ahead and axolotl questions. Contribute to OpenAccess-AI-Collective/axolotl development by creating an account on GitHub.
- make sure the model config loader respects the model_revision too by winglian · Pull Request #1160 · OpenAccess-AI-Collective/axolotl: Description reported in discord: ValueError: The model class you are passing has a `config_class` attribute that is not consistent with the config class you passed (model has <class 'transforme…
ā· #general-help (25 messagesš„):
-
Beginnerās Guide to Layer Freezing with LoRA:
@diabolic6045
inquired about freezing model layers using Axolotl and was informed by@nanobitz
to start with thelora.yml
config which freezes most of the layers.@nanobitz
also reassured@diabolic6045
that itās safe to experiment with these settings. -
Troubleshooting Fine-Tuning Crashes:
@fred_fups
experienced consistent crashes when fine-tuning Mistral 7B on 3 epochs with QLoRA at exactly 33%.@nanobitz
suggested a solution by settingval_set_size: 0
to potentially avoid crashing during evaluation. -
Local Dataset Dilemma Resolved:
@c.gato
inquired about DPO support for local datasets and@dangfutures
confirmed compatibility after formatting to match Intelās structure. -
Mixtral Yaml Flexibility for Any Model:
@caseus_
revealed that unfrozen parameters options are available for all models, not just Mixtral. When@diabolic6045
asked for documentation to figure out parameters, there was no direct link provided. -
Solar LLM Classification Clarified: Several users, including
@dangfutures
,@noobmaster29
, and@nanobitz
, discussed how to set the newly introduced SOLAR-10.7B model, concluding that it should be classified as a āllamaā model, considering its scale and architecture.
Links mentioned:
upstage/SOLAR-10.7B-v1.0 Ā· Hugging Face: no description found
ā· #datasets (14 messagesš„):
-
DPO Requires Finer Learning Rate Tuning: "xzuyn" observed that Direct Preference Optimization (DPO) requires a significantly lower learning rate than Supervised Fine-Tuning (SFT), potentially one order of magnitude lower. They gave an example: if 0.0001 is used for SFT, 0.00001 might be more appropriate for DPO, and mentioned related insights by "jon" available in a discussion on Hugging Face (DPO learning rates discussion). A minimal training sketch illustrating this appears after this list.
-
Using Modelās Own Bad Samples for DPO is Advantageous: āxzuynā argued that using poorly generated samples from oneās own model as the ārejectedā data for DPO can yield more effective and rapid results than using artificial āfakeā bad results.
-
Choosing the Right Rejected Samples: The importance of selecting appropriate rejected samples for DPO was emphasized by ādreamgenā and āxzuynā, with the latter noting that using samples from the model itself, particularly with modified sampler settings to encourage ābadā yet coherent outputs, can be a productive strategy.
-
DPO for Subtle Model Adjustments: According to āxzuynā, DPO could be seen as a ātiny nudgeā for model finalization, implying that it works best when chosen and rejected samples are not too dissimilar to what the current model can generate. They suggest DPO is more suitable for incremental refinements rather than broader changes.
-
DPO Easily Corrects ChatGPT Idiosyncrasies: āxzuynā recommended using DPO to fix common GPT mannerisms like ending sentences with āin conclusionā¦ā or starting with āSurely,ā noting that DPO can easily remove these tendencies when they seep into models through the training data.
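A compact sketch of the points above, assuming TRL's DPOTrainer is the training path (the thread does not name a specific trainer): prompt/chosen/rejected triples, a frozen reference model, and a learning rate roughly one order of magnitude below a typical SFT rate. The tiny model and inline dataset are placeholders so the script stays runnable.

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model_name = "gpt2"  # placeholder; in practice this would be your SFT checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)
ref_model = AutoModelForCausalLM.from_pretrained(model_name)  # frozen reference policy

# "Rejected" answers would ideally come from the model's own bad generations, per the thread.
train_dataset = Dataset.from_list([
    {
        "prompt": "Summarize: the cat sat on the mat.",
        "chosen": "A cat sat on a mat.",
        "rejected": "In conclusion, the cat, surely, sat on the mat.",
    },
])

args = TrainingArguments(
    output_dir="./dpo-out",
    per_device_train_batch_size=1,
    num_train_epochs=1,
    learning_rate=1e-5,            # ~10x lower than the 1e-4 one might use for SFT
    logging_steps=1,
    remove_unused_columns=False,   # keep prompt/chosen/rejected columns for the DPO collator
)

trainer = DPOTrainer(
    model,
    ref_model,
    args=args,
    beta=0.1,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
    max_length=256,
    max_prompt_length=128,
)
trainer.train()
```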
ā· #replicate-help (1 messages):
dangfutures: does anyone know how to setup predict.py autoawq and vlllm on replicate lol
LlamaIndex Discord Discord Summary
-
Marco Bertelli Guides Chatbot Developers: Marco Bertelliās comprehensive series offering insights on creating a full-stack RAG chatbot, covering algorithms and full-stack development continues to gain interest. Developers can access the guide through the link in the shared Tweet and view related images.
-
Innovating with Embedding Models for RAG: Discussion around the M2-BERT-80M-32k-retrieval model showcases its capabilities for semantically grounded long-context embeddings in RAG. The model addresses embedding chunking issues and is detailed further in a Tweet and additional imagery.
-
RAG Maestro Opens Doors to ArXiv Insights: Aymen Kallala introduced RAG-Maestro, a web application utilizing RAG to improve research on ArXiv through keyword extraction and indexing. The tool was highlighted in a Tweet with an illustrative guide here.
-
Memory and Cosine Similarity Tools Hot Topics in Discussions: Lack of memory support for query engines contrasts with available tools to calculate cosine similarity; engineers should refer to LlamaIndex docs for Chat Engines and Agents for memory implementation.
-
Gemini Pro Enhances Invoice Data Search with LlamaIndex: The efficient searching and retrieval of semi-structured invoice data sees advancement with Gemini Pro and LlamaIndex, providing a notable step forward for businesses dealing with such digital documents. The impact on the digital universe is discussed in a Medium article by
@andysingal
.
LlamaIndex Discord Channel Summaries
ā· #blog (5 messages):
-
Comprehensive RAG Chatbot Tutorial Series by Marco Bertelli: Marco Bertelliās multistep guide on building a full-stack RAG chatbot is celebrated for its depth, covering algorithms, and both front and backend development. See the ongoing series in the shared Tweet and accompanying image here.
-
Semantically Grounded Long-Context Embedding Models: The M2-BERT-80M-32k-retrieval model presented by
@JonSaadFalcon
and others introduces a solution to the embedding chunking issue in RAG by grounding retrieval in higher-level semantic context. Further details can be found in the linked Tweet and image here. -
Webinar to Discuss Agentic Software Development: The LLMCompiler will be the focus of a 2024 webinar presented by
@sehoonkim418
and@amir__gholami
, offering insights into building efficient, performant agentic software. Read more about the agent compiler for parallel multi-function planning/execution in the announcement Tweet with a visual sneak peek here. -
RAG-Maestro Tool for ArXiv Research: RAG-Maestro, developed by Aymen Kallala, is a web application that uses RAG to look up scientific concepts in papers on ArXiv, employing keyword extraction and on-the-fly indexing. The LlamaIndex shared this innovative tool in their Tweet and provided a visual guide here.
-
Building a Full-Stack Complex PDF AI Chatbot Overview: Nipuna from Paragon AI offers insights into creating complex PDF AI chatbots capable of processing numerous intricate documents, detailed in a recent overview. The challenges of handling 40+ docs and thousands of pages with embedded tables are explored in the Tweet and related image here.
ā· #general (48 messagesš„):
- Memory Module for Query Engine:
@nerdai
clarified that LlamaIndex does not support memory for query engines, and recommended using Chat Engines and Agents for memory capabilities. They provided a link to documentation explaining how to implement a SimpleChatStore and ChatMemoryBuffer. - Cosine Similarity Tool Inquiry:
@kush2861
asked about a distances_from_embeddings calculator similar to one from OpenAI.@nerdai
confirmed its availability to calculate the cosine similarity of two embeddings. - Dataset Generator Worker Enhancement Query:
@dangfutures
inquired about the possibility of increasing the number of workers for the dataset generator, to which@nerdai
responded that they have not built in multi-processing for any of their generators yet. - Building Autonomous Vector Storage:
@lhc1921
sought guidance on constructing an auto merge vector storage without an LLM service context.@kapa.ai
said that the extracts provided did not detail building such a system and directed@lhc1921
to the official LlamaIndex documentation. - Conversational Retrieval Agents with Memory:
@peeranat_fup
asked for examples on how to build a Conversational Retrieval Agent with memory using LlamaIndex. Despite several attempts to find a proper example,@kapa.ai
recommended referring to the LlamaIndex documentation or the GitHub repository due to a lack of specific examples in the provided extracts.
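Pulling the memory advice together, a short sketch of a chat engine with a ChatMemoryBuffer backed by a SimpleChatStore is shown below; it assumes llama-index 0.9.x, an OPENAI_API_KEY in the environment, and a local ./data folder of documents.

```python
from llama_index import SimpleDirectoryReader, VectorStoreIndex
from llama_index.memory import ChatMemoryBuffer
from llama_index.storage.chat_store import SimpleChatStore

index = VectorStoreIndex.from_documents(SimpleDirectoryReader("./data").load_data())

chat_store = SimpleChatStore()                 # in-memory store; can be persisted to disk
memory = ChatMemoryBuffer.from_defaults(
    token_limit=3000,
    chat_store=chat_store,
    chat_store_key="user-1",                   # one history per user/session key
)

# Query engines are stateless; chat engines accept the memory object.
chat_engine = index.as_chat_engine(chat_mode="context", memory=memory)
print(chat_engine.chat("What does the first document talk about?"))
print(chat_engine.chat("And how does that relate to what I just asked?"))  # uses history
```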
Links mentioned:
- DLAI - Building and Evaluating Advanced RAG: Introduction Ā· Advanced RAG Pipeline Ā· RAG Triad of metrics Ā· Sentence-window retrieval Ā· Auto-merging retrieval Ā· Conclusion
- Chat Engine - Context Mode - LlamaIndex š¦ 0.9.34: no description found
- Chat Stores - LlamaIndex š¦ 0.9.34: no description found
- Prompts - LlamaIndex š¦ 0.9.34: no description found
- Accessing/Customizing Prompts within Higher-Level Modules - LlamaIndex š¦ 0.9.34: no description found
ā· #ai-discussion (1 messages):
- Gemini Pro and LlamaIndex Advance AI Search:
@andysingal
shared a Medium article discussing how Gemini Pro and LlamaIndex are aiding in the efficient retrieval of semi-structured invoice data. The introduction highlights the significance of this technology in the digital universe.
Links mentioned:
Unlocking Efficiency: A Search Query for Semi-Structured Invoices with Gemini Pro and LlamaIndex inā¦: Ankush k Singal
LangChain AI Discord Summary
-
Cheers for LangChain.js Milestone: LangChain.js contributors received appreciation, with special thanks to
@matthewdparker
for resolving a token text splitter issue. The Twitter acknowledgment celebrates progress since the launch of version 0.1.0. -
Hosting and Troubleshooting LangChain Discussions: Hosting recommendations for LangChain backends included Heroku and porter.run, while an installation issue involving a urllib3 connection pool was reported without a resolution follow-up. A query about integrating LangChain with React was clarified; it functions as a backend requiring API requests from frontend frameworks.
-
Social Cause Meets Software: A call for software development assistance was made for a project to support autistic and neurodivergent individuals, offering prompt structuring expertise in return.
-
LangServe Feedback Feature Inquiry: An observation was made about the missing PATCH endpoint for LangServeās
enable_feedback
function, indicating a possible addition by the inquirer despite its presence inlangsmith-sdk
. -
Multifaceted AI Projects and Insights Shared: Demonstrations of AI implementations included a GitHub docs demo, support for a neurodivergent assistance project, a text-based dungeon game, development of a multilingual RAG project on GitHub, and a Medium article examining the role of metadata in enhancing language models.
LangChain AI Channel Summaries
ā· #announcements (1 messages):
- Appreciation for LangChain.js Contributors:
@jacoblee93
and@Hacubu
expressed gratitude towards everyone who contributed to the development of LangChain.js this year. Special thanks were given to@matthewdparker
for fixing a token text splitter overlap issue, marking a significant milestone since the launch of version 0.1.0. Read the full acknowledgment on Twitter.
Links mentioned:
Tweet from Jacob Lee (@Hacubu): Thank you to everyone whoās contributed to @LangChainAI (so far) this year! So much has happened with and since the launch of 0.1.0, and it wouldnāt have been possible without: š matthewdparker forā¦
ā· #general (22 messagesš„):
-
LangChain Hosting Suggestions Sought: User
@b0otable
asked for recommendations on services to host a LangChain backend service that utilizes OpenAI models.@ricky_gzz
suggested Heroku for prototyping and porter.run on AWS for more production-grade needs, while@baytaew
offered to assist with trying out langserve by contacting [email protected]. -
Troubleshooting LangChain Installation:
@rrvermaa_79263
encountered an error with a urllib3 connection pool while trying to install langchain-community and asked for guidance to resolve this issue. -
LangChain and React Development Query:
@yasuke007
inquired about using LangChain with React, and@esponges
clarified that LangChain is a backend tool, which would require React to make requests to such a backend. -
Assistance Sought for Autistic and Neurodivergent Support Project:
@brotino
, an RN and member of the autism spectrum, described their project to support autistic adults and sought assistance from the community for software development challenges, offering their skills in prompt structuring in exchange. -
Using LangChain with Hugging Face Models:
@esraa_45467
inquired about implementing features akin to LangChaināsChatOpenAI
using Hugging Face models, by sharing a code snippet for context.
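The closest drop-in for the ChatOpenAI pattern with a local Hugging Face model is an LCEL chain around HuggingFacePipeline; the sketch below assumes langchain-community, langchain-core, and transformers are installed, and the small model is only a placeholder so it can run on CPU.

```python
from langchain_community.llms import HuggingFacePipeline
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

llm = HuggingFacePipeline.from_model_id(
    model_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # placeholder; swap in any causal LM
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 128},
)

prompt = PromptTemplate.from_template("Answer briefly: {question}")
chain = prompt | llm | StrOutputParser()
print(chain.invoke({"question": "What is retrieval augmented generation?"}))
```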
Links mentioned:
Tweet from Preston Thornburgš”ļø (@ptonewreckin): Hey @LangChainAI ⦠you guys doing okay? Your tweet pointing to https://langchain.fi/ seems pretty sketchy
ā· #langserve (1 messages):
- Query About LangServe Feedback Feature:
@georgeherby
inquired about the lack of a PATCH endpoint for updating feedback with theenable_feedback
flag in LangServe, indicating they might add it themselves. They noticed the existence of the function in thelangsmith-sdk
codebase and suspected it might have been an oversight rather than a deliberate omission.
ā· #langchain-templates (1 messages):
jackblack1.: Does anyone have a template for langchain OpenAI assistant with DuckDuckGo search
ā· #share-your-work (5 messages):
-
Showcasing GitHub Docs Demo: User
@jonathan0x56
shared a GitHub pull request for a demo project that includes documentation with images and aims to bootstrap a docs repository using materials from langchain-ai/langchain for demonstration purposes. -
Call to Action for a Neurodivergent Support Project: User
@brotino
seeks support for a project to aid autistic adults and the neurodivergent community. They offer their skills in prompt structuring and troubleshooting in exchange for help with software development. -
Dungeon Game Link Shared: User
@friday_living
provided a link to Gemini Dungeon, but did not include further details or description about the content. -
Introduction of Multilingual RAG Development: User
@akashai4736
presented their GitHub repository for a multilingual RAG (Retrieval Augmented Generation) project, showcasing its potential for development in collaboration with Langchain Cohere. The GitHub link can be found here. -
Medium Article on Language Models and Data: User
@rajib2189
shared a Medium article discussing the importance of metadata in addition to data when developing language model-based applications using the RAG framework. The article challenges the common belief that more data alone enhances language models.
Links mentioned:
- Gemini Dungeon - Text and Image Based Adventure in DND5E: no description found
- Data is Not what All You Need: The headline of this blog may have prompted a few raised eyebrows or even disbelief. āIs he out of his mind?ā might be a question crossingā¦
- GitHub - akashAD98/Multilingual-RAG: multilingual RAG: multilingual RAG . Contribute to akashAD98/Multilingual-RAG development by creating an account on GitHub.
- alttexter-ghclient DEMO by jonathanalgar · Pull Request #1 · jonathanalgar/docs-demo: Let's say we want to bootstrap a docs repo. We have five shiny new docs to start with (1x md, 1x mdx, 3x ipynb borrowed from langchain-ai/langchain for our demo purposes). All the docs have images…
DiscoResearch Discord Summary
- Marlin Swims into AutoGPTQ: The AutoGPTQ repository has been updated to include the marlin kernel, known for its speed and impressive performance despite having certain limitations, as seen in a pull request update. Meanwhile, performance benchmarks for 4-bit quantized Mixtral on an A100 GPU yielded 9 tokens per second with a batch size of 64.
- Coders Write Custom CUDA: Discussions hinted at industry professionals like Tri Dao potentially using custom CUDA kernels, which implies advanced optimization techniques in AI models might be more widespread. Training language models using 4-bit quantization from bitsandbytes sparked questions about capabilities similar to GPTQ or AWQ in other quantization schemes.
- Mind of Kahneman in AI Form: Ambitions to develop an AI agent emulating the cognitive style of Daniel Kahneman were shared, with suggestions to prompt an LLM with his persona or fine-tune on his works. A recent arXiv paper on Self-Rewarding Language Models was highlighted, showing performance surpassing GPT-4 by using self-provided rewards during training.
- Boosting German Dataset for DPR: The release of Version 2 of the German DPR training dataset adds formal and informal imperative questions to its structure, improving its complexity and utility, with a call for feedback and contributions on GitHub.
- German LLMs Gain Steam: The conversation covered self-supervised learning adaptations for fine-tuning, excitement about the German LLM release, and available quantized versions of the DiscoLM German 7B model. For fine-tuning needs, the Axolotl toolkit was recommended, along with Llama-factory as an alternative to complicated fine-tuning tools.
DiscoResearch Channel Summaries
▷ #mixtral_implementation (6 messages):
- Marlin Kernel Added to AutoGPTQ: @vara2096 shared a GitHub pull request indicating the addition of the marlin kernel to the AutoGPTQ repository, noting marlin's impressive speed and performance despite its limitations.
- Benchmarking Mixtral's Performance: @vara2096 reported achieving a throughput of 9 tokens per second for a 4-bit quantized Mixtral on an A100 GPU, with a batch size of 64.
- Clarification on Throughput Measurement: In a clarification to @bjoernp, @vara2096 confirmed the throughput measurement to be 9 tokens per second serially, rather than 9x64 tokens per second.
Links mentioned:
add marlin kernel by qwopqwop200 · Pull Request #514 · AutoGPTQ/AutoGPTQ: Add marlin kernel. marlin is a very powerful gptq kernel. Although there are many limitations to the applicable model, the speed is nevertheless very close to theory. Also, fused attention is not y…
▷ #general (7 messages):
- Custom CUDA kernels in AI models: @muhtasham pointed out that, despite claims of not using quantization, certain industry professionals like Tri Dao are known for writing custom CUDA kernels, which could indicate advanced optimization techniques in AI models.
- Training on quantized models using bitsandbytes: @vara2096 inquired about the ability to train LoRAs on top of a model quantized to 4 bits with bitsandbytes and asked whether other quantization schemes such as GPTQ or AWQ allow for similar capabilities; a minimal QLoRA-style sketch appears after this list.
- Aspiring for an AI Mind like Kahneman: @sabu7003 proposed the concept of developing an AI agent emulating the thought process of behavioral economist Daniel Kahneman. This AI would integrate machine learning with Kahneman's principles to potentially offer business and marketing consultations.
- Recommendations for building a Kahneman-like AI: @rasdani suggested that this Kahneman-like AI could be approached by prompting an LLM with Kahneman's persona or fine-tuning on his publications, also mentioning character.ai as a potential resource and the influence of Kahneman's ideas on AI and reinforcement learning research.
- Self-Rewarding Language Models Outperforming GPT-4: @philipmay shared a recent research paper on Self-Rewarding Language Models (arXiv:2401.10020), highlighting a new training method where a model uses itself as a judge to provide its own rewards, resulting in performance surpassing that of GPT-4 and others on the AlpacaEval 2.0 leaderboard; a simplified sketch of the loop follows.
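To make the self-rewarding loop concrete, here is a heavily simplified sketch of the paper's iterative idea: the model generates candidates, scores them itself with an LLM-as-a-Judge prompt, and the best/worst pair feeds a DPO step. The helper callables are hypothetical stand-ins, not the paper's code.

```python
# Heavily simplified sketch of the Self-Rewarding LM loop (arXiv:2401.10020).
# `generate`, `judge_score`, and `dpo_update` are hypothetical callables the
# caller supplies (sampling, LLM-as-a-Judge scoring, one DPO training step).

def self_rewarding_iteration(model, prompts, generate, judge_score, dpo_update,
                             n_candidates=4):
    preference_pairs = []
    for prompt in prompts:
        # 1) The model samples several candidate responses per prompt.
        candidates = [generate(model, prompt) for _ in range(n_candidates)]

        # 2) The same model scores each candidate via an LLM-as-a-Judge prompt
        #    (the paper uses an additive 5-point rubric).
        ranked = sorted(candidates, key=lambda c: judge_score(model, prompt, c))

        # 3) Best vs. worst candidate becomes a DPO preference pair.
        preference_pairs.append(
            {"prompt": prompt, "chosen": ranked[-1], "rejected": ranked[0]}
        )

    # 4) A DPO update on these self-generated preferences yields the next
    #    model iteration (M1 -> M2 -> M3 in the paper).
    return dpo_update(model, preference_pairs)
```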
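On the bitsandbytes question, the widely used QLoRA-style recipe does support exactly this: the base weights are loaded in 4-bit and frozen while LoRA adapters are trained on top. A minimal sketch with transformers and peft follows; the model id and LoRA hyperparameters are illustrative assumptions, not recommendations from the discussion.

```python
# Minimal QLoRA-style sketch: LoRA adapters trained on top of a base model
# loaded in 4-bit via bitsandbytes. Model id and hyperparameters are
# illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "mistralai/Mistral-7B-v0.1"  # placeholder base model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

# Freeze the quantized weights and prepare for k-bit (QLoRA) training.
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```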
Links mentioned:
Self-Rewarding Language Models: We posit that to achieve superhuman agents, future models require superhuman feedback in order to provide an adequate training signal. Current approaches commonly train reward models from human prefer…
▷ #embedding_dev (1 messages):
- German DPR Dataset Enhanced: @philipmay announced that Version 2 of the German DPR training dataset is complete, now featuring normal questions, formal (sie) imperative questions, and newly added informal (du) imperative questions. Feedback is solicited, and the dataset is available at the German dataset for DPR model training on GitHub.
Links mentioned:
GitHub - telekom/wikipedia-22-12-de-dpr: German dataset for DPR model training: German dataset for DPR model training. Contribute to telekom/wikipedia-22-12-de-dpr development by creating an account on GitHub.
▷ #discolm_german (8 messages):
- SF Trainer Shares Insights: User @_jp1_ discussed employing self-supervised learning (SSL) techniques where answers from early model iterations are rejected in favor of ground truth during the fine-tuning process, similar to an approach taken by Intel with their neural chat; one way to build such preference pairs is sketched after this list.
- Legal Eagle Excited by German LLMs: User @rapsac. expressed gratitude for the release of the German-language LLMs and is optimistic about applying fine-tuning to German legal datasets, anticipating performance between GPT-3.5 and GPT-4 levels.
- Quantized DiscoLM German 7b Models Released: User @rasdani shared quantized versions of the DiscoLM German 7B model, detailing the assistance of Massed Compute and providing comprehensive links to the various quantized models.
- How to Fine-Tune DiscoLM German?: User @thomasrenkert inquired about methods to fine-tune the DiscoLM German model, to which @bjoernp responded by recommending the Axolotl toolkit.
- Seeking Simpler Fine-Tuning Methods: After @thomasrenkert mentioned difficulties with fine-tuning directly in oobabooga, user @nyxkrage suggested Llama-factory as a possibly more user-friendly alternative.
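To illustrate the reject-early-answers idea from the first item, here is a minimal sketch of turning ground-truth answers plus an early checkpoint's generations into preference pairs for DPO-style training; the `generate` callable and field names are assumptions, not @_jp1_'s actual pipeline.

```python
# Minimal sketch: build preference pairs where the ground-truth answer is
# "chosen" and an early checkpoint's answer is "rejected". `generate` is a
# caller-supplied sampling function; field names are illustrative.

def build_preference_pairs(dataset, early_model, generate):
    pairs = []
    for example in dataset:  # each example: {"prompt": ..., "answer": ...}
        model_answer = generate(early_model, example["prompt"])
        # Skip cases where the early model already matches the ground truth.
        if model_answer.strip() == example["answer"].strip():
            continue
        pairs.append({
            "prompt": example["prompt"],
            "chosen": example["answer"],   # ground truth is preferred
            "rejected": model_answer,      # early-iteration output is rejected
        })
    return pairs  # feed into a DPO trainer such as trl's DPOTrainer
```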
Links mentioned:
- GitHub - OpenAccess-AI-Collective/axolotl: Go ahead and axolotl questions: Go ahead and axolotl questions. Contribute to OpenAccess-AI-Collective/axolotl development by creating an account on GitHub.
- TheBloke/DiscoLM_German_7b_v1-AWQ · Hugging Face: no description found
- TheBloke/DiscoLM_German_7b_v1-GPTQ · Hugging Face: no description found
- TheBloke/DiscoLM_German_7b_v1-GGUF · Hugging Face: no description found
Latent Space Discord Summary
- Podcast Pride and Educational Recs: @swyxio announced their podcast hitting #16 on the podcast charts, sparking shared excitement among guild members. An educational resource explaining the transformer architecture behind LLMs was highlighted by @guardiang, who provided a YouTube link for fellow tech enthusiasts.
- Elicit and Anthropic in the Spotlight: The utility of elicit.org was recommended by @swyxio for insights on user needs, while @aravindputrevu sought technical assistance from someone at Anthropic.
- Deciphering the Self-Attention Enigma: Discussions led by @swyxio and @eugeneyan delved into how self-attention matrices at <8k context are manageable, while larger contexts require clever techniques like "rope and yarn" and practical tricks, referencing FlashAttention and the use of ALiBi.
- Superhuman Feedback Frontier Unveiled: A new method involving language models generating and evaluating their own rewards was brought up by @swyxio, spotlighting a tweet by @jaseweston that reflects growing interest and potential implications for the field, supported by an arXiv paper.
- Simple Thanks and Corporate Pod Curiosity: User @420gunna offered a straightforward expression of gratitude, and guild members discussed the surprising popularity of the corporate-branded a16z podcast.
Latent Space Channel Summaries
▷ #ai-general-chat (14 messages):
- Simple Gratitude from 420gunna: User @420gunna expressed thanks with a simple "Thanks" and an emoji.
- Podcast Chart Climbers: @swyxio shared that their podcast ranked #16 on the charts, surpassing Y Combinator, while @420gunna contributed to the rise by listening during a bike ride.
- Elicit.org Mention for User Needs: @swyxio recommends checking out elicit.org and highlights @914974587882700800 for insights on user needs.
- A16z Podcast's Surprising Popularity: @austintackaberry and @swyxio discussed how the a16z podcast maintains high rankings despite a perceived corporate brand.
- Request for Assistance from Anthropic: User @aravindputrevu is in search of someone from Anthropic to offer help.
- Educational Resource on Transformers: @guardiang praised and shared a YouTube video that explains the transformer architecture behind LLMs.
Links mentioned:
- Bloomberg - Are you a robot?: no description found
- Transformers explained | The architecture behind LLMs: All you need to know about the transformer architecture: How to structure the inputs, attention (Queries, Keys, Values), positional embeddings, residual conn…
▷ #llm-paper-club (6 messages):
- Clarifying the Size of Self-Attention Matrices: @swyxio pointed out that for context windows <8k, a full self-attention matrix is feasible, but the techniques used for >100k are not public and likely involve methods that avoid computing the full matrix. They mentioned "rope and yarn" as potential artificial context-extension techniques.
- Insight into Practical Tricks for Large Contexts: @eugeneyan explained that even though 128k x 128k matrices could theoretically exist, tricks like computing in loops and caching vectors as described in FlashAttention, and utilizing ALiBi for context size as discussed in Ofir Press's post, are practical ways to manage large contexts without needing the full matrix; a minimal block-wise sketch of the looping idea follows this list.
- Validating Intuitions About Attention Scalability: @dzidex expressed appreciation for the clarity provided by swyxio and eugeneyan on how transformers handle large context windows, confirming their intuition about the computational feasibility.
- Noteworthy Paper on Self-Rewarding Language Models: @swyxio shared that the self-rewarding LLM paper is gaining notable attention. The approach described in the paper involves using language models to generate and then evaluate their own rewards, potentially paving the way for "superhuman feedback," as highlighted in the tweet by @jaseweston and detailed in the corresponding arXiv paper.
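To make the "compute in loops rather than materialize the full matrix" point concrete, below is a minimal block-wise attention sketch in plain PyTorch: only a (block x seq_len) score matrix exists at any moment. This is just the looping idea; FlashAttention additionally tiles over keys, uses an online softmax, and fuses everything into IO-aware CUDA kernels. The block size and shapes are arbitrary assumptions.

```python
# Minimal sketch: attention computed over query blocks so only a
# (block_size x seq_len) score matrix is ever materialized, not the full
# (seq_len x seq_len) one. Single head, no causal mask, for brevity.
import torch
import torch.nn.functional as F

def blockwise_attention(q, k, v, block_size=1024):
    # q, k, v: (seq_len, head_dim)
    seq_len, head_dim = q.shape
    scale = head_dim ** -0.5
    out = torch.empty_like(q)
    for start in range(0, seq_len, block_size):
        end = min(start + block_size, seq_len)
        scores = (q[start:end] @ k.T) * scale           # (block, seq_len)
        out[start:end] = F.softmax(scores, dim=-1) @ v  # (block, head_dim)
    return out

q = k = v = torch.randn(8192, 64)
y = blockwise_attention(q, k, v)  # never builds the full 8192 x 8192 matrix
```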
Links mentioned:
- Tweet from Jason Weston (@jaseweston): New paper! Self-Rewarding LMs - LM itself provides its own rewards on own generations via LLM-as-a-Judge during Iterative DPO - Reward modeling ability improves during training rather than staying…
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness: Transformers are slow and memory-hungry on long sequences, since the time and memory complexity of self-attention are quadratic in sequence length. Approximate attention methods have attempted to addr…
- The Use Case for Relative Position Embeddings: We're in 2022 but many of our most popular causal language models (LMs), including GPT-3, still use absolute positional embeddings. I believe we should stop using those and move to relative positional…
Skunkworks AI Discord Summary
Only 1 channel had activity, so no need to summarize…
- Aspiring to AI Pantheon: sabu7003 proposed an ambitious project to create an AI that mirrors the thinking of behavioral economics expert Daniel Kahneman, with the aim of delivering nuanced consultations like Kahneman himself. They invited thoughts on the feasibility of this project using Transformer Architecture.
- Event Scheduling Dilemma: yikesawjeez highlighted the lack of events on the calendar and suggested planning the event today, while far_el responded with availability for planning tomorrow due to a busy schedule today.
- Collaborative Workspace Query: yikesawjeez proposed testing simultaneous access to the lab on the basementagiclub login and asked .mrfoo to create and save a notebook in /work to confirm shared accessibility.
- Note Sharing Experimentation: yikesawjeez and .mrfoo discussed the logistics of sharing notes and accessing notebooks on a shared account, with .mrfoo initially working on their own account but expressing willingness to test joint account access later.
- Tasks for Contributions: dook4 requested a list of tasks or material to read through to determine potential areas for contribution to the project.
LLM Perf Enthusiasts AI Discord Summary
- Mixtral Models Face Sagemaker Hurdle: @ajamjoom encountered a TypeError when trying to host Mixtral-Instruct on Sagemaker PD4 with TRT-LLM, with LoraConfig.from_hf() missing the 'trtllm_modules_to_hf_modules' argument.
- Nous-Hermes System Prompt Hack: A Twitter post by @Teknium1 suggests using a system prompt for better outputs from Nous-Hermes 2 Mixtral.
- In Pursuit of Extended Contexts: @alyosha11 is seeking efficient methods to increase context length beyond techniques like YaRN and RoPE, with @ivanleomk mentioning self-extend as a possible avenue, as discussed on Twitter.
- Infrastructure Insights Wanted: @ayenem sparked a call for sharing insights on batch versus online processing, deployment infrastructures, re-training necessities, and related tooling, while @jeffreyw128 queried about the proper placement for infrastructure discussions within the community channels.
- Enhancing Reranking with ColBERT: In the #rag channel, @shacrw highlighted a Twitter update about reranking with ColBERT but did not provide further context or a detailed discussion on the matter.
LLM Perf Enthusiasts AI Channel Summaries
▷ #opensource (6 messages):
- Sagemaker and TRT-LLM Compatibility Issues: @ajamjoom is seeking advice on hosting Mixtral-Instruct (or any Mistral model) on Sagemaker PD4 with TRT-LLM due to a custom Docker image error. The TypeError in question relates to LoraConfig.from_hf() missing the 'trtllm_modules_to_hf_modules' argument.
- System Prompt as a Solution: While not directly related to the initial issue, @ajamjoom shared a link from @Teknium1 suggesting the use of a system prompt to avoid weird outputs from Nous-Hermes 2 Mixtral, referencing a Twitter post; a prompt-formatting sketch appears after this list.
- Seeking Ways to Increase Context Length: @alyosha11 inquired about the best method to increase context length today, expressing dissatisfaction with YaRN and RoPE.
- Self-Extend as a Potential Solution: Replying to the context-length concern, @ivanleomk recommended looking into self-extend, which has recently been discussed on Twitter, though they have yet to try it personally.
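For reference, Nous-Hermes 2 Mixtral is generally used with ChatML-style prompts, so "use a system prompt by default" amounts to always including a system turn. A rough sketch follows; the system text is an arbitrary example, not Teknium's exact wording.

```python
# Rough sketch of a ChatML-formatted prompt with a default system turn, as
# commonly used with Nous-Hermes 2 Mixtral. The system text is an arbitrary
# example, not a quote from the discussion.
def build_chatml_prompt(user_message,
                        system_message="You are a helpful, concise assistant."):
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt("Explain LoRA in two sentences.")
# Pass `prompt` to the model and stop generation on "<|im_end|>".
```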
Links mentioned:
Tweet from Teknium (e/λ) (@Teknium1): Okay I found what may be a solution to anyone getting weird outputs from Nous-Hermes 2 Mixtral. Use a system prompt by default. I was able to reproduce rambling or failure to stop properly in transfo…
▷ #feedback-meta (2 messages):
- Brainstorming Infrastructure and Use Cases: @ayenem proposed a discussion on experiences and ideas regarding batch vs. online processing, deployment infrastructures tailored to specific use cases and constraints, as well as frequent re-training needs, tooling, and learned lessons.
- Query on Infrastructure Channel's Placement: @jeffreyw128 mentioned that there used to be an infrastructure channel and questioned whether such discussions should be categorized under performance.
▷ #rag (1 messages):
shacrw: reranking with ColBERT https://twitter.com/virattt/status/1749166976033861832
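For context on the linked thread: ColBERT-style reranking scores each document by "late interaction," where every query-token embedding takes its maximum similarity over the document's token embeddings and those maxima are summed. A minimal sketch of that scoring step, assuming the token embeddings are precomputed and L2-normalized:

```python
# Minimal sketch of ColBERT-style MaxSim reranking. Query and document token
# embeddings are assumed precomputed and L2-normalized, shape (tokens, dim).
import torch

def maxsim_score(query_emb: torch.Tensor, doc_emb: torch.Tensor) -> float:
    sim = query_emb @ doc_emb.T                 # (q_tokens, d_tokens) cosine sims
    return sim.max(dim=1).values.sum().item()   # best match per query token, summed

def rerank(query_emb, docs):
    # docs: list of (doc_id, doc_token_embeddings)
    scored = [(doc_id, maxsim_score(query_emb, emb)) for doc_id, emb in docs]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```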
Alignment Lab AI Discord Summary
Only 1 channel had activity, so no need to summarize…
- Envisioning an AI Top Thinker: User @sabu7003 proposed the idea of developing an AI agent with the expertise of behavioral economist Daniel Kahneman that can provide consultations and solutions in marketing and management. They asked whether such an application using Transformer Architecture has been considered.
- Character AI in Action: In response to @sabu7003, @desik_agi pointed out that Character AI has made it possible to interact with digital versions of historical figures like Socrates or Steve Jobs, which might align somewhat with @sabu7003's vision.
- Beyond Transformer Limitations: @rusch highlighted that the main challenge is not the Transformer architecture but rather the limitations of current language-modeling data and approaches, suggesting that more is needed to fulfill the vision discussed by @sabu7003.
- Identifying Development Avenues for AI: @rusch further added that future breakthroughs in AI might come from developments in multimodal systems, self-play, and advanced planning capabilities, pointing toward potential growth areas in the quest to develop more sophisticated AI agents.
The Datasette - LLM (@SimonW) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
The YAIG (a16z Infra) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.