We have long contended that the RAG style operations have been used for context (knowledge base, facts about the world) and memory (running list of facts about you) will diverge. The leading implementation was MemGPT and now it seems to have rolled out in both ChatGPT (with a weirdly roon-y tweet. more details from Joanne Jang) and LangChain.
OpenAI:
LangChain:
In some sense this is just a crossing over of something the LMstudio/Sillytavern roleplay people have had for a while now. Expectation is that it will mildly improve UX but not lead to a big wow moment since the memory modeling is quite crude at the moment, not humanlike, and subject to context limits.
Table of Contents
[TOC]
PART 1: High level Discord summaries
TheBloke Discord Summary
-
Unbounded Textual Contexts: Engineers are exploring new open-source large language models like the Large World Model, which boasts coherence with contexts up to 1 million tokens. Discussions include language support, as in Cohereās
aya
model, which covers 101 languages, and challenges working with jax-based tools during model installations. -
Nurturing Erotically Programmed Role Play: The community is dissecting performances of re-quantized Miqu models like MiquMaid-v2-70B, attuned for Erotic Role Play (ERP). Emphasis was on the impact of enhanced hardware, with a jump from 0.7t/s to 2.1t/s in tokens per second while using 12GB VRAM GPUs.
-
Instruct, Optimize, Repeat: Finetuning techniques explained include using Sequential Fine-Tuning (SFT) followed by Direct Preference Optimization (DPO) as improved RLHF/PPO, detailed on page 6 of a paper. Unsloth AIās
apply_chat_template
is touted over Alpaca to train LLMs for multi-turn conversations. -
JavaScript Meets Python in AI Development: Experimentation with JSPyBridge led to successful bridging of JavaScript and Python in expanding the SillyTavern project. This included addressing Windows-specific errors, like
cpu_embedding=True
to circumvent access violation issues and integrating Python classes asynchronously into JavaScript code. -
Confounding Losses in Model Training: An engineer observed an unexplained variance in training loss when finetuning Mixtral 8x7b qlora, resulting in higher losses compared to Mistral 7b despite similar datasets and hyperparameters. The matter remains open for community input or similar experiences.
LM Studio Discord Summary
-
Large Models Crying for RAM: Users like
@nonm3
and@theoverseerjackbright
battled errors loading large models in LM Studio due to limited RAM and VRAM. Suggestions included trying smaller model quants, and some faced GPU detection issues with LM Studio, prompting restart crashes. -
MedAlpaca Heads to the Clinic: Discussions on medical LLMs saw medAlpaca, a fine-tuned model for medical question answering, as a promising addition to
@pepito92i
ās medical project. Microsoftās phi-2 modelās absence from LM Studio was noted, with the possibility of it being converted to .gguf format by user TheBloke to use with llama.cpp. -
GPU Matchmaking and Overclocking: Hardware enthusiasts like
@luxaplexx
questioned NVLinkās role in memory cycling, ultimately suspecting cards like the 960 might not be NVLinked. Users discussed GPU upgrades for better performance with models, considering options like the RTX 3060 12GB. Others like@alastair9776
and@rugg0064
weighed the benefits and risks of overclocking for faster token generation. -
Quant Leap Forward: Eager anticipation for IQ3_XSS support in LM Studio, with users like
@n8programs
and@yagilb
expecting it in the next update. A GitHub pull request reflected the communityās excitement over upcoming 1.5 bit quantization. Meanwhile, preparations were suggested for downloading forthcoming, as-yet-unsupported models like IQ3_XSS. -
Beta Release Relief:
@rafalsebastian
ran into a stumbling block running LMstudio on CPUs with only AVX support.@heyitsyorkie
provided hope by directing to the 0.2.10 AVX beta release for Windows that enables compatibility, while still recommending an upgrade to AVX2 for optimal performance and offered a helpful link. -
Multi-Model Management Mystery:
@alluring_seahorse_04960
sought advice on running dual models simultaneously on one machine to avoid repetition errors, using a Conda environment but steering clear of VMs. The nature of the repetition errors in question was humorously prodded by@wolfspyre
, awaiting further details.
LAION Discord Summary
-
Magvit V2 Sparks Interest and Debate: Engineers delved into the technicalities of reproducing the Magvit V2 model, with discussions focusing on appropriate datasets, parameters for video compression and understanding, and the mention of experiments on the lfq side of Magvit2. The community also saw a surge in interest around MAGVIT, likely due to influencer mentions.
-
Scrutinizing Stable Cascadeās Efficacy: Stability AIās Stable Cascade model spurred intense conversations regarding its high VRAM requirements, optimization issues, and erroneous inference time graphs. Technical issues reported included challenges with text clarity in images and the inability to run models in float16, alongside performance evaluations on GPUs like the 3090.
-
Legal Frays in AI-Generated Imagery: The community engaged in a heated discussion about copyrights and the legality of AI-generated images, highlighting a TorrentFreak article about a court dismissing authorsā copyright infringement claims against OpenAI.
-
Ethical Conundrums with AI and Adult Content: The conversation shifted to the role of adult content in driving technological progress, with some participants recognizing the historical pattern while others doubted its constructive impact on AI. Topics included the rise of non-consensual deepfake pornography, its market dynamics, and the potential ethical pitfalls plaguing the AI community.
-
Calls for Higher AI Image Standards: Discussions included technical insights into improving AI image generation, such as the viability of VAE encoder training. Members also reflected on the communityās photorealism standards, expressing the need for better quality in AI-generated images.
Eleuther Discord Summary
-
Checksum Hunting Season Open:
@paganpegasus
provided checksums for The Pile zst shards and pointed to EleutherAIās hashes and the Discord pins. -
Image Content Classification Tools Discussed: OwlViT and CLIP models were recommended as tools for discerning the content of images and the concept of ānothingā in imagery was discussed due to an inquiry by
@everlasting_gomjabbar
. -
Paper Review in Collaborative Spirit: A user received appreciative feedback on a manuscript titled āDonāt think about the paper,ā with the EleutherAI Discord community being credited in the paperās acknowledgements.
-
Cloud Computing Resources Examined: GCP and Colab surfaced as favorable cloud resources for NLP classification model training, with discussions encompassing cost-benefit analyses of platforms like runpod and vast.ai.
-
Research Computing Power Up for Grabs: EleutherAIās computational resources were said to be available for collaboration on a semi-custom LLM project, with the caveat of having a clear research agenda and collaborative value proposition.
-
Semantic Scholarās Linking Logic Revealed: Arxiv papers are automatically linked to authors on Semantic Scholar, with room for manual corrections to ensure accuracy.
-
Fractal Fun with Neural Training Parameters:
@jbustter
shared fractals created from neural network hyperparameters, highlighting a blog by Jascha Sohl-Dickstein that correlates fractals with training convergence/divergence. -
A Deep Dive into Data Presentation for ML: A discussion was sparked concerning active learning and methods for models to choose their own data presentation sequence.
-
Enriching Encoder-Decoder Models with Unsupervised Data: Strategies to employ unsupervised datasets effectively in encoder-decoder models were discussed.
-
New NLP Robustness Method Flies Off the Press: A paper focusing on test-time augmentation (TTA) to enhance text classifiersā robustness was published, with the author thanking the community for support.
-
The Quest for Interpretability Insight:
@jaimerv
asked for updated resources on interpretability methods beyond the standard Representation Engineering paper. -
Summoning Collaborators for Hallucination Leaderboard: A call for contributions to a hallucinations leaderboard was made, requesting assistance with tasks, datasets, metrics, and result evaluations.
-
Aligning Pythia with Practice: Concerns were aired about potential misalignments between training batches and checkpoints in the 2.8b size Pythia deduped suite, with follow-up discussions suggesting opportunities for a publication on Pythiaās reliability.
LlamaIndex Discord Summary
LlamaIndex v0.10 Marks Major Milestone: LlamaIndex v0.10 has been released, presenting notable advancements including a new llama-index-core
package and PyPi packages for every integration/template. Detailed information on migration is accessible through their comprehensive blog post and documentation.
Webinar on No-Code RAG with LlamaIndex: A webinar demonstrating the creation of no-code Retrieve and Generate (RAG) apps using LlamaIndex.TS is set up with Flowise co-founder Henry Heng. Registration for the Friday event is available here.
Troubleshooting LlamaIndex: Engineers faced challenges with migration following LlamaIndexās update and were pointed to a Notion migration guide for assistance. Furthermore, for configuration queries like chunk_size
post-ServiceContext depreciation, engineers are advised to refer to the new Settings
documentation and relevant LlamaIndex GitHub resources.
RAG App Building with Dewy Tremendously Simplified: A comprehensive guide to building a full-stack RAG app using NextJS, OpenAI, and the open-source knowledge base Dewy has been shared. The tutorial is aimed at grounding language models in precise, reliable data and can be studied in detail here.
Handling Document Complexity and Enhancing Enterprise with LlamaIndex: Users engaged in discussions about filtering complex documents and integrating LlamaIndex to enhance enterprise efficiency with tools such as Slack, Jira, and GDrive. Also, creating multiple agents for merging different document sources was considered, referencing the possibility of using traditional indexing techniques instead of high-cost LLMs for dynamic filtering.
HuggingFace Discord Summary
-
Hugging Face Accelerates with Message API: Hugging Face launched a new Message API compatible with OpenAI, aimed at streamlining use of inference endpoints and text generation services with client libraries. Theyāve also advanced their offerings with new releases like Datatrove on PyPI, Gradio 4.18.0, and tools like Nanotron and Accelerate 0.27.0 for 3D parallelism training. Additional partnerships and resources, such as a Codecademy AI course and a blog post on SegMoE, support the continuous learning and innovation in their community.
-
Search Engine Woes and Hosting Queries in Focus: Technical discussions spotlighted the difficulties in creating search engines with mentions of approaches like TF-IDF and BM25, and the use of spaCy for Part of Speech tagging. Other conversations pivoted to queries about hosting custom models and serverless inferencing solutions, as well as the practicality of running 100B+ parameter models on enthusiast-level hardware.
-
Template Talk and Model Deployment Discussions: Users addressed the need for a simple chatbot development prototype capable of database interaction and email API integration, featuring resources like Microsoftās AutoGen on GitHub and the potential of AutoGen Studio. Challenges around deploying finetuned machine learning models such as mistarl_7b_gptq for fast inferences were raised, with emphasis on choosing the right platforms or libraries for the task.
-
Glimpse into Creator Innovations: Members of the community showcased their creative projects, including GIMP 3.0 plugins interfacing with Automatic1111, development of an automated image tagging model for diffusion tasks, and updates to tools like PanoramAI.xyz introducing a āremixā mode for image transformations. Excitement built around AI-applications in fashion design as well, demonstrating the breadth of applications being pursued.
-
Analyzing S4 and Advancing NLP: The community shared their insights into the S4 architecture ideal for long-range sequences and sought clarity on its implementation. The paper on LangTest got introduced, which offers testing and augmenting capabilities for NLP models. Topics extended to extracting language identifiers from models like XLM-RoBERTa and converting natural language into formal algebraic expressions.
-
Enthusiasm for Diffusion and Emerging Models: Conversations sparked around facilitating multi-GPU training for diffusion model fine-tuning, with mentions of scripts such as
train_text_to_image.py
. The successful deployment of models like mistarl_7b_gptq for fast inference, and effective text generation with stable cascade were discussed. The buzz was palpable around the teased development of a new terminus model. -
Complications in Computer Vision Explored: The channel delved into challenges like hierarchical image classification, with resource suggestions including an ECCV22 paper on the same. Members discussed requirements for Gaussian splats, industry-grade image retrieval systems and sought collaboration on multimodal projects.
Nous Research AI Discord Summary
-
LongCorpus Dataset Unveiled for Pre-Training: The new LongCorpus-2.5B dataset is released, featuring 2.5 billion tokens from various domains, specifically curated for long-context continual pre-training and designed to minimize n-gram similarity with training sets.
-
Coherence Preserved in Scaling Models: Scaling with āself-extendā is considered superior over ārope scalingā for maintaining coherence, as indicated by the implementation in llama.cpp, and offers the benefit of requiring no setup, fine-tuning, or additional parameters.
-
Persistence and Resistance in AI Agents and Models: LangGraph agents can persist their state across interactions, as shown in a YouTube demonstration, while the Gemini model shows resistance, with its refusal tendencies prompting comparisons unfavorable to GPT-4.
-
Multimodal AI Breakthrough with Reka Flash: Reka Flash, a new 21B fast multimodal language model, is introduced and now available in public beta, promising to measure up to major models like Gemini Pro and GPT-3.5. The initiative can be followed on Reka AIās announcement page.
-
CUDA Pains and WAVeform Gains in AI Research: The ZLUDA project aimed to run CUDA on AMD GPUs can no longer be considered active, and a fresh perspective in AI research proposed in an arXiv paper, suggests wavelet transforms could enhance Transformers by addressing both positional and frequency details efficiently.
Mistral Discord Summary
-
Newbies Get Model Recommendations: Participants recommended instruct models for chat-GPT-like interactions to newcomer
@nana.wav
, with the clearer instruction-following focus as opposed to the more general autocompletion capabilities of other models. -
RAG Setup and Model Debates Heat Up: A guide on integrating Mistral with RAG was shared, while the effectiveness of LangChain vs. LlamaIndex was debated; separately, DSPy was touted for leveraging LLMs for programming rather than chatting, adorned with a supportive Twitter link.
-
Deployment Dilemmas and Solutions: Docker deployment via ollama or vllm projects was suggested, while others discussed API alternatives and faced cloud quota barriers; meanwhile, success stories involved deploying Mixtral on HuggingFace despite the hiccups with AWQ quantization.
-
Fine-Tuning Finesse and RAG Revelations: Users discussed fine-tuning vs. RAG with insights into LLM base knowledge importance; guidance was given on input data structuring for LLM output enhancement and queries about prompt versioning tools surfaced.
-
Humans in Tech and AI Seek Touchpoints: French librarian (
@maeelk
) sought internship opportunities in psychology and AI; the cost of innovatively building audio-inclusive S2S models sparked discussions around budget constraints and investment needs. -
Technical Troubles and Support Suggestions:
@ingohamm
faces hurdles with TypingMindās API key and a suggestion was made to contact [email protected] for assistance with API and subscription issues.
Perplexity AI Discord Summary
-
Perplexity AI Outshines Rivals in Complex Query Handling:
@tbrams
tested Perplexity AI with a difficult question from the āGeminiā paper and found it outperformed Googleās Gemini service and OpenAI, answering more quickly. The test results from Perplexity AI are documented here. -
Perplexityās Potential in API Customization Highlighted: The PPLX API allows for custom search queries using parameters like
"site:reddit.com OR site:youtube.com"
, as mentioned by@me.lk
. However, several users have encountered issues with the API such as performance hiccups (@andrewgazelka
) and nonsensical responses (@myadmingushwork_52332
). -
Perplexity AI Subscription and Renewal Queries Addressed: Users are seeking details on trial subscriptions and renewal processes for Pro subscriptions, with inquiries about token refresh rates also surfacing. There is currently no early access program for new Perplexity features as confirmed by
@icelavaman
. -
Promising Enhancements and Community Collaborations: Perplexity AI is receiving community praise for tools like the pplx shortcut action (
@twodogseeds
). Meanwhile,@ok.alex
is encouraging a community-driven effort to contribute to an alternative feed/newsletter Alt-D-Feed. -
Seeking Direct Support Channel for Sensitive Data Issues: A user (
@kitsuiwebster
) has expressed the need for direct assistance with a sensitive company data issue, avoiding public disclosure while lacking response from support channels.
OpenAI Discord Summary
-
ChatGPT Remembers Your Favorite Color: OpenAI announced a new memory feature for ChatGPT, rolling out to select users, enabling ChatGPT to remember user preferences and details over conversations for a more personalized experience. Users can control what ChatGPT remembers and can switch off this feature.
-
AI-Assistants in Creative Process Paid Talks: A UK researcher,
@noodles7584
, is looking to compensate community members for a 30-minute discussion on AI use in creative workflows. -
Performance Quirks in GPT Variants: The community reported fluctuations in GPT-4ās task handling, and Abacus.AIās Smaug-72B was noted for outperforming GPT-3.5, while ChatGPT-4 seems hesitant to generate full code snippets.
-
Fine-Tuning AI to Watch Videos? Not Yet: Discussion in #gpt-4-discussions clarified that while GPT can describe images from a video with its vision capabilities, it cannot yet be fine-tuned for video-specific knowledge or tasks.
-
Exploring and Perfecting Prompt Engineering: Good prompt engineering was highlighted as involving clear instructions and precision, with a focus on fostering simple storytelling in text-based AI adventures and recognizing differences between prompt engineering and API infrastructure development.
OpenAccess AI Collective (axolotl) Discord Summary
-
Axolotl Embraces MPS, Thanks to GitHub Heroes: Maxime added MPS support in the axolotl project via pull request #1264, referencing the importance of a PyTorch pull request #99272. Clarification on contributor identities highlighted the importance of collective recognition in open source.
-
Chat In The Time Of Datasets: The MessagesList standard for chat datasets proposed by
@dctanner
aims for cross-compatibility and is under discussion. The format might include conversation pairs, greetings, and assistant-initiated closures, with challenges noted in JSON-schema validation. -
Axolotl Tokenized Right, Check the Debug Flag: Users are troubleshooting tokenization within axolotl, with advice to inspect the tokenizer configs and a recommendation to use a debug flag for verification.
-
Model Query Woes and Training Queries Grow: Queries about improving modelās multilingual capabilities, LoRA adapter inferencing, and model parallelism were discussed, with solutions ranging from pre-training needs to updates in transformers and DeepSpeed Zero 3 configs for better functionality.
-
Fine-tune or Re-train? Duplicate Dataās Pain: The impact of training data overlap and finetuning practices were questioned, highlighting concerns about reusing text that a model may have encountered during pretraining.
-
RunPod Image on Vast.AI, A Smooth Sail!: The Axoltl RunPod image was reported by
@dreamgen
to work seamlessly on Vast.AI, underscoring the inter-operability with cloud infrastructure providers.
LangChain AI Discord Summary
-
LangChain Unveils Memory Journaling App:
@hwchase17
introduced a new journaling app featuring LangChain memory module, inviting feedback for the early version akin to OpenAIās ChatGPT with memory feature. Try and give feedback using this journal app and watch the intro video. -
LangChain Community Tackles Diverse Technical Challenges: Topics covered included the possibility of LangChainās Android integration, pre-processing benefits for efficient embeddings, the search for a capable PDF parser, and calls for improved documentation structure. Additionally, a user faced dependency issues while updating Pinecone Database to v2 with LangChain, which was promptly addressed.
-
Scaling and Integration Enquiries in Langserve Channel: Discussions included questions about scaling Langserve and using Langsmith for deployment. There was a query about exposing a chain from a NodeJS app and an unaddressed issue regarding disabling intermediate steps in the playground. Connection issues with an OpenAI API call from a k8s cluster-based app were also described.
-
Dewy RAG Application with NextJS and OpenAI Guide Shared:
@kerinin
contributed a guide exploring a full-stack RAG application, utilizing NextJS, OpenAI API, and Dewy, focusing on reducing hallucinations and improving model response accuracy. The full guide is available here. -
Quest for a Functional PDF Parser and Custom Calculator: Within the tutorials channel, the search for a superior contextual PDF parser to Adobe API, and guidance for building a Langchain-based calculator were topics of discussion, aiming for practical integrations and solutions in AI workflows.
DiscoResearch Discord Summary
- Seeking Argilla Hosting Solutions:
@drxd1000
requested advice for hosting an Argilla server capable of supporting multiple annotators with no clear resolution reached. - Layer Selective Rank Reduction in the Spotlight:
@johannhartmann
discussed an implementation of āLayer Selective Rank Reductionā for mitigating continual training forgetting. The method targets statistically less significant layer parts, and a GitHub repository was mentioned. - Overcoming OOM With Mixtral:
@philipmay
encountered an Out of Memory error with a mixtral model, and@bjoernp
suggested using multi-GPU support, mentioning that two A100s might alleviate the issue. - Cross-Language Toxicity Detection Dataset:
@sten6633
sought a German toxicity evaluation dataset, considering the translation of ToxiGen from Hugging Face, which requires access agreement. - New Computational Technique Teased:
@phantine
teased a technique named āUniverses in a bottleā with implications for the P=NP problem, linked to a GitHub page, but details were sparse. - BM25 Search Strategy Proves Effective:
huunguyen
reported success using BM25 with additional querying and reranking to enhance search capabilities, and successfully indexed the entirety of Wikipedia into an index under 3GB. - German AI Model Update Inquiry: thomasrenkert asked about the release timeline for version 2 of the German model or a Mixtral variant, but no additional details were provided.
CUDA MODE Discord Summary
-
CUDA Compatibility Crusade: Members discussed achieving CUDA binary compatibility on HIP/ROCm platforms, driven by the ZLUDA project on GitHub, which is a CUDA on AMD GPU initiative. Amidst technical emoji enthusiasm, there were musings about market monopolies and AGI, alongside personal experiences with Radeon hardware issues related to dynamic memory allocation.
-
Generative AI Jobs Galore: A Deep Tech Generative AI startup in Hyderabad is hiring ML, Data, Research, and SDE roles, with applications welcomed here. However, the legitimacy of the job posting was questioned, flagging the need for moderator attention.
-
Compute Shaders and Matrix Math Musings: Inquiries on educational materials for CUDA led to The Book of Shaders recommendation, while the discussion in the PMPP book channel debated the benefits, or lack thereof, of transposing matrices to reduce cache misses in multiplication, indicating varied opinions but no consensus on observed benefits.
-
Apple Chips Enter Monitoring Realm:
@marksaroufim
shared asitop, a CLI performance monitoring tool designed for Apple Silicon, likening it totop
ornvtop
in utility for engineers leveraging Appleās technology. -
GPU Experiments and Job Shuffling: An engineer successfully relocated an Asus WS motherboard to a miner setup, effectively running large quantized models on a NVIDIA 3060 GPU. This indicates a hands-on approach within the community towards custom hardware configurations.
Latent Space Discord Summary
- Reka Enters the Model Arena: A new AI entity named the Reka model has sparked interest in the community following a tweet shared by
@swyxio
. The excitement is palpable with discussions around the tweet found here. - Investor Insights Meet AI:
@swyxio
spotlighted a VC podcast delving into AI, which could be of significant interest to engineering aficionados. The podcast episode is accessible here. - BUD-E Buzz: BUD-E, an empathetic and context-aware open voice assistant developed by LAION, could signal a new direction in conversational AI. More details are laid out on the LAION blog.
- Pondering the Definition of Agents: The community exchanged views on defining āagents,ā with
@slono
suggesting that they are goal-oriented programs that require minimal input from users, a concept significant in the realm of AI development. - Karpathyās OpenAI Exit Raises Questions: The AI community is abuzz over the news of Andrej Karpathy leaving OpenAI, with
@nembal
pointing to an article from The Information and speculation about AGI influences. The article is accessible here.
LLM Perf Enthusiasts AI Discord Summary
-
Minding the Model Size for M2 Max:
@potrock
inquired about running Mistral model sizes on an M2 Max with 32GB, and@natureplayer
advised that a 4GB model would be the feasible option, cautioning against an 8GB model and noting that 5GB models may be unstable. -
GPT-5 Rumor Mill:
@res6969
expressed humorous doubt about GPT-5ās existence, suggesting that speculation on the modelās upcoming release is overstated, with others joining the jest with emojis. -
Enhanced Memory in ChatGPT:
@potrock
highlighted a new feature tested in ChatGPT, based on a blog post, where it can retain user preferences and information across sessions for more personalized interactions.
AI Engineer Foundation Discord Summary
- Weekly Sync-Up Teases Déjà vu:
@._z
playfully announces the start of the weekly team meeting likening it to a recurring Déjà vu experience. - Member Bows Out from Meeting:
@juanreds
sends regrets for being unable to attend the weekly meeting, offering apologies to the team. - Call for AI Hackathon Co-Hosts:
@caramelchameleon
seeks collaborators to co-host an AI developers hackathon in tandem with game developers in the lead-up to the GDC. - Hackathon Offers Dual Attendance Modes: The hackathon mentioned by
@caramelchameleon
has options for attendance both online and onsite in San Francisco. - Hackathon Organizer Steps Up:
@yikesawjeez
shows eagerness to get involved in organizing the hackathon and highlights their expertise with Bay Area events.
Skunkworks AI Discord Summary
- Direct Messaging Initiated: User
@bondconnery
has put out a request for a private message. - Exploring LLaVA Framework Integration:
@CodeMan
inquired about integrating the LLaVA framework with an SGLang server and SGLang worker, aiming for a potentially more specialized setup than a conventional model worker. - Off-Topic Video Share Ignored: A non-technical video link was shared, not relevant to the engineering discussions.
The Alignment Lab AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
The Datasette - LLM (@SimonW) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
PART 2: Detailed by-Channel summaries and links
TheBloke ā· #general (1460 messagesš„š„š„):
- Exploring the Limits of Large Language Models: Users are discussing new open-source large language models capable of handling extremely long contexts, such as the Large World Model which claims to work coherently with contexts up to 1 million tokens. There are also mentions of the Cohereās
aya
model that supports 101 languages. - The Quest for Efficient Multimodal AIs: Conversations focus on multimodality in AI with references to models handling visual inputs and potential outputs, indicating significant advancements beyond text-based models. The jax-based tools required to run the models are causing installation hiccups for some users.
- Models Under Scrutiny: The community is very active in testing released models, mentioning issues such as TUX dependency problems and a
ValueError
during setup, indicating some challenges in getting the advanced models running smoothly. - Users Share Knowledge: Experienced users offer insights and assistance on how to handle models and UIs for various tasks, including long-context quantization in existing frameworks like ExLlama v2. Discussions also touch on the possibility of banishing stop tokens to encourage longer continuous outputs.
- Towards Intelligent Role-Playing: There is a discussion on finding the balance between RP-oriented models and smarter generalized ones, with mentions of a Mixtral variant (
BagelMIsteryTour
) that might better fulfill user requirements for intelligent and adaptable model behavior.
Links mentioned:
- Context ā share whatever you see with others in seconds: no description found
- Lil Yachty Drake GIF - Lil Yachty Drake - Discover & Share GIFs: Click to view the GIF
- Memory and new controls for ChatGPT: Weāre testing the ability for ChatGPT to remember things you discuss to make future chats more helpful. Youāre in control of ChatGPTās memory.
- brucethemoose/LargeWorldModel_LWM-Text-Chat-128K-55bpw Ā· Hugging Face: no description found
- YOLO: Real-Time Object Detection: no description found
- Kooten/BagelMIsteryTour-v2-8x7B-5bpw-exl2 Ā· Hugging Face: no description found
- no title found: no description found
- Think Bigger Skeletor GIF - Think Bigger Skeletor Masters Of The Universe Revelation - Discover & Share GIFs: Click to view the GIF
- no title found: no description found
- CausalLM/34b-beta Ā· Hugging Face: no description found
- SimSim93/CausalLM-34b-beta_q8 Ā· Hugging Face: no description found
- GitHub - jy-yuan/KIVI: KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache: KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache - jy-yuan/KIVI
- The Verge: How NOT to Build a Computer: SPONSOR: Go to http://expressvpn.com/science to take back your Internet privacy TODAY and find out how you can get 3 months free.Link to the Vergeās awful viā¦
- LWM/lwm/llama.py at main Ā· LargeWorldModel/LWM: Contribute to LargeWorldModel/LWM development by creating an account on GitHub.
- https://drive.google.com/drive/folders/1my-8wOIYXmfnlryDbwJ20_y6PFCqRfA-?usp=sharinghttps://drive.google.com/drive/folders/1my-8wOIYXmfnlryDbwJ20_y6PFCqRfA-?usp=sharingData Challenge - Aether 2024: In order to participate in the Data Challenge organised by Enigma as part of Aether, Please fill out this form Event Date & Time: Wednesday, February 14th - 2:30 pm Please double-check your detailā¦
- LargeWorldModel (Large World Model): no description found
- GitHub - lhao499/tux: Tools and Utils for Experiments (TUX): Tools and Utils for Experiments (TUX). Contribute to lhao499/tux development by creating an account on GitHub.
- GitHub - itsme2417/PolyMind: A multimodal, function calling powered LLM webui.: A multimodal, function calling powered LLM webui. - GitHub - itsme2417/PolyMind: A multimodal, function calling powered LLM webui.
- llama.cpp/examples/server at master Ā· ggerganov/llama.cpp: Port of Facebookās LLaMA model in C/C++. Contribute to ggerganov/llama.cpp development by creating an account on GitHub.
- Tweet from Cohere For AI (@CohereForAI): Today, weāre launching Aya, a new open-source, massively multilingual LLM & dataset to help support under-represented languages. Aya outperforms existing open-source models and covers 101 different laā¦
- OpenAI Researcher Andrej Karpathy Departs: Andrej Karpathy, one of the founding members of OpenAI, has left the company, a spokesperson confirmed. Karpathy, a prominent artificial intelligence researcher, was developing a product he has descriā¦
- CohereForAI/aya-101 Ā· Hugging Face: no description found
- GitHub - valine/NeuralFlow: Contribute to valine/NeuralFlow development by creating an account on GitHub.
- ChatGPT but Uncensored and Free! | Oogabooga LLM Tutorial: ChatGPT but uncensored and free, well its now possible thanks to the open source AI community! In this video I show you how to set up the Oogabooga graphicalā¦
- LWM/lwm/vision_chat.py at main Ā· LargeWorldModel/LWM: Contribute to LargeWorldModel/LWM development by creating an account on GitHub.
- New emails reveal scientists believed COVID-19 was man-made: New emails have revealed scientists got together to discuss the origins of COVID, suspecting it was man-made, before deciding to tell the public it originateā¦
- GitHub - itsme2417/PolyMind: A multimodal, function calling powered LLM webui.: A multimodal, function calling powered LLM webui. - GitHub - itsme2417/PolyMind: A multimodal, function calling powered LLM webui.
- GitHub - LargeWorldModel/LWM: Contribute to LargeWorldModel/LWM development by creating an account on GitHub.
- GitHub - acorn-io/rubra: AI Assistants, LLMs and tools made easy: AI Assistants, LLMs and tools made easy. Contribute to acorn-io/rubra development by creating an account on GitHub.
- unalignment/weeeeee.0 Ā· Hugging Face: no description found
- unalignment/weeeeee.1 Ā· Hugging Face: no description found
- unalignment/weeeeee.2 Ā· Hugging Face: no description found
- CohereForAI/aya_dataset Ā· Datasets at Hugging Face: no description found
- CohereForAI/aya_collection Ā· Datasets at Hugging Face: no description found
TheBloke ā· #characters-roleplay-stories (154 messagesš„š„):
-
Exploring Miqu Variants:
@superking__
opened a discussion about the performance of the Miqu models, particularly after being re-quants from the original GGUFs (Googleās Generative Unsupervised Feature extraction).@soufflespethuman
mentioned MiquMaid-v2-70B, a variant specifically fine-tuned for ERP (Erotic Role Play), and provided sensitive content links to various versions on Hugging Face, which have been marked due to their nature. -
Performance Gain with Better Hardware:
@superking__
shared their experience on performance improvement from āpainfully slowā to āalmost usableā by upgrading their hardware to 12GB VRAM, which changed the given modelās tokens per second from 0.7t/s to 2.1t/s. -
Model Comparisons and Recommendations: In the context of roleplay and storytelling, users discussed various models.
@spottyluck
praised Nous Capybara Limarpv3 34B for its capabilities and provided a link to the model on Hugging Face.@wolfsauge
shared a sketch about āThe Continentalā featuring Christopher Walken and@eqobaba
inquired about appropriate models and settings for engaging in NSFW ERP, mentioning a specification of 48GB VRAM and RTX A600. -
Discussing Model Output Improvement:
@neriss
suggested using a higher temperature or lower minimum probability to reduce repetition and improve creativity in AI model outputs. The conversation highlighted variations in temperature settings, with@dercheinz
suggesting higher temperatures, while@neriss
advised lower ones, each to counteract repetitive or uncreative responses from models. -
Dataset Cleaning Challenges and Strategies:
@c.gato
and@potatooff
exchanged thoughts on cleaning datasets manually, with@c.gato
seeking advice on how to perform ngram analysis to prevent overtraining on specific ngrams.@mrdragonfox
recommended using Pythonās pandas library for handling tabular or JSON data, sharing a gist for guidance.
Links mentioned:
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- TheBloke/Nous-Capybara-limarpv3-34B-GGUF Ā· Hugging Face: no description found
- Models - Hugging Face: no description found
- gist:f786564868357cde5894ef6e2c6f64cf: GitHub Gist: instantly share code, notes, and snippets.
- The Continental: Anticipation - Saturday Night Live: Subscribe to SaturdayNightLive: http://j.mp/1bjU39dSEASON 26: http://j.mp/14GYJ6nThe night air is tinged with anticipation. Itās time to meet The Continentalā¦
- Happy Fun Ball - SNL: Happy Fun Ball seems great until you hear all the potential side effects. [Season 16, 1991]#SNLSubscribe to SNL:Ā https://goo.gl/tUsXwMStream Current Full Epiā¦
- NeverSleep/MiquMaid-v2-70B Ā· Hugging Face: no description found
- NeverSleep/MiquMaid-v2-70B-GGUF Ā· Hugging Face: no description found
- NeverSleep/MiquMaid-v2-70B-DPO Ā· Hugging Face: no description found
- NeverSleep/MiquMaid-v2-70B-DPO-GGUF Ā· Hugging Face: no description found
TheBloke ā· #training-and-fine-tuning (43 messagesš„):
-
Understanding Finetuning Techniques:
@starsupernova
explained that Mixtral ā Instruct was trained using SFT on an instruction dataset followed by Direct Preference Optimization (DPO) on a paired feedback dataset, as detailed on page 6 of their paper. DPO is described as an optimized form of RLHF/PPO finetuning. -
Unsloth AIās Apply Chat Template:
@starsupernova
, likely the founder of Unsloth AI, highlighted the use ofapply_chat_template
instead of Alpaca for training an LLM on multi-turn conversation datasets. They also hinted at uploading a new notebook with all chat templates to simplify the process. -
Augmentoolkit for Instruct-Tuning Datasets: In the conversation,
@mr.userbox020
shared a link to a GitHub repository offering a toolkit to convert Compute and Books Into Instruct-Tuning Datasets. Although@starsupernova
was not familiar with it, they suggested trying it out as it appeared promising. -
Anticipation for Updated Training Resources:
@avinierdc
is awaiting an updated notebook from@starsupernova
for fine-tuning Mistral on multi-turn conversation datasets.@starsupernova
assured they would ping when itās available on the Unslothās Discord server. -
Unexplained Variation in Training Loss:
@dreamgen
reported observing a 2x higher training and evaluation loss when fine-tuning Mixtral 8x7b qlora compared to Mistral 7b, despite using the same dataset and similar hyperparameters, and inquired if others had seen something similar.
Links mentioned:
GitHub - e-p-armstrong/augmentoolkit: Convert Compute And Books Into Instruct-Tuning Datasets: Convert Compute And Books Into Instruct-Tuning Datasets - e-p-armstrong/augmentoolkit
TheBloke ā· #coding (8 messagesš„):
-
Interest in Collaboration Sparked:
@_b_i_s_c_u_i_t_s_
expressed interest in an unspecified topic, potentially around chatbot implementation, which was well received by@mr_pebble
, finding it motivating to progress on implementing various chat methods. -
Bridging JavaScript and Python:
@spottyluck
experimented with expanding the SillyTavern project to use a JavaScript-Python bridge, utilizing JSPyBridge to potentially adapt and enhance functionalities. They shared how it enabled testing of Microsoftās LLMLingua, despite some issues with prompt mangling. -
Using Python Classes in JavaScript:
@spottyluck
provided code examples illustrating the ease of creating Python classes within JavaScript using JSPyBridge, along with an asynchronous function,compressPrompt
, which demonstrates the interaction between languages to compress prompts. -
Modifications to Handle Windows Errors and Devices: In their continued development,
@spottyluck
modified Intelās BigDL.llm transformer to support specific requirements, likecpu_embedding=True
on Windows due to access violation errors, and dealing with model device allocation issues usingmodel.to()
. -
Compression Process Integration into Routing:
@spottyluck
explained integrating prompt compression into their web service by adding a flag to the/generate
router post and using conditional logic to process the prompt through the bridge, demonstrating how Python can operate as if it were a JavaScript class.
Links mentioned:
GitHub - extremeheat/JSPyBridge: š. Bridge to interoperate Node.js and Python: š. Bridge to interoperate Node.js and Python . Contribute to extremeheat/JSPyBridge development by creating an account on GitHub.
LM Studio ā· #š¬-general (202 messagesš„š„):
-
Struggles with Large Models: Users like
@nonm3
encountered errors while loading large models in LM Studio due to insufficient RAM and VRAM, with suggestions to try smaller model quants. Others like@theoverseerjackbright
faced issues with LM Studio not detecting GPUs correctly and crashing post-restart. -
Software Seekers and Recommendations:
@tvb1199
was in search of client software that can interact with LM Studio for RAG capabilities, and was pointed towards AGiXT, while@pierrunoyt
and others discussed Nvidiaās āChat with RTXā with RAG features as a potential game-changer. -
Compatibility Inquiries: Several users such as
@wizzy09
had trouble installing or opening LLM Studio on unsupported platforms like a 2014 MacBook Pro, with clarifications from users like@heyitsyorkie
explaining that LMStudio does not work on Intel Macs. -
Nvidiaās Chat with RTX Triggers Interest: The community showed a keen interest in Nvidiaās āChat with RTXā. Users like
@hypocritipus
were intrigued by the RAG feature, hoping for a similar easy-install, no-dependency RAG feature in LM Studio. -
LM Studio Usage and Model Discussions: Users like
@bigboimarkus
expressed satisfaction with LM Studio for tasks such as proofreading, whereas@mr.stark_
queried about models that learn on the fly. Conversations included the functionality and integration with other tools like Ollama and Automatic1111. -
General Community Assistance and Banter: Throughout, community members engaged in sharing tips, offering troubleshooting advice, including suggestions for alternatives or downgrading versions, and occasionally joked about AI capabilities such as predicting lottery numbers.
Links mentioned:
- Stable Cascade - a Hugging Face Space by multimodalart: no description found
- cmp-nct/Yi-VL-6B-GGUF at main: no description found
- TheBloke/CodeLlama-70B-Instruct-GGUF at main: no description found
- Chost Machine GIF - Chost Machine Ai - Discover & Share GIFs: Click to view the GIF
- System prompt - Pastebin.com: Pastebin.com is the number one paste tool since 2002. Pastebin is a website where you can store text online for a set period of time.
- NVIDIA Chat With RTX: Your Personalized AI Chatbot.
- The unofficial LMStudio FAQ!: Welcome to the unofficial LMStudio FAQ. Here you will find answers to the most commonly asked questions that we get on the LMStudio Discord. (This FAQ is community managed). LMStudio is a free closedā¦
- The unofficial LMStudio FAQ!: Welcome to the unofficial LMStudio FAQ. Here you will find answers to the most commonly asked questions that we get on the LMStudio Discord. (This FAQ is community managed). LMStudio is a free closedā¦
LM Studio ā· #š¤-models-discussion-chat (75 messagesš„š„):
- MedAlpaca for Medical LLMs: User
@heyitsyorkie
suggested medAlpaca, a model fine-tuned for medical question answering, for@pepito92i
ās project on LLMs in the medical field. - Phi-2 Model Discussions:
@.dochoss333
inquired about the absence of the official āmicrosoft/phi-2ā model in LM Studio, and@heyitsyorkie
clarified that itās not a GGUF model and thus wonāt show up.@hugocapstagiaire_54167
mentioned user TheBloke might have transformed it into a .gguf for usability with llama.cpp. - LLama.cpp and Model Support:
@jedd1
puzzled over why some models wouldnāt load, and@heyitsyorkie
pointed out that the Yi-VL models are unsupported in the current build of llama.cpp, requiring an update for compatibility. - LM Studio Assistant Functionality Inquiry: User
@edu0835
inquired about the possibility of creating an assistant in LM Studio with the ability to utilize PDFs or books for a medical assistant application, without a direct response provided at this time. - Model Performance Comparisons Engage Community: Users like
@kujila
and@heyitsyorkie
engaged in comparisons between different language models, with discussions on model specificity, ethical behavior of AI models, and suggestions to try out models like Deepseek Coder Ins 33B.
Links mentioned:
- Nexesenex/Senku-70b-iMat.GGUF at main: no description found
- Hi Everybody Simpsons GIF - Hi Everybody Simpsons Wave - Discover & Share GIFs: Click to view the GIF
- TheBloke/medicine-chat-GGUF Ā· Hugging Face: no description found
- wolfram/miquliz-120b-v2.0-GGUF Ā· Hugging Face: no description found
- GitHub - kbressem/medAlpaca: LLM finetuned for medical question answering: LLM finetuned for medical question answering. Contribute to kbressem/medAlpaca development by creating an account on GitHub.
- The new Yi-VL-6B and 34B multimodals ( inferenced on llama.cpp, results here ) Ā· ggerganov/llama.cpp Ā· Discussion #5092: Well, their benchmarks claim they are almost at GPT4V level, beating everything else by a mile. They also claim that CovVLM is one of the worst (and itās actually the best next to GPT4, by far) Onā¦
LM Studio ā· #š-hardware-discussion (140 messagesš„š„):
-
NVLink and Memory Cycling Queries:
@luxaplexx
asked if GPUs were NVLinked and how memory cycles through in a multi-GPU setup. The consensus, including from@heyitsyorkie
, is that they likely arenāt NVLinked due to potential CUDA issues, especially with older cards like the 960. Users are considering whether different generations of NVIDIA GPUs like the 1080 and 1060 6g can work together effectively. -
Discussions on Upgrading to Better GPUs: Several users, including
@crsongbirb
and@heyitsyorkie
, discussed upgrading their GPUs for improved performance in LLM tasks, with a suggestion to look at the RTX 3060 12GB as a viable option for running LLMs locally. -
Risks and Rewards of Overclocking: In a discussion initiated by
@alastair9776
about overclocking for better performance,@rugg0064
and@crsongbirb
noted that overclocking VRAM/RAM can lead to a notable increase in token generation speed, although caution is advised due to potential hardware stress. -
Combining GPUs and Threadripper Dreams: Conversation ensued about the feasibility and costs of using multiple high-performance GPUs, with users like
@nink1
and@quickdive.
debating if a beefy CPU is necessary when having multiple powerful GPUs, and the logistics of housing such a setup. -
CUDA on AMD and Other Hardware Convos:
@666siegfried666
shared news about the ZLUDA project allowing CUDA apps to run on AMD hardware and this sparked a brief discussion on the relevance and future potential of such a feature. Users such as@addressofreturnaddress
and@joelthebuilder
also discussed their own rig setups and potential upgrades, highlighting personal preferences and value assessments.
Links mentioned:
- Doja Cat GIF - Doja Cat Star - Discover & Share GIFs: Click to view the GIF
- Brexit British GIF - Brexit British Pirate - Discover & Share GIFs: Click to view the GIF
- ATOM Echo Smart Speaker Development Kit: ATOM ECHO is a programmable smart speaker.This eps32 AIoT Development Kit has a microphone and speaker for AI voice interaction light and small. It can be access AWS, Baidu, ESPHome and Home Assistantā¦
- Unmodified NVIDIA CUDA apps can now run on AMD GPUs thanks to ZLUDA - VideoCardz.com: ZLUDA enables CUDA apps on ROCm platform, no code changes required AMD-backed ZLUDA project can now enable code written in NVIDIA CUDA to run natively on AMD hardware.Ā AMD has reportedly taken over tā¦
- Lian-Li O11 Dynamic XL ROG certificated -Black color Tempered Glass: Lian Li O11 Dynamic XL ROG certificated, Front and Left Tempered Glass, E-ATX, ATX Full Tower Gaming Computer Case - Black
LM Studio ā· #š§Ŗ-beta-releases-chat (21 messagesš„):
- Awaiting the Next Update for IQ3_XSS Support:
@n8programs
inquired about IQ3_XSS support in the latest release, to which@yagilb
responded that it will be included in the next update. - Elevation of 1bit Quantization on the Horizon:
@drawless111
shared excitement about upcoming 1.5 bit quantization, posting a GitHub pull request link indicating progress. This elicits reactions with@heyitsyorkie
anticipating a sweet next beta with the new quant sizes. - Model Benchmarking Induces Awe:
@drawless111
expressed amazement at the latest benchmarks for 1bit quantization, stating ā70B model on 16 GB card. WOOF.ā and pointing out a ā70Bā model posted on Hugging Face that can offload on VRAM effectively. - Preparations for Incompatible Model Downloads: Users, including
@epicureus
, are advised to download models like IQ3_XSS even if theyāre not supported yet, with@fabguy
humorously suggesting āSave the model, save the world!ā - Hugging Face Hub Features Multiple New Models:
@drawless111
shared an update, revealing the availability of 5 IQ1 models on Hugging Face that work with various VRAM sizes, nonchalantly noting an increase to 10 by the end of the conversation.
Links mentioned:
- Nexesenex/NousResearch_Yarn-Llama-2-70b-32k-iMat.GGUF Ā· Hugging Face: no description found
- Claire Bennet Heroes GIF - Claire Bennet Heroes Smile - Discover & Share GIFs: Click to view the GIF
- 1.5 bit quantization by ikawrakow Ā· Pull Request #5453 Ā· ggerganov/llama.cpp: This draft PR is a WIP that demonstrates 1.5 bits-per-weight (bpw) quantization. Only CUDA works, there is no implementation for the other supported back-ends. CUDA, AVX2 and ARM_NEON are implementā¦
LM Studio ā· #avx-beta (2 messages):
- No AVX2, No Cry:
@rafalsebastian
expressed concerns about not being able to run LMstudio on CPUs with only AVX (version one) after getting the message that their processor doesnāt support AVX2. They wondered if they should switch machines for running local LLMs. - LM Studio Beta for the Rescue:
@heyitsyorkie
responded with a solution, mentioning that LM Studio can indeed run on CPUs with only AVX support by downloading the 0.2.10 AVX beta release for Windows. They also recommended upgrading to a CPU with AVX2 for optimal results and provided a link to beta releases and terms of use.
Links mentioned:
LM Studio Beta Releases: no description found
LM Studio ā· #crew-ai (3 messages):
- Looking for Dual Model Deployment Tips:
@alluring_seahorse_04960
wonders how to run two models on the same machine without facing repetition errors. The user mentions using a Conda environment on Ubuntu and avoids VMs for their slowness. - Humorous Clarification Request on Repetition: In response to
@alluring_seahorse_04960
,@wolfspyre
jokes about the nature of the repetition errors, questioning whether they pertain to looping outputs or tasking issues within worker processes.
LAION ā· #general (361 messagesš„š„):
-
Magvit V2 Reproduction Inquiries:
@.lostneko
sought technical guidance for reproducing Magvit V2. Discussions circled around the ideal datasets and parameters for video compression and understanding, with@chad_in_the_house
mentioning experiments on the lfq side of Magvit2. -
Mysterious Buzz around Magvit:
@pseudoterminalx
and others in the chat noticed sudden interest in MAGVIT, speculating about a recent influencer mention given the two mentions within a short time frame. -
Stable Cascade Discussions Heat Up: Focus shifted to Stability AIās Stable Cascade model, with dialogues highlighting its hefty VRAM requirements, misleading inference time graphs, and concerns about the model being poorly optimized and full of bugs.
@pseudoterminalx
shared examples of its capabilities, including issues with text clarity in image outputs. -
Evaluating AI Models and Copyright Concerns: Conversations touched on the usage and legality of AI-generated images. Users
@vrus0188
and@kenjiqq
debated AI image model copyrights, commercial use, and the implications of research-only model licenses. -
Hardware and Performance Perspectives: A technical dialogue ensued over Stable Cascadeās heavy VRAM use and optimization problems, as
@pseudoterminalx
reported issues like inability to run models in float16 and@kenjiqq
provided details about inference time on consumer GPUs like the 3090.
Links mentioned:
- Stable Cascade - a Hugging Face Space by multimodalart: no description found
- Court Dismisses Authorsā Copyright Infringement Claims Against OpenAI * TorrentFreak: no description found
- Stable Cascade ć®ćē“¹ä» ā Stability AI Japan ā Stability AI Japan: Stable Cascade ć®ē ē©¶ćć¬ćć„ć¼ćéå§ććć¾ććććć®é©ę°ēćŖććć¹ćććē»åćøć®ć¢ćć«ćÆćåč³Ŗćęč»ę§ć微調ę“ćå¹ēę§ć®ę°ćććć³ććć¼ćÆćčØå®ćććć¼ćć¦ć§ć¢ć®éå£ćććć«ęé¤ććććØć«éē¹ćē½®ćććčå³ę·±ć3ꮵéć®ć¢ććć¼ććå°å „ćć¦ćć¾ćć
- Hey Hindi GIF - Hey Hindi Bollywood - Discover & Share GIFs: Click to view the GIF
- Donāt ask to ask, just ask: no description found
- GitHub - Stability-AI/StableCascade: Contribute to Stability-AI/StableCascade development by creating an account on GitHub.
- Crypto Kids Poster | 24posters | Hip Hop & Street Art Prints: Transform your walls with our viral new Crypto Kids Poster. Inspired by street-wear & hip hop culture, enjoy artwork designed to bring you bedroom to life. Fast shipping times (3-5 days) 10,000+ hā¦
LAION ā· #research (48 messagesš„):
-
Discussion on Impact of Adult Content on AI:
@vrus0188
and others discuss the historical contributions of adult content to advancing technology, juxtaposing it with AI developments. Some users like@twoabove
acknowledge the pattern of adult industries driving tech advancements, while others like@SegmentationFault
doubt if the focus on adult content leads to meaningful progress in AI. -
Concern Over Explicit AI-Generated Content:
@thejonasbrothers
shares a news article highlighting the misuse of AI in creating non-consensual pornography, noting the challenges it poses and its high visibility. This leads to a discussion on the broader implications and controversies surrounding AIās use in adult content. -
Observations on the Pornography Market and AI: Users like
@chad_in_the_house
and@freon
discuss the profitability and market saturation of NSFW content, contemplating the economical and ethical risks involved in this space. -
Debates Over the Merits of AI-Powered Erotic Roleplay:
@SegmentationFault
expresses frustration over the preference for low-effort erotic content in AI communities, arguing that this hinders meaningful developments in AI models. Others like@mfcool
and@.undeleted
echo these sentiments, criticizing the quality stagnation in AI-generated adult imagery. -
Technical Discussion on AI Image Quality:
@drhead
delves into technical aspects of AI-generated images, mentioning the NovelAI model and discussing the viability and impact of VAE encoder training for improved image generation. There is a communal reflection on the standards of āphotorealismā within the community and how they could be improved.
Links mentioned:
- Reddit - Dive into anything: no description found
- Reddit - Dive into anything: no description found
- AI brings deepfake pornography to the masses, as Canadian laws play catch-up: Underage Canadian high school girls are targeted using AI to create fake explicit photos that spread online. Google searches bring up multiple free websites capable of āundressingā women in ā¦
Eleuther ā· #general (179 messagesš„š„):
-
Checksums for The Pile Data Located:
@paganpegasus
provided@hailey_schoelkopf
with the checksums for The Pile zst shards, linking both the Discord pins and the EleutherAIās hashes. -
Tools to Determine Image Content:
@everlasting_gomjabbar
inquired about tools to discern if an image is of an object/location versus ānothingā like a blurry shot.@paganpegasus
described the complexity of defining ānothingā in images, while@rallio.
recommended using models like OwlViT or CLIP. -
Manuscript Review and Editing in Progress:
@wonkothesensible
, through a series of messages, provided meticulous feedback on a paper draft provisionally titled āDonāt think about the paperā, focusing on clarifying language and grammar.@hailey_schoelkopf
expressed gratitude and indicated credits to the EleutherAI Discord in the paperās acknowledgements. -
Cloud Resources for NLP Classification Discussed: In response to
@pxxxl
seeking advice on cloud resources for training NLP classification models,@ad8e
recommended GCP and Colab, with various participants chiming in about the costs and features of various platforms like runpod and vast.ai. -
Inquiries About EleutherAI Computing Resources: User
@vidava
asked about the guidelines and requirements for accessing EleutherAIās computational resources for a semi-custom LLM project featuring architectural adjustments and fine-tuning adapters.@stellaathena
indicated openness to collaboration but highlighted the need for clarity on the research agenda and proposed a collaborative value proposition. -
Semantic Scholar Paper-Author Linking Mechanism: Regarding whether Semantic Scholar automatically links Arxiv papers to authors,
_inox
clarified that the process is automatic but allows for manual intervention or suggested changes if errors occur.
Links mentioned:
- Overleaf, Online LaTeX Editor: An online LaTeX editor thatās easy to use. No installation, real-time collaboration, version control, hundreds of LaTeX templates, and more.
- Research Paper Release Checklist : no description found
- lora_example.py: lora_example.py. GitHub Gist: instantly share code, notes, and snippets.
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- Hashes ā EleutherAI: no description found
Eleuther ā· #research (208 messagesš„š„):
-
Fractal Analysis of Neural Network Hyperparameters:
@jbustter
shared visualizations of fractals generated from neural network hyperparameters, with red indicating diverging training and blue for converging. Jascha Sohl-Dicksteinās blog post showcases the concept, correlating fractal patterns with the learning rates of network layers and the networkās weight offset. -
Discussing Convergence and Divergence in Training: The conversation, involving users like
@Hawk
,@genetyx8
, and@mrgonao
, discussed the means for determining if neural network training is converging or diverging, debating the presence of ādiverging to infinityā and the nature of boundaries within fractal visualizations, with suggestions that NaNs may denote divergence. -
Active Learning and Data Presentation Order in ML:
@rybchuk
inquired about research on models choosing the order of data presentation, leading to a discussion about active learning.@thatspysaspy
mentioned the subfieldās existence, noting its lack of success, and@catboy_slim_
added that it could halve training requirements by using smaller models to filter data for larger modelsā training. -
Leveraging Unsupervised Data in Encoder-Decoder Models: The question of how to utilize large unsupervised datasets effectively in encoder-decoder models for tasks such as audio to text was brought up by
@loubb
. Suggestions and discussions ranged from training components separately to integrating cross-attention during pre-training. -
Release of an NLP Robustness Paper and Test-Time Augmentation:
@millander
announced the publication of their lead author paper on improving text classifiersā robustness through test-time augmentation (TTA) using large language models. They thanked the community for support and shared the arxiv link to their work.
Links mentioned:
- Neural network training makes beautiful fractals: This blog is intended to be a place to share ideas and results that are too weird, incomplete, or off-topic to turn into an academic paper, but that I think may be important. Let me know what you thinā¦
- A Poster for Neural Circuit Diagrams: As some of you might know, I have been working on neural circuit diagrams over the past year or so. These diagrams solve a lingering challenge in deep learning research ā clearly and accurately communā¦
- MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts: State Space Models (SSMs) have become serious contenders in the field of sequential modeling, challenging the dominance of Transformers. At the same time, Mixture of Experts (MoE) has significantly imā¦
- Scaling Laws for Fine-Grained Mixture of Experts: Mixture of Experts (MoE) models have emerged as a primary solution for reducing the computational cost of Large Language Models. In this work, we analyze their scaling properties, incorporating an expā¦
- Model Editing with Canonical Examples: We introduce model editing with canonical examples, a setting in which (1) a single learning example is provided per desired behavior, (2) evaluation is performed exclusively out-of-distribution, and ā¦
- An Exponential Learning Rate Schedule for Deep Learning: Intriguing empirical evidence exists that deep learning can work well with exoticschedules for varying the learning rate. This paper suggests that the phenomenon may be due to Batch Normalization or Bā¦
- Nonlinear computation in deep linear networks: no description found
- Feedback Loops With Language Models Drive In-Context Reward Hacking: Language models influence the external world: they query APIs that read and write to web pages, generate content that shapes human behavior, and run system commands as autonomous agents. These interacā¦
- Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks: State-space models (SSMs), such as Mamba Gu & Dao (2034), have been proposed as alternatives to Transformer networks in language modeling, by incorporating gating, convolutions, and input-dependenā¦
- Suppressing Pink Elephants with Direct Principle Feedback: Existing methods for controlling language models, such as RLHF and Constitutional AI, involve determining which LLM behaviors are desirable and training them into a language model. However, in many caā¦
- Improving Black-box Robustness with In-Context Rewriting: Machine learning models often excel on in-distribution (ID) data but struggle with unseen out-of-distribution (OOD) inputs. Most techniques for improving OOD robustness are not applicable to settings ā¦
- Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models: Visually-conditioned language models (VLMs) have seen growing adoption in applications such as visual dialogue, scene understanding, and robotic task planning; adoption that has fueled a wealth of newā¦
- Tweet from Nature Reviews Physics (@NatRevPhys): Perspective: Generative learning for nonlinear dynamics By @wgilpin0 @TexasScience https://rdcu.be/dysiB
- Tweet from Hannes StƤrk (@HannesStaerk): Diffusion models are dead - long live joint conditional flow matching! š Tomorrow @AlexanderTong7 presents his āImproving and generalizing flow-based generative models with minibatch optimal tranā¦
- A weight matrix in a neural network tries to break symmetry and fails.: We initialize a neural network so that the weight matrices can be nearly factorized as the Kronecker product of a random matrix and the matrix where all of tā¦
- Mixture of Tokens: Efficient LLMs through Cross-Example Aggregation: Despite the promise of Mixture of Experts (MoE) models in increasing parameter counts of Transformer models while maintaining training and inference costs, their application carries notable drawbacksā¦
- llm-random/research/conditional/moe_layers/expert_choice.py at ad41b940c3fbf004a1230c1686502fd3a3a79032 Ā· llm-random/llm-random: Contribute to llm-random/llm-random development by creating an account on GitHub.
- An Emulator for Fine-Tuning Large Language Models using Small Language Models: Widely used language models (LMs) are typically built by scaling up a two-stage training pipeline: a pre-training stage that uses a very large, diverse dataset of text and a fine-tuning (sometimes, &#ā¦
- MASS: Masked Sequence to Sequence Pre-training for Language Generation: Pre-training and fine-tuning, e.g., BERT, have achieved great success in language understanding by transferring knowledge from rich-resource pre-training task to the low/zero-resource downstream tasksā¦
- Meta- (out-of-context) learning in neural networks: Brown et al. (2020) famously introduced the phenomenon of in-context learning in large language models (LLMs). We establish the existence of a phenomenon we call meta-out-of-context learning (meta-OCLā¦
- Secret Collusion Among Generative AI Agents: Recent capability increases in large language models (LLMs) open up applications in which teams of communicating generative AI agents solve joint tasks. This poses privacy and security challenges concā¦
- Portal: Home of the TechBio community. Tune into our weekly reading groups (M2D2, LoGG, CARE), read community blogs, and join the discussion forum.
- Generative learning for nonlinear dynamics | Nature Reviews Physics: no description found
- Policy Improvement using Language Feedback Models: We introduce Language Feedback Models (LFMs) that identify desirable behaviour - actions that help achieve tasks specified in the instruction - for imitation learning in instruction following. To traiā¦
- To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis: Recent research has highlighted the importance of dataset size in scaling language models. However, large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality textā¦
- UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units - Meta Research: We present a novel two-pass direct S2ST architecture, UnitY, which first generates textual representations and predicts discrete acoustic units subsequently.
Eleuther ā· #interpretability-general (1 messages):
- In Search of Interpretability Guidance:
@jaimerv
reached out to the channel asking for a more current overview of approaches to interpretability than the paper they referenced on Representation Engineering. They are seeking assistance for potentially better or newer resources on the topic.
Eleuther ā· #lm-thunderdome (4 messages):
- Contributors Wanted for Hallucinations Leaderboard:
@pminervini
shared a call to action for contributions to the hallucinations leaderboard, adding that there are several new hallucination-oriented tasks to work on within the Harness leaderboard space. - Enthusiastic Response to Collaboration: Following the announcement,
@baber_
expressed interest and asked what specific help was needed. - Call for Specific Assistance: In response,
@pminervini
mentioned they need help with task definitions, proposing/adding new datasets and metrics, and assistance in determining which results to re-compute following recent updates to the harness.
Eleuther ā· #gpt-neox-dev (4 messages):
- Potential Misalignment in Pythia Deduped Data:
@pietrolesci
has raised concerns about a possible misalignment between training data batches and checkpoints specifically for the 2.8b size Pythia deduped suite. Other models, including the smaller versions and 6.9b, seem well-aligned. - Response to Data Alignment Query:
@hailey_schoelkopf
acknowledged@pietrolesci
ās query about the alignment issue and stated they will follow up on this matter. - Interest in Pythia Research and Suggestion for Publication:
@stellaathena
expressed excitement about the potential for a blog post or workshop paper demonstrating the reliability of Pythia, which they would extensively cite. - Openness to Writing About Pythia: In response to
@stellaathena
,@pietrolesci
appreciated the suggestion about creating a post regarding their findings on Pythia, considering it a good short project post-ACL deadline.
LlamaIndex ā· #announcements (2 messages):
- LlamaIndex v0.10 Released:
@jerryjliu0
announced the release of LlamaIndex v0.10, which is the most significant update to date, featuring a newllama-index-core
package and splitting integrations/templates into separate PyPi packages. Thellamahub.ai
is also being revamped, theyāve deprecated ServiceContext for better developer experience, and encourage the community to explore the blog post and documentation for detailed info on migration and contributing. - Celebrating Team Achievement: Big thanks were given to
<@334536717648265216>
and<@908844510807728140>
for leading the effort on the latest LlamaIndex update, which is a step towards making it a production-ready data framework. - Tweet about LlamaIndex v0.10 Launch: LlamaIndex shared a tweet highlighting key updates in LlamaIndex v0.10, including the creation of hundreds of separate PyPi packages, the refactoring of LlamaHub, and the deprecation of ServiceContext.
- Webinar Announcement with No-Code RAG Tutorial: Flowiseās co-founder, Henry Heng, will feature in a LlamaIndex Webinar to demonstrate building no-code Retrieve and Generate (RAG) applications using their new integration with LlamaIndex.TS. The webinar is scheduled for Friday 9am PT and interested individuals can register here.
Links mentioned:
- LlamaIndex Webinar: Build No-Code RAG Ā· Zoom Ā· Luma: Flowise is one of the leading no-code tools for building LLM-powered workflows. Instead of learning how to code in a framework / programming language, users can drag and drop the componentsā¦
- Tweet from LlamaIndex š¦ (@llama_index): š« LlamaIndex v0.10 š« - our biggest open-source release to date, and a massive step towards production-readiness. š ā Ā Create a core package, split off every integration/template into separate PyPi ā¦
- LlamaIndex v0.10: Today weāre excited to launch LlamaIndex v0.10.0. It is by far the biggest update to our Python package to date (see this gargantuan PR)ā¦
LlamaIndex ā· #blog (5 messages):
-
LlamaIndex Hits v0.10 Milestone: LlamaIndex announces its biggest open-source release, v0.10, signaling a shift towards production-readiness. A core package has been created and hundreds of integrations split off into separate PyPi packages as highlighted in their Twitter post.
-
Tutorial on Multimodal Apps with LlamaIndex:
@ollama
and LlamaIndex co-present a tutorial for building context-augmented multimodal applications on a MacBook, including smart receipt reading and product image augmentation, shared via this tweet. -
DanswerAI Enhances Enterprise with LlamaIndex: DanswerAI leverages
@llama_index
to offer ChatGPT functionalities over enterprise knowledge bases, integrating with common workplace tools such as GDrive, Slack, and Jira to boost team efficiency as announced in the Twitter announcement. -
Upcoming No-Code RAG Webinar with FlowiseAI:
@llama_index
teams up with@FlowiseAI
for a webinar on building no-code RAG (Retrieval-Augmented Generation) workflows with LlamaIndex.TS and Flowise, details in their recent tweet. -
Define Research Workflow with RAG-powered Agent: A notebook by
@quantoceanli
outlines a process to establish a scientific research workflow, harnessing LlamaIndex to operate with resources like ArXiv and Wikipedia for an innovative RAG-powered agent, showcased in this tweet.
LlamaIndex ā· #general (303 messagesš„š„):
-
LlamaIndex Import Troubles: Users like
@ddashed
,@bhrdwj
,@lhc1921
, and@cheesyfishes
discuss issues with the latest LlamaIndex update. Users were advised to start with a fresh venv or container and pointed towards a migration guide and package registry for reference. -
Complex Document Filtering Challenges: User
@_shrigmamale
sought assistance in filtering large directories of complex documents based on keywords, dates, and file types. Another user,@qingsongyao
, suggested traditional indexing techniques over expensive LLMs like GPT-4 for dynamic file filtering. -
Efficient Handling of Multiple Document Sources: Users like
@nvmm_
,@whitefang_jr
, and@.saitej
engaged in discussions about handling and merging private user-uploaded documents with public indexed documents using LlamaIndex and the potential for creating multiple agents for individual documents. -
Configuring Chunk Sizes and Testing Performance:
@sgaseretto
asked about where to specifychunk_size
now thatServiceContext
is deprecated in favor ofSettings
.@cheesyfishes
provided the new way to configure chunk size globally or by passing the node parser/text splitter into the index. -
Handling Changes with Chat Memory Buffer:
@benzen.vn
inquired about experiencing non-relevant responses when using aChatMemoryBuffer
.@whitefang_jr
suggested that off-topic conversations might degrade the relevancy of queries and pointed to parts of the LlamaIndex source code for explanation.
Links mentioned:
- Notion ā The all-in-one workspace for your notes, tasks, wikis, and databases.: A new tool that blends your everyday work apps into one. Itās the all-in-one workspace for you and your team
- Response Modes - LlamaIndex š¦ v0.10.3: no description found
- Notion ā The all-in-one workspace for your notes, tasks, wikis, and databases.: A new tool that blends your everyday work apps into one. Itās the all-in-one workspace for you and your team
- Google Colaboratory: no description found
- Build a chatbot with custom data sources, powered by LlamaIndex: Augment any LLM with your own data in 43 lines of code!
- Router Query Engine - LlamaIndex š¦ v0.10.3: no description found
- Elasticsearch Vector Store - LlamaIndex š¦ v0.10.3: no description found
- llama_index/llama-index-legacy/llama_index/legacy/vector_stores/mongodb.py at main Ā· run-llama/llama_index: LlamaIndex (formerly GPT Index) is a data framework for your LLM applications - run-llama/llama_index
- llama_index/llama-index-core/llama_index/core/chat_engine/condense_question.py at 3823389e3f91cab47b72e2cc2814826db9f98e32 Ā· run-llama/llama_index: LlamaIndex (formerly GPT Index) is a data framework for your LLM applications - run-llama/llama_index
- Usage Pattern - LlamaIndex š¦ v0.10.3: no description found
- Node Postprocessor Modules - LlamaIndex š¦ v0.10.3: no description found
- llama_index/llama-index-core/llama_index/core/indices/base.py at 5d557cb2fe48b90e4056ecae25b9371681752a3c Ā· run-llama/llama_index: LlamaIndex (formerly GPT Index) is a data framework for your LLM applications - run-llama/llama_index
- Configuring Settings - LlamaIndex š¦ v0.10.3: no description found
- Migrating from ServiceContext to Settings - LlamaIndex š¦ v0.10.3: no description found
LlamaIndex ā· #ai-discussion (1 messages):
- Super-Easy Full Stack RAG App Building Guide Released:
@kerinin
has shared an article about building a Retrieval-Augmented Generation (RAG) application using Dewy, a new open-source knowledge base. The guide entails using NextJS, OpenAI API, and Dewy to create a RAG application that improves the accuracy of language model responses by grounding them in specific, reliable information. Read the guide.
Links mentioned:
Building a RAG chatbot with NextJS, OpenAI & Dewy | Dewy: This guide will walk you through building a RAG application using NextJS for the web framework, the OpenAI API for the language model, and Dewy as your knowledge base.
HuggingFace ā· #announcements (1 messages):
-
Hugging Face Launches Message API: š Hugging Face introduces a new Message API compatible with OpenAI, enabling the use of OpenAI client libraries or third-party tools directly with Hugging Face Inference Endpoints and Text Generation Inference. Learn more from their announcement here.
-
New Open Source Releases and Features: š¤ Datatrove goes live on PyPI, Gradio updates to 4.18.0 with an improved
ChatInterface
and more, and thereās a launch of Remove Background Web for in-browser background removal. Additionally, Nanotron for 3D parallelism training and new features in Hugging Face Competitions were announced. Accelerate 0.27.0 was released, boasting a PyTorch-native pipeline-parallel inference framework. -
Product Innovations at Hugging Face: HF introduces LoRA Studio with a dedicated UI on the Hub, incorporates 2FA support, releases a Mask Generation task page, and announces the arrival of models trained with Axolotl.
-
Partnerships and Learning Resources Expansion: Hugging Face announces a partnership with Codecademy for a new free AI course on transformers and publishes a blog post about SegMoE, which enables model merging on text-to-image models.
-
Optimizing Model Performance: Thereās a technique to load pre-trained PyTorch models approximately 2x faster using Accelerate, detailed in a user guide by
@RisingSayak
.
Links mentioned:
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- Releases Ā· gradio-app/gradio: Build and share delightful machine learning apps, all in Python. š Star to support our work! - gradio-app/gradio
- Tweet from Nouamane Tazi (@Nouamanetazi): Super happy to see https://github.com/huggingface/nanotron released today! ā¤ļø Itās been a fun and insightful ride building a library for 3D parallelism training from scratch, and itās crazy tā¦
- Tweet from Zach Mueller (@TheZachMueller): Today is an extra-special release of @huggingface Accelerate! Among other features, this latest version (with collaboration from @PyTorch) integrates a PyTorch-native pipeline-parallel inference framā¦
- Tweet from Omar Sanseviero (@osanseviero): Over 300 models have been trained with axolotl and shared on the Hub! Itās also the cutest icon ever. https://huggingface.co/models?other=axolotl&sort=trending
- Tweet from Sayak Paul (@RisingSayak): Why should LLM kids have all the fun from model merging? Why not us, the diffusion kids? Friends from @_segmind open-sourced SegMoE to reduce this gap š„ Do MoE style merging on text-to-image modelā¦
- Tweet from Sayak Paul (@RisingSayak): š¤ Accelerate power-user chronicles šØāš« Here, I show you how to load a pre-trained PyTorch model ~2x faster with Accelerate. The comments in the code snippet should be self-explanatory. But if yoā¦
HuggingFace ā· #general (192 messagesš„š„):
<ul>
<li><strong>Search Engine Development Struggles</strong>: <code>@spidy___</code> discussed challenges in developing a search engine and extracting keywords with <code>@vipitis</code>, <code>@cubietom</code>, and others. The conversation explored the limitations of NER and alternatives like keyword extraction, TF-IDF, BM25, and the use of spaCy for Part of Speech tagging.</li>
<li><strong>Hosting and Inferencing Challenges</strong>: Users like <code>@sullynaj</code> and <code>@ram1428</code> enquired about hosting custom models and whether serverless inferencing is available, with pointers to server-less or affordable solutions discussed.</li>
<li><strong>Tackling Model Scale</strong>: Conversations with users like <code>@zorian_93363</code> and <code>@xacer_</code> revolved around the feasibility and usefulness of running very large models (100B+ parameters) on typical "open source enthusiast" hardware.</li>
<li><strong>Valentine's Day Vibes</strong>: <code>@not_lain</code> spread love and joy on Valentine's Day, encouraging the community to hug their loved ones.</li>
<li><strong>Discussion on Running Models Locally</strong>: <code>@aj_0003</code> asked about running machine learning models locally while <code>@pierrunoyt</code> discussed using Hugging Face to clone and run a model.</li>
</ul>
Links mentioned:
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- Stable Cascade - a Hugging Face Space by multimodalart: no description found
- Custom architectures with HuggingFace š¤: no description found
- lamm-mit/x-lora Ā· Hugging Face: no description found
- jinaai/jina-embeddings-v2-base-code Ā· Hugging Face: no description found
- Norm/nougat-latex-base Ā· Hugging Face: no description found
- NVIDIA Chat With RTX: Your Personalized AI Chatbot.
- Models - Hugging Face: no description found
- Hugtrip GIF - Hugtrip - Discover & Share GIFs: Click to view the GIF
- Hands-on - Hugging Face Deep RL Course: no description found
- Linguistic Features Ā· spaCy Usage Documentation: spaCy is a free open-source library for Natural Language Processing in Python. It features NER, POS tagging, dependency parsing, word vectors and more.
- Linguistic Features Ā· spaCy Usage Documentation: spaCy is a free open-source library for Natural Language Processing in Python. It features NER, POS tagging, dependency parsing, word vectors and more.
- Hugging Face status ): no description found
HuggingFace ā· #today-im-learning (9 messagesš„):
-
Simple Chatbot Development Blueprint:
@wilbert.comicho
is looking to create a simple chatbot to gather five specific details from a user and send them via email. They are seeking a template to handle database querying, user prompting/saving data and calling an API for email sending. -
AutoGen as a Starting Point:
@dwb7737
suggested using Microsoftās AutoGen for chatbot development and pointed to GitHub for detailed use cases and Jupyter Notebooks. Additionally, highlighted that OpenAI is preferable to open-source LLMs when utilizing AutoGen. -
Starting Small with AutoGen Studio: In a follow-up,
@dwb7737
recommends getting to grips with the basics before diving into AutoGen Studio due to possible behavioral discrepancies and bugs, advocating for an understanding of the underlying processes. They provided a link to AutoGen Studio samples. -
wilbert.comicho: Confirms they will be checking out the recommended resources.
-
Video Guide for Ollama Models:
@dwb7737
shared a YouTube video as an excellent resource for learning how to use Ollama open source models in conjunction with LangChain and Autogen. -
Google Sheets Merge Pitfalls:
@lunarflu
is engaged in merging two Google Sheets and cautions the importance of handling duplicate records and maintaining unique records to prevent issues. -
Creating Transformers with FP8:
@neuralink
has progressed to mastering 99% of doremi reproduction and have advanced their training with end-to-end FP8 in 3D parallelism. -
Switching from AI to Academia:
@sardarkhan_
shares their shift from reading about diffusors and transformers to focusing on their upcoming mid-semester exams.
Links mentioned:
- autogen/samples/apps/autogen-studio at main Ā· microsoft/autogen: Enable Next-Gen Large Language Model Applications. Join our Discord: https://discord.gg/pAbnFJrkgZ - microsoft/autogen
- autogen/notebook at main Ā· microsoft/autogen: Enable Next-Gen Large Language Model Applications. Join our Discord: https://discord.gg/pAbnFJrkgZ - microsoft/autogen
- Ollama - Libraries, Vision and Updates: Ollama Libraries: https://ollama.com/blog/python-javascript-librariesOllama Vision models: https://ollama.com/blog/vision-modelsOllama OpenAI API: https://olā¦
HuggingFace ā· #cool-finds (8 messagesš„):
- Back to ML After a Hiatus:
@charlweed
is diving back into Machine Learning by working on a GIMP 3.0 plugin that connects to Automatic1111, currently facing challenges with posting image data for Image2Image functionality via API. - Digging into the Dirt for Energy:
@Gordo Stoli
shared a research study on Soil Battery, a potential advancement in energy technology. - MoE Security Vulnerabilities Exposed:
@osanseviero
introduced a paper demonstrating how Mixture of Experts (MoE) models are susceptible to adversarial attacks affecting the outputs of benign queries. - Understanding MoE Risks and Mitigations:
@osanseviero
also wrote detailed notes on potential mitigation strategies for the vulnerabilities described in the DeepMind paper, suggesting batch order randomization among other methods, available here. - Questions about MoEās Future Stability:
@meatfucker
highlighted the potential future threat of the reported MoE attack strategy and considered the implications for systems using large batches, which may inadvertently affect output quality.
Links mentioned:
- [@osanseviero on Hugging Face: āMixture of experts: beware š”ļøāļø
New paper by DeepMind: Buffer Overflow inā¦ā](https://huggingface.co/posts/osanseviero/980907000007376): no description found
- Paper page - Buffer Overflow in Mixture of Experts: no description found
HuggingFace ā· #i-made-this (10 messagesš„):
- Quiz Generation Anticipation:
@lunarflu
suggested the addition of a loading screen or bar for the quiz generation process, mentioning an issue with just waiting for the quiz to appear without any indication. - Automated Image Tagging Model Deployed:
@not_lain
announced an automated model for tagging images pertinent to diffusion tasks and gave instructions for use, along with a link to their discussion. They also mentioned the modelās implementation improvements inrefs/pr/2
. - Model Supports Various Image Formats:
@not_lain
highlighted that their tagging model accepts input as a string (path), a PIL image, or a numpy array, showcasing flexibility in handling images. - AI for Anime Data Set:
@not_lain
expressed intentions to use their image tagging model to annotate an anime dataset, while@imcoza1915
commented on the coolness of the tool. - New āRemixā Mode for Image Transformation:
@matthieulc
shared an update to PanoramAI.xyz, introducing a āremixā mode withControlNet
technology for better structure preservation in image transformations. Users are reminded they can navigate the tool using arrow keys. - From Sketch to Fashion with AI:
@tony_assi
unveiled a project Sketch to Fashion Design with great pride, which has received positive feedback as an AI able to understand designs, as@chad_in_the_house
implied.
Links mentioned:
- panoramai: whatās in your world?
- Sketch To Fashion Design - a Hugging Face Space by tonyassi: no description found
- p1atdev/siglip-tagger-test-3 Ā· Upload folder using huggingface_hub: no description found
HuggingFace ā· #reading-group (32 messagesš„):
- S4 Architecture Gets Annotated:
@ericauld
shared a resource on āThe Annotated S4ā asking for feedback and pointing out its usefulness for understanding the S4 architecture, which excels in modeling very long-range sequence tasks. They indicated that reading it may help clarify the model before their upcoming talk on Mamba/S4. - Seeking Clarity on S4 Implementation:
@austintb.
expressed desire for clarification on the S4 architectureās implementation and computational complexity details.@chad_in_the_house
echoed the sentiment, requesting intuitive explanation of concepts and prior work such as the hippo codebase, later suggesting a focus on intuition and coding for ericauldās main talk. - Mamba/S4 Talk Schedule and Content Preferences:
@ericauld
proposed scheduling the Mamba/S4 talk for Friday at 10am California time and suggested potential content for the primary and secondary (math-focused) sessions based on community feedback. - LangTest Paper Makes Its Debut:
@prikfy
announced the publication of their LangTest paper in the Software Impacts journal, a tool for testing and augmenting NLP models. The paper and the GitHub repository for LangTest were shared, with@ryzxl
contributing further context on its comprehensive testing capabilities and how to get started using the library.
Links mentioned:
- Structured State Space Models for Deep Sequence Modeling (Albert Gu, CMU): Date: May 26, 2023(Sorry that the first 2 slides are not recorded, those are motivation slides though.)Abstract: This talk will cover recent deep neural netwā¦
- The Annotated S4: no description found
- GitHub - JohnSnowLabs/langtest: Deliver safe & effective language models: Deliver safe & effective language models. Contribute to JohnSnowLabs/langtest development by creating an account on GitHub.
- LangTest | Deliver Safe & Effective Models | John Snow Labs: no description found
HuggingFace ā· #diffusion-discussions (10 messagesš„):
-
Multi-GPU Training Inquiry: George is looking for advice on adapting the
train_text_to_image.py
script for multi-GPU usage, mentioning previous experience withnn.DataParallel
. -
Deployment Options for finetuned models:
@lokendra_71926
finetuned the mistarl_7b_gptq model and is seeking recommendations for a library or platform suitable for fast inference deployment. -
Success with Stable Cascade:
@isidentical
asked if anyone achieved good text generation with stable cascade, similar to the examples in the readme and confirmed getting 50% success on arbitrary words with the right prompting strategy. -
HuggingFaceās Inference Engine Suggestion:
@chad_in_the_house
suggested that HuggingFace has an inference engine that could potentially serve for llms deployment and also mentioned that the discussion might be more appropriate in another channel. -
Terminus Model Anticipation:
@pseudoterminalx
teased that a new terminus model is still in the development phase.
HuggingFace ā· #computer-vision (7 messages):
-
Hierarchical Image Classification Challenge:
@cropinky
described the issue of hierarchical image classification and advised that the complication level depends on the quality and amount of data. They suggested checking out an ECCV22 paper and related datasets on paperswithcode for further research. -
In Search of Gaussian Splats:
@aeros93
inquired about resources or pre-trained models for creating Gaussian splats from point clouds or images. No specific resources were provided, but@johko990
redirected the query to another channel that could potentially help. -
Quest for Multimodal Project Insights:
@joee2711
is working on a multimodal project and sought clarification on the difference between Q-former / MLP connector and if MLP connectors and adapters are the same. They also expressed an interest in connecting with others working on similar projects. -
Enhancing Image Retrieval Systems: User
@femiloye
is developing an image retrieval system akin to person reidentification and is looking for methods to improve match accuracy beyond using model embeddings. They are currently utilizing a custom deit transformer trained with reid loss for this purpose.
HuggingFace ā· #NLP (4 messages):
-
Fine-tuning Mistral for Deployment:
@lokendra_71926
fine-tuned mistarl_7b_gptq model on custom data and is seeking recommendations for a library or platform for deployment to achieve faster inference. -
Language Identification with XLM-R:
@_michaelsh
inquired about how to extract the language from xlmr after reading a HuggingFace post which explains that XLM-RoBERTa does not require language tensors to understand the language being used. -
From Natural Language to Algebraic Representations:
@_david_valente_
is looking for research or work that has focused on translating natural language into algebraic representations such as LEAN. -
Voice Simulation and Language Transformation with Transformers:
@mentrass
asked about methods to simulate oneās voice and alter the language using transformer models.
HuggingFace ā· #diffusion-discussions (10 messagesš„):
- Multi-GPU Adaptation Inquiry:
@George
is looking for an easy way to adapt thetrain_text_to_image.py
script for multi-GPU usage, noting past experience withnn.DataParallel
. - Deployment Platform for finetuned model:
@lokendra_71926
has finetuned the mistarl_7b_gptq model and is inquiring about a library or platform for fast inference deployment.@chad_in_the_house
suggests looking at Hugging Face inference engine for LLMs. - Text Generation with Stable Cascade:
@isidentical
questions whether anyone has been able to achieve text generation with stable cascade as showcased in the modelās readme, later confirming a 50% success rate with good prompting. - Inference Optimization Discussion Redirected:
@chad_in_the_house
points out that discussions regarding inference optimization should move to a different channel titled<#1019883044724822016>
. - Anticipation for New Terminus Model:
@pseudoterminalx
indicates that a new terminus model is currently being developed.
Nous Research AI ā· #ctx-length-research (3 messages):
-
DAMO-NLP-SG Releases Vast Long-Context Dataset:
@giftedgummybee
shared the LongCorpus-2.5B dataset which contains 2.5B tokens collected from various domains for long-context continual pre-training. The datasetās composition is inspired by Long-Data-Collections, and its selection criteria ensures a low n-gram similarity with the training set to exclude QA and Summarization data. -
Scaling Models with āropeā vs āself-extendā:
@blackl1ght
highlighted that scaling models with āself-extendā can preserve coherence better than ārope scalingā, even at larger scaling factors, referring to the implementation in llama.cpp. -
Ease of āself-extendā Implementation:
@blackl1ght
noted the benefits of āself-extendā including no need for setup, fine-tuning, or extra parameters like those required in the āgguf configurationsā for quants.
Links mentioned:
DAMO-NLP-SG/LongCorpus-2.5B Ā· Datasets at Hugging Face: no description found
Nous Research AI ā· #off-topic (8 messagesš„):
-
Discussing LangGraph Agentsā Perseverance:
@pradeep1148
shared a YouTube video titled āLangGraph Agents Persistence,ā highlighting that LangGraph agents can be set up to retain their state across interactions. -
Geminiās Resistance Frustrates Users:
@llmaniac1000
expressed disappointment with Geminiās frequent refusal tendencies, seeking othersā experiences with it.@n8programs
chimed in, stating itās not amazing and implying GPT-4 outperforms Gemini. -
Mark Zuckerbergās Image Transformation:
@nonameusr
shared a Twitter post suggesting that Zuckerberg has transitioned from villain to savior in the context of AI and VR. -
A Touch of Humor with GIFs:
@error.pdf
reacted to previous discussions using humor by sharing a GIF from Tenor, without providing further commentary or context.
Links mentioned:
- Rock Cat Eyebrow Cat GIF - Rock cat Eyebrow cat Meme - Discover & Share GIFs: Click to view the GIF
- LangGraph Agents Persistence: When creating LangGraph agents, you can also set them up so that they persist their state. This allows you to do things like interact with an agent multiple ā¦
Nous Research AI ā· #interesting-links (17 messagesš„):
-
Mesmerizing Mandelbrot Beauty Shared:
@gabriel_syme
posted a stunning visualization of the Mandelbrot set.@_3sphere
added that the setās focus on divergence contributes to its sense of complexity and order. -
Crowdsourcing AI with āMarvā Chatbot:
@.dvs13
praised a crowdsourcing project and noted ambiguity in the term āprompt.ā The project involves a chatbot named Marv, which answers questions with sarcasm. -
Reka Introduces Multi-Modal AI Models:
@metaldragon01
highlighted the launch of Reka Flash, a 21B fast multimodal language model, alongside its smaller counterpart Reka Edge. Reka Flash boasts competitive performance to major models like Gemini Pro and GPT-3.5 and is available in public beta. -
Pursuing CUDA Compatibility with AMD:
@leontello
shared a GitHub project, ZLUDA, which aims to run CUDA on AMD GPUs. Unfortunately, the project is no longer actively pursued as detailed by@adjectiveallison
, who quoted the projectās lead expressing itās effectively abandoned. -
Wavelets Meets Transformers in AI Research: An arXiv paper shared by
@euclaise
suggests that wavelet transforms could enhance Transformers by capturing both positional and frequency information with linear complexity. The paper details Wavelet Space Attention (WavSpA) and has been tested on the Long Range Arena. Find the paper here.
Links mentioned:
- @dvilasuero on Hugging Face: āš¤ Data is better together!Data is essential for training good AI systems.ā¦ā: no description found
- Reka Flash: An Efficient and Capable Multimodal Language Model - Reka AI: Reka Flash is a state-of-the-art 21B model trained entirely from scratch and pushed to its absolute limits. It serves as the āturbo-classā offering in our lineup of models.
- WavSpA: Wavelet Space Attention for Boosting Transformersā Long Sequence Learning Ability: Transformer and its variants are fundamental neural architectures in deep learning. Recent works show that learning attention in the Fourier space can improve the long sequence learning capability of ā¦
- GitHub - vosen/ZLUDA: CUDA on AMD GPUs: CUDA on AMD GPUs. Contribute to vosen/ZLUDA development by creating an account on GitHub.
- GitHub - acorn-io/rubra: AI Assistants, LLMs and tools made easy: AI Assistants, LLMs and tools made easy. Contribute to acorn-io/rubra development by creating an account on GitHub.
Nous Research AI ā· #general (180 messagesš„š„):
- New Model Training Begins:
@n8programs
excitedly shares the start of training a new model, mentioning terms like dachshund, neuralbeagle-dpo, and expressing the process as randomly throwing stuff together genetic algorithm-style. - Playful Banter About Model Merging:
@teknium
humorously notes the metaphorical alignment between dog breeds and model merging, while@leontello
likens the mixing methods to evolutionary strategies, and@n8programs
reports a horrifying outcome of his merging experiment. - Typo Alert in Model Card:
@everyoneisgross
reports a typo in Hugging Faceās model card for 70B llama, which was swiftly corrected by@teknium
, leading to expressions of congratulations on the model launch. - Quantization Quest: Discussion about post-training quantization methods, with
@stellaathena
sharing a link to a new quantization method, and@nruaif
jokingly looking forward to even lower bit-precision. - AI Activation Additions: A deep dive into activation hacking is mentioned, with
@filipvv
referencing an external article and@mihai4256
discussing their plans to refine their approach, while@proprietary
voices interest in the work.
Links mentioned:
- QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks: Post-training quantization (PTQ) reduces the memory footprint of LLMs by quantizing their weights to low-precision. In this work, we introduce QuIP#, a weight-only PTQ method that achieves state-of-thā¦
- Tweet from jf (@fejo_11): Mixtral 8x7B: Routing Analysis based on POS tags I conducted a routing analysis using @MistralAIās Mixtral 8x7B model, focusing on Part-of-Speech (POS) tags, diverging from the original methodoloā¦
- NousResearch/Nous-Hermes-2-Llama-2-70B Ā· Hugging Face: no description found
- Representation Engineering Mistral-7B an Acid Trip: no description found
- OpenAI Researcher Andrej Karpathy Departs: Andrej Karpathy, one of the founding members of OpenAI, has left the company, a spokesperson confirmed. Karpathy, a prominent artificial intelligence researcher, was developing a product he has descriā¦
- Xigmoid: An Approach to Improve the Gating Mechanism of RNN: This work proposes an innovative approach for the gating mechanism of RNN class models. A transfer function is embedded into the original sigmoid to form a new gate function called xigmoid. The purposā¦
- [missing post]: no description found
- Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning: The recent surge of generative AI has been fueled by the generative power of diffusion probabilistic models and the scalable capabilities of large language models. Despite their potential, it remains ā¦
- āPractical AI: Machine Learning, Data Science on Apple Podcasts: āTechnology Ā· 2024
- Steering GPT-2-XL by adding an activation vector: Summary: We demonstrate a new scalable way of interacting with language models: adding certain activation vectors into forward passes.[2] Essentially, we add together combinations of forward passes inā¦
Nous Research AI ā· #ask-about-llms (38 messagesš„):
-
DeepSeekMath Merges into the Conversation: User
@yxzwayne
inquired about the integration of newly introduced deepseekMath in merging strategies, indicating interest in its application. -
Finetuning for Dummies Guide Discovered:
@nemoia
was searching for straightforward instructions on how to finetune Mistral 7B and create their own datasets and later shared a helpful Medium guide that provides detailed examples and explanations on the process. -
Forced FA2 Line Causes Memory Issues: In response to a question about FA2 not being enabled,
@bloc97
clarified that the problem was related to an attempt to create a largeattn_weights
matrix, indicating the line of code causing memory issues can be seen here. -
Secondary Options for Coding Models:
@natefyi_30842
was looking for a less expensive alternative to GPT-4 for a coding model, and@teknium
suggested trying out the deepseek coder, which is hosted by ātogether.ā -
MIQU Modelās Pretraining and SFT Clarified:
@teknium
explained to@yxzwayne
that the MIQU model was first pretrained on the Llama-2 70b and then underwent SFT (Supervised Fine Tuning), focusing specifically on instruction-focused data.
Links mentioned:
- Tweet from AMD Quietly Funded A Drop-In CUDA Implementation Built On ROCm: Itās Now Open-Source - Phoronix: no description found
- Finetuning Llama 2 and Mistral: A beginnerās guide to finetuning LLMs with QLoRA
- Training a causal language model from scratch - Hugging Face NLP Course: no description found
- modeling_mistral_yarn.py Ā· NousResearch/Yarn-Mistral-7b-128k at main: no description found
Nous Research AI ā· #collective-cognition (3 messages):
- Project Downfall Due to Chat GPT Update:
@adjectiveallison
inquired if a project was still active after encountering issues accessing the site.@teknium
responded, clarifying that the website broke due to the new Chat GPT update with various modes, leading to the original team being unable to maintain it. - Sympathies for the Broken Project:
@adjectiveallison
expressed disappointment upon learning that the project was no longer maintained following the complications with the new Chat GPT update.
Mistral ā· #general (43 messagesš„):
- Model Selection Advice for Beginners: Newcomer
@nana.wav
inquired about the best models to use, and@afriendofmaurice
recommended instruct models for chat-GPT-like interactions.@mrdragonfox
clarified that instruct models are more focused on instruction following, whereas others are akin to raw autocomplete. - Integration with Visualization Libraries:
@carnivore5
asked if anyone had experience integrating Mistral functionalities with GraphViz or similar visualization libraries, leading to a clarification by@mrdragonfox
about Mistralās lack of inherent function-calling ability. - Chat vs. Completion Endpoints:
@i_am_dom
and@mrdragonfox
discussed the difference between Mistralās/chat/completion
and a wished-for raw/completion
endpoint, with most usage currently gravitating towards the chat endpoint. - Internship Struggles with Mistral:
@nana.wav
shared struggles with learning how to use downloaded models and intentions to fine-tune them, leading@mrdragonfox
to advise starting with simpler steps. The conversation included sympathy and reminiscence from others, highlighting the common intern experience with overwhelming tasks. - Mistral API Latency Issues:
@justinmann.
reported inconsistent latencies when using the Mistral API, with response times varying drastically from under a second to over a minute.@sublimatorniq
suggested contacting support for assistance.
Mistral ā· #models (20 messagesš„):
- RAG Guide for Mistral:
@ethux
shared a helpful guide explaining how Mistral works with RAG (Retrieval-Augmented Generation), including steps on retrieval and generation with examples from Mistral, LangChain, and LlamaIndex. - Debate on LangChain vs. LlamaIndex:
@sublimatorniq
sparked a discussion on the effectiveness of LangChain vs. LlamaIndex, with@rabdullin
expressing skepticism about their use in serious LLM-driven products. - DSPy Advocacy:
@mrdragonfox
advocated for DSPy as a powerful framework, citing that it uses LLM as a ādeviceā and not a āchatā interface and linked to a Twitter post exemplifying its strength. - Mistral-7b Training Dataset Inquiry:
@kushagra_67246
inquired about the datasets on which Mistral-7b is trained, receiving humorous and vague responses indicating a mixed variety of internet sources ā from@tom_lrd
describing it as āTop secret magic soupā to@gamerboi0129
listing textbooks and Wikipedia among other comprehensive sources. - Clarification on Raw Pretraining Checkpoint:
@nofreewill42
asked for an open-sourced checkpoint of the Mistral model right after raw text pretraining, expressing thatmistralai/Mistral-7B-v0.1
seemed too interactive to be raw.
Links mentioned:
- Basic RAG | Mistral AI Large Language Models: Retrieval-augmented generation (RAG) is an AI framework that synergizes the capabilities of LLMs and information retrieval systems. Itās useful to answer questions or generate content leveraging ā¦
- GitHub - stanfordnlp/dspy: DSPy: The framework for programmingānot promptingāfoundation models: DSPy: The framework for programmingānot promptingāfoundation models - stanfordnlp/dspy
Mistral ā· #deployment (46 messagesš„):
- Docker Deployment Recommendations:
@rusenask
suggested checking out the ollama or vllm projects for APIs that can be run through Docker for different use cases. - Quota Troubles in the Cloud:
@gridinoc
experienced difficulties deploying Mixtral with SkyPilot as AWS, Google Cloud, and Azure either denied quota increases or did not respond to requests. - Alternatives to Self-Hosting:
@mrdragonfox
discussed options for deployment, suggesting cheaper API offerings such as direct mistral or together.ai, despite the current GPU shortages and quota issues faced by@gridinoc
. - AWQ Quantization Hitches with MoE: Multiple users, including
@mrdragonfox
and@casper_ai
, discussed issues with the AWQ quantization method and Mixtral models, with@casper_ai
recommending an alternative working repository hosted on Hugging Face. - Success with HuggingFace Deployment:
@ethux
pointed to an instance of Mixtral deployed on HuggingFace.co/chat, offering an alternate route to those facing cloud service barriers.
Links mentioned:
- Deploy with SkyPilot | Mistral AI Large Language Models: SkyPilot is a framework for running LLMs, AI, and batch jobs on any cloud, offering maximum cost savings, highest GPU availability, and managed execution.
- casperhansen/mixtral-instruct-awq Ā· Hugging Face: no description found
- TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ Ā· Hugging Face: no description found
- HuggingChat: Making the communityās best AI chat models available to everyone.
- TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ Ā· always getting 0 in output: no description found
Mistral ā· #finetuning (76 messagesš„š„):
- Fine-Tuning vs RAG Explained: Users in the channel debated the merits of fine-tuning versus using Retrieval-Augmented Generation (RAG), with
@rabdullin
advising to focus on prompt engineering and@mrdragonfox
highlighting the importance of base knowledge in a Large Language Model (LLM) when using RAG.@tom_lrd
and@mrdragonfox
outlined that RAG acts as middleware to provide relevant context for the LLM and has its own complex underlying processes. - Onboarding the New AI Enthusiast: In response to
@1mbc
seeking resources for understanding AI core concepts,@mrdragonfox
and@tom_lrd
provided insights into how RAG and GPTs work and suggested platforms like Medium for further learning. No specific resources were linked. - Chatbot Integration Strategies Shared: The conversation delved into the technicalities of feeding an LLM with personalized data, with
@mrdragonfox
and@tom_lrd
describing how data can be pre-processed and turned into a structured format that enriches the LLMās output, specially when using an LLM as a middleware to process user input. - Clarifying Misconceptions on LLM Data Storage:
@mrdragonfox
corrected some misconceptions about how an LLM ālearnsā from new data, such as the functions of GPTs and the significant complexity behind embedding and search before data becomes usable context for an LLM. - Prompt Versioning Tools Inquiry:
@khandelwaal.ankit
inquired about tools for prompt versioning during fine-tuning experiments, noting a lack of support for Mistral models in some existing tools like PromptLayer; however, no solutions were specifically endorsed or detailed in the discussion.
Mistral ā· #showcase (2 messages):
- Limits on Code Modification:
@ethux
expressed skepticism about the possibility of making a certain change, suggesting that it might not be possible without altering some code.
Mistral ā· #random (15 messagesš„):
- French Librarian Seeks Internship Opportunities for Student: User
@maeelk
, a French librarian, is promoting AI use and looking for an internship for a student studying psychology and AI, referring to the Masterās program at the University of Savoie Mont Blanc. Interested parties can reach out for collaboration via[email protected]
. - Mistralās Fan Quiz: User
@akshay_1
challenges@maeelk
ās Mistral fandom by asking them to list the weights of the 7b model. Another user,@ethux
, responds humorously, implying the difficulty of listing such technical details. - Building Audio-Inclusive S2S Models on a Shoestring Budget:
@akshay_1
shares a clientās request to build an S2S model with a persona, fine-tuned with an audio dataset on a budget of $1,000. Several users, like@ethux
and@mrdragonfox
, react to the insufficient budget, implying that much more would be required. - The Price of Innovation:
@skadeskoten
inquires about the competitive budget for creating a specialized S2S model, to which@mrdragonfox
responds that the cost greatly depends on the extent of architecture needed.
Links mentioned:
Ergonomie socio-cognitive des systèmes intelligents - Classique et alternance - Ametys Campus - Université Savoie Mont Blanc: no description found
Mistral ā· #la-plateforme (2 messages):
- API Key Confusion for TypingMind:
@ingohamm
reported issues with using the API key for TypingMind, despite having a subscription and payment method in place. He mentioned that trying after a wait or deleting the API key prompted a message about no active subscription, and questioned the status of his account or subscription. - Seek Support from Mistral: In response to the issue,
@sublimatorniq
suggested that@ingohamm
reach out to [email protected] for assistance with his API key and subscription concerns.
Perplexity AI ā· #general (149 messagesš„š„):
- Seeking Support for Company Data Issues: User
@kitsuiwebster
expressed that sending emails for support got no response and preferred not to disclose the data-related issue publicly. Instead, they wished to contact directly for help with a company-related problem. - Debating the Merits of Perplexity vs. Phind: User
@ludwig_von_mises_fan
opened a discussion about the effectiveness of Phind over Perplexity for coding and general search, while@gooddawg10
and@brknclock1215
defended Perplexityās search capabilities, with no preference for coding. - Experiencing Technical Difficulties with Perplexity: Users
@yellephen
,@luke_____________
, and@chenlieong
reported issues with the Perplexity chatbot, such as endless loading for answers and service unavailability;@dima_shliugaev
from the team acknowledged the issue and it was confirmed to be back online by@vova_at_pplx_ai
. - Model Performance and Usage Discussions: Users shared their experiences with different AI models for tasks such as code debugging (
@matheusgnhr
), tic-tac-toe (@noremac258
), and PDF reading (@reader7904
); queries regarding specific model details (@hzpd
and@unknownuser787
) and API usage (@pilotgfx
) were also seen. - Subscription Details and Model Information Inquiry: Users
@stocktown
and@ewaathescientist
sought clarification on trial subscriptions and the renewal of Pro subscriptions, while@voidfulness
inquired about token refresh rates and was informed by@me.lk
that tokens refresh 24 hours after use.
Links mentioned:
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- āWhat Gemini Apps can do and other frequently asked questions: Learn what Gemini can do, how it works, and different ways to get access to it.
- More than an OpenAI Wrapper: Perplexity Pivots to Open Source: Perplexity CEO Aravind Srinivas is a big Larry Page fan. However, he thinks heās found a way to compete not only with Google search, but with OpenAIās GPT too.
- What is a Google dork query and how to protect yourself?: A Google dork query is a search string using advanced search operators. See how hackers get website data not readily available with it and how to protect from it.
- Coupons and discounts: Explore Perplexityās blog for articles, announcements, product updates, and tips to optimize your experience. Stay informed and make the most of Perplexity.
- Introducing PPLX Online LLMs : The first-of-its-kind Online LLM API
Perplexity AI ā· #sharing (13 messagesš„):
- Perplexity AI Tackles Tough Test Question:
@tbrams
was impressed with how quickly Perplexity AI handled a complex question from the āGeminiā paper, a task that Googleās Gemini service and OpenAI took longer to address. Details of this successful test are available on the Perplexity AI platform. - Community Contributions and Creations:
@twodogseeds
gave a shoutout to Perplexity for the pplx shortcut action, which supports their Farm Friend research agent. No further details were shared in the message. - Exploring Diverse Perspectives with Bryan Johnson:
@ok.alex
shared a link to a summary of Bryan Johnsonās perspectives via Perplexity AI, while@brknclock1215
offered an alternative angle for scientific summarization. Links to these summaries are found at Bryan Johnson Summary and Scientific Summary respectively. - Engage with the Alt-D-Feed:
@ok.alex
invited the community to contribute to an alternative feed/newsletter, suggesting it as a collaborative project to curate together. Interested individuals can like and share this initiative. - Summarizing Documents in Seconds!:
@aykbl
expressed enthusiasm for Perplexity AIās capability to summarize documents swiftly, emphasizing its speed with a smiley face. The content linked or specificity of documents was not mentioned.
Links mentioned:
Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
Perplexity AI ā· #pplx-api (24 messagesš„):
- Custom Search Queries Available: User
@me.lk
clarified that by using search parameters such as"site:reddit.com OR site:youtube.com"
in prompts, one can specify multiple content sources when using the API. - Performance Issues with Online API:
@andrewgazelka
reported performance problems withpplx-70b-online
, but noted that removing the system message in the code seemed to fix the issue. - PPLX API Fails with Nonsensical Responses:
@myadmingushwork_52332
raised a concern with the API returning random and nonsensical replies involving a mix of numbers and characters when online searching is required. - Reference Provision Under Development:
@dvrshil
expressed a desire for Perplexity to provide references in API responses, to which@mares1317
responded, stating that the development team is working on this feature. - No Early Access Program Yet:
@icelavaman
indicated that early access to new Perplexity features is not available at this moment; announcements for new features will come at a later date.
Links mentioned:
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- Pricing: no description found
OpenAI ā· #annnouncements (1 messages):
-
ChatGPT Gets a Memory Boost:
@abdubs
announced a new memory feature for ChatGPT which allows it to remember user preferences and details across conversations, thereby enhancing future interactions. This feature is rolling out to select Free and Plus users, with control options available at ChatGPT Memory Features. -
Youāre the Boss of ChatGPTās Memory: Users have the power to tell ChatGPT what to remember, ask it to recall information, and instruct it to forget things conversationally or through settings. The memory feature can also be turned off completely if preferred.
-
Memory Feature Rolling Out Gradually: OpenAI is currently deploying the memory upgrade to a limited user base and plans to gather feedback to gauge its usefulness. Further announcements regarding a broader rollout will be made soon.
Links mentioned:
Memory and new controls for ChatGPT: Weāre testing the ability for ChatGPT to remember things you discuss to make future chats more helpful. Youāre in control of ChatGPTās memory.
OpenAI ā· #ai-discussions (83 messagesš„š„):
-
Discovering the Secrets of SEO:
@spidy___
sought insights on how to autonomously tag webpages with relevant keywords like web crawlers do, finding limitations in NER for keyword extraction.@light.grey.labs
advised examining SEO files, as web builders often embed a variety of keywords into these for search relevance. -
Seeking Creative Minds for AI Research:
@noodles7584
, a UK researcher, invited community members to discuss how AI is used in creative processes, offering compensation for the 30-minute discussions. -
The Quest for the Ultimate Chatbot: Chat explored the challenges with current chatbots, including the inability of GPT models to meet all individual needs, voiced by
@jaicraft
.@lumirix
and others discussed workarounds, like combining bots or leveraging chatbot integrations with services like Google Docs. -
ChatGPT Accused of Laziness:
@pigondrugs
and others commented on GPTās difficulty retaining context, with growing complaints after context capacity increased. In contrast,@drinkoblog.weebly.com
argued that higher context limits reduce perplexity, leading to better performance. -
AI Model Rivalry Heats Up:
@cassofthenight
spotlighted Abacus.AIās Smaug-72B model outperforming GPT-3.5 and expressed concerns over ChatGPT-4ās reluctance to produce complete code snippets, suggesting that the AI dodges detailed scripting in favor of pseudo code.
Links mentioned:
Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
OpenAI ā· #gpt-4-discussions (49 messagesš„):
-
GPT 4 Turbo Cost Queries Clarified:
@jeremy.o
directed@koukyo
to the OpenAI pricing page for GPT 4 cost details and noted that GPT 4 Turbo is the top/cheapest model, costing 2 cents less than other versions, with similar or slightly worse quality depending on use. -
GPT 4 Sometimes Slacks Off?:
@rodney.leonardo
reported a decrease in GPT 4ās intelligence in basic tasks, like summarizing a PDF. Community members including@blckreaper
confirmed observations of performance issues, and discussions on the topic are collected in a separate channel: <#1047565374645870743>. -
Still Waiting for @mentions: Users including
@pax0086
and@ancryp
discussed the gradual rollout of the @mention feature in GPT, with@darkninjaforever
reminding that OpenAI often does gradual feature rollouts, indicating some users are still awaiting access. -
Trying to Push the Boundaries of GPTās Vision:
@flokyhuan
inquired about using videos for fine-tuning language models and was informed by@solbus
that fine-tuning is currently only available for text models, and while the GPT vision feature can describe images from a video, it canāt be fine-tuned for specific knowledge like sports rules. -
ChatGPT Memory Feature Rollout Progresses:
@lumirix
confirmed that the ChatGPTās feature for remembering past conversation details is being rolled out to both free and Plus users but noted that itās only available to a small portion of users at this time.
Links mentioned:
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
- Processing and narrating a video with GPTās visual capabilities and the TTS API | OpenAI Cookbook: no description found
OpenAI ā· #prompt-engineering (23 messagesš„):
- Prompt Engineering Basics Explained:
@eskcanta
outlined that good prompt engineering involves using precise language, giving clear instructions to the AI, and careful review of the AIās output. When instructing the AI, focus on what to do instead of what not to do, avoiding conflicting instructions. - AI Text Adventures Streamlined:
@drinkoblog.weebly.com
advised@stealth2077
to use custom instructions like āFocus on simple storytelling and character dynamicsā to keep narratives straightforward and avoid complexity, which the AI tends to default to in text adventures. - Navigating Platform Confusion:
@beanz_and_rice
humorously attempted to engage with ChatGPT on the Discord server, prompting@toror
to respond with amusement at the unsuccessful effort. - API Infrastructure vs. Prompt Engineering:
@darthgustav.
clarified the difference between prompt engineering and API infrastructure to@kate.yanchenka
, suggesting that the latterās queries about automated budget calculations and dynamic data handling were related to software development rather than prompt engineering.
Links mentioned:
- no title found: no description found
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
OpenAI ā· #api-discussions (23 messagesš„):
- Newbie Seeks Prompt Engineering Wisdom:
@zzavior
sought advice for getting started with prompt engineering.@eskcanta
provided an extensive guide focusing on using clear and precise language, checking the output, and ensuring not to trigger conflicts with the AIās capabilities or training. - Library Queries for Prompt Engineering:
@kate.yanchenka
inquired about libraries for prompt engineering to manage budgets, fit dynamic data, and handle AI model fallbacks.@darthgustav.
clarified that the topic was more about AI software development than prompt engineering. - Conversation Assistance Request Goes Unnoticed:
@beanz_and_rice
attempted to initiate an interaction using Discord slash commands but failed, followed by a comedic outcry that prompted a reaction from@toror
. - Crafting Lightweight Text Adventures:
@stealth2077
asked for tips on making a text adventure less deep and thematic.@drinkoblog.weebly.com
suggested using custom instructions to guide the AI towards simpler storytelling. - Joke Generation Confusion:
@lisabkk45_48614
requested a joke, but@solbus
directed them to use the official ChatGPT website instead of the Discord channel.
Links mentioned:
- no title found: no description found
- Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
OpenAccess AI Collective (axolotl) ā· #general (66 messagesš„š„):
- MPS Support Acknowledgment and Clarification:
@caseus_
expressed gratitude for MPS support in the axolotl project thanks to a GitHub pull request #1264 by Maxime. Confusion arose about the contributorās Discord identity, and@yamashi
(the right Maxime) clarified their involvement, noting the dependency on transformers merging changes and the PyTorch pull request #99272 as crucial for further development. - Tinkering with Yi-34b Training and eBay Finds: Users
@le_mess
,@cookiesowns
, and@c.gato
discussed various AI and non-AI topics ranging from slow loss decrease during Yi-34b training to an eBay link for an old tech product. - Exploring Model Adaptation and Enhancements:
@yamashi
suggested the potential benefits of porting models to Keras for wider hardware support, and@dreamgen
and@c.gato
discussed error handling and fixes related to Hugging Face checkpoint saving, in light of a pull request #1414 and a related issue #1452. - Queries on Cheapest LLM Endpoint Services:
@le_mess
inquired about affordable LLM endpoint services with responses pointing to local options like llamacpp, external services such as Together AI, and OpenRouterās cost-effectiveness. Users mentioned JSON serialization issues with Basten and the need for custom configurations. - Discussion of Various Challenges Using LLMs: Issues like JSON serialization (
@dangfutures
), challenges with FP32 slowness (@yamashi
), and need for additional documentation were discussed providing snapshots of technical hurdles and collaborative problem-solving occurring in the AI community.
Links mentioned:
- Together AI: Build gen AI models with Together AI. Benefit from the fastest and most cost-efficient tools and infra. Collaborate with our expert AI team thatās dedicated to your success.
- peft/utils/save_and_load.py try to connect to the hub even when HF_HUB_OFFLINE=1 Ā· Issue #1452 Ā· huggingface/peft: System Info peft 0.8.2 axolotl v0.4.0 export HF_DATASETS_OFFLINE=1 export TRANSFORMERS_OFFLINE=1 export HF_HUB_OFFLINE=1 Who can help? No response Information The official example scripts My own moā¦
- Intel® Optane⢠Persistent Memory 300 Series (128GB PMem Module) NMC2XXD128GPS | eBay: no description found
- GitHub - triton-inference-server/tensorrtllm_backend: The Triton TensorRT-LLM Backend: The Triton TensorRT-LLM Backend. Contribute to triton-inference-server/tensorrtllm_backend development by creating an account on GitHub.
- Add MPS support by maximegmd Ā· Pull Request #1264 Ā· OpenAccess-AI-Collective/axolotl: Description Supports basic training on Mac M series. Motivation and Context It partially solves Mac support. How has this been tested? Ran a train job with lora-mps.yml from start to finish.
- Fix breaking change by younesbelkada Ā· Pull Request #1414 Ā· huggingface/peft: Fix a breaking change in the recent release, I made a new PR as I messed up the commit history on the previous PR cc @sayakpaul @pacman100
OpenAccess AI Collective (axolotl) ā· #axolotl-dev (8 messagesš„):
-
Converging on Chat Dataset Formats:
@dctanner
is coordinating with Hugging Face to standardize a chat dataset format named MessagesList, to streamline the various dataset formats emerging for fine-tuning chat models. They shared a link to the MessagesList proposal discussion and suggested creating a GitHub org and dedicated page for documentation. -
Naming Conventions Matter:
@dctanner
emphasized the importance of a universal format name like MessagesList thatās not tied to a specific app like ShareGPT or ChatML, which can be confused with the template rather than the JSON format itself. -
Validation Challenges for MessagesList:
@faldore
acknowledged that although they like the idea of MessagesList, it poses challenges in validation because the concept of a āconversation pairā is not easily described by JSON-schema. -
The Ideal MessagesList Schema:
@faldore
proposed an ideal schema for the MessagesList format that includes optional system messages, tools/functions, source metadata, and a greeting message, ensuring user and assistant messages are paired, and the last message is from the assistant. -
Benefits of the Suggested Schema:
@faldore
advocates for the proposed schema, arguing that it is more manageable, verifiable, and space-efficient, and enforces structured message pairing in datasets.
Links mentioned:
@dctanner on Hugging Face: āAs the amount of datasets for fine tuning chat models has grown, thereās beenā¦ā: no description found
OpenAccess AI Collective (axolotl) ā· #general-help (26 messagesš„):
-
Tokenization Troubles in Axolotl: User
@nafnlaus00
enquired about a method to verify that axolotl is tokenizing as expected.@dreamgen
recommended inspecting the tokenizer config in the output directory, while@nanobitz
pointed to a debug flag in the axolotl repository. -
Transformers Update Might Fix Inferencing Issue:
@thierry_lama
reported a device error while trying to infer on a trained model using runpodās GPU.@nanobitz
suggested that it could be due to an issue with transformers and recommended updating. -
Multilingual Capabilities Enhancement Attempt:
@sadaisystems
asked about improving a modelās capabilities in a language other than English, receiving a response from@le_mess
that pre-training is necessary for significant improvement beyond what LoRA can offer. -
Inferencing with LoRA on the Fly:
@wizmak
sought a way to add LoRA adapters to a base model in real-time during inferencing, and@nanobitz
confirmed that with Hugging Face, you can load the peft model, but was unsure of the command to unload it. -
Model Parallelism with DeepSpeed Zero 3: User
@mihai4256
sought assistance for a working deepspeed zero 3 config for model parallelism, noting that existing ones from the repo werenāt functioning as expected for this particular use case.
Links mentioned:
GitHub - OpenAccess-AI-Collective/axolotl: Go ahead and axolotl questions: Go ahead and axolotl questions. Contribute to OpenAccess-AI-Collective/axolotl development by creating an account on GitHub.
OpenAccess AI Collective (axolotl) ā· #datasets (1 messages):
-
Duplicate Dilemma in Dataset Finetuning:
@_rxavier_
inquired about identifying if a text has been previously used to train a model. They asked about techniques for determining model familiarity with a text, possibly by examining the modelās response to an articleās introduction. -
The Impact of Training Data Overlap: Additionally,
@_rxavier_
questioned the implications of finetuning a model using data that may overlap with its pretraining dataset. They pondered the potential negative effects of such overlap on the finetuning process.
OpenAccess AI Collective (axolotl) ā· #runpod-help (1 messages):
- Axoltl RunPod Compatibility with Vast.AI: User
@dreamgen
successfully used the Axoltl RunPod image on Vast.AI, reporting that it worked out of the box.
LangChain AI ā· #announcements (1 messages):
- LangChain Introduces Journaling App with Memory:
@hwchase17
shared an early version of a journaling app that incorporates memory, using the LangChain memory module. The app is in an early stage and feedback is welcomed; it remembers information about users for future interactions, akin to the memory feature announced by OpenAI for ChatGPT today. Test the app here and check out the introductory video.
Links mentioned:
- Loom | Free Screen & Video Recording Software: Use Loom to record quick videos of your screen and cam. Explain anything clearly and easily ā and skip the meeting. An essential tool for hybrid workplaces.
- LangChain Companion - Journal: no description found
LangChain AI ā· #general (59 messagesš„š„):
- Seeking Assistance on LangChain with Android: User
@grindmaster2512
inquired about the integration of LangChain with an Android application, and followed up with@812757029260230658
to seek solutions to this query. - Efficient Chunk Pre-processing for Embeddings:
@swastikk
asked whether chunk pre-processing (like removing white spaces) is necessary before creating embeddings.@johnny2x2
confirmed that removing superfluous text aids the process, especially with email data. - PDF Parser Search, Alternatives to Adobe API:
@dejoma
requested recommendations for a PDF parser that can split contextually, expressing dissatisfaction with Adobe APIās limitations and seeking effective PDF API alternatives. - Calls to Improve LangChainās Documentation Structure:
@b0otable
provided feedback to the LangChain team suggesting the improvement in documentation structure by reducing example redundancies and updating syntax to avoid inefficient navigation for users. - Dependency Issues with Pinecone and LangChain: User
@segmentationfault.
experienced dependency resolution errors when trying to update Pinecone Database to v2 with a LangChain dependency, prompt response and solutions were provided by@jacoblee93
, a maintainer of LangChain.
Links mentioned:
- How to use function calling with Azure OpenAI Service - Azure OpenAI Service: Learn how to use function calling with the GPT-35-Turbo and GPT-4 models
- Pinecone | š¦ļøš Langchain: You can use Pinecone vectorstores with LangChain.
LangChain AI ā· #langserve (8 messagesš„):
-
Clarification on Langserve Scaling:
@kjoth_00356
inquired about scaling Langserve to multiple instances and asked about the difference between hosted Langserve and Langserve.@veryboldbagel
hinted at a deployment via hosted Langserve, leading to further clarification from@dachsteinhustler
who pointed to Langsmith as part of the solution hosted at Langchain Platform, which is in early testing and might require an invite code. -
In Search of NodeJS and Chain Integration:
@_mauricemoss
is looking for a way to expose a chain from a NodeJS app for use in a RemoteRunnable, but no solution has been provided within these messages. -
Disabling Intermediate Steps in Playground:
@dachsteinhustler
expressed a need to disable the intermediate steps in the Langchain playground to prevent browser crashes caused by large base64 strings, resulting in a workaround that involves using RunnableLambda. -
Connection Issues with k8s Cluster App:
@ezelanza.
described an issue where a connection is refused when attempting to invoke the OpenAI API through a k8s cluster-based application, mentioning that direct invocations to the back end work, but requests from the front end (React) fail, even with curl.
Links mentioned:
LangSmith: no description found
LangChain AI ā· #share-your-work (1 messages):
- Introducing Dewy and RAG with NextJS and OpenAI:
@kerinin
shared their contribution towards Dewy, an OSS knowledge base, along with a post detailing how to build a full-stack RAG application. The guide includes using NextJS, OpenAI API, and Dewy, aimed to minimize hallucinations and ensure accurate language model responses. Check out the blog post here.
Links mentioned:
Building a RAG chatbot with NextJS, OpenAI & Dewy | Dewy: This guide will walk you through building a RAG application using NextJS for the web framework, the OpenAI API for the language model, and Dewy as your knowledge base.
LangChain AI ā· #tutorials (2 messages):
- Seeking a Superior PDF Parser:
@dejoma
is looking for a PDF parser that can split contextually. Expresses discontent with Adobe API due to its low usage cap and lack of āpay-as-you-goā option; is open to suggestions for robust PDF APIs. - Langchain Calculator Quest:
@sougata
is building a calculator using Langchain that interprets multiplicative operations asmul(a,b)
. Requests guidance on how to integrate a custom Python library for calculation with the modelās augment function.
DiscoResearch ā· #general (29 messagesš„):
- Inquiry on Argilla Hosting Experience: User
@drxd1000
is seeking advice on hosting a server for Argilla that can support multiple users for annotation, but there was no resolution provided in the messages. - Layer Selective Rank Reduction Methodology Discussed:
@johannhartmann
referenced their own implementation of āLayer Selective Rank Reductionā to address continual training without forgetting, noting that āThey basically figure out the statistically less relevant parts of the layers and use them as lora targets,ā and considering it more efficient than continual approaches. A related GitHub repository was mentioned but not detailed in the conversation: laserRMT. - Out of Memory Issue with lm-evaluation-harness:
@philipmay
faced an OOM error evaluating a mixtral model and was advised by@bjoernp
to utilize multi-GPU support provided by lm-evaluation-harness, indicating two A100s might resolve the issue. - Search for German Toxicity Eval Dataset: User
@sten6633
inquired about a German toxicity evaluation dataset and pondered the utility of translating ToxiGen, a dataset available on Hugging Face for implicit hate speech detection. The dataset mentioned can be found on Hugging Face, but requires agreement for access: ToxiGen. - Novel Computational Technique Teased: User
@phantine
hinted at a new method excluding MoE, briefly titled āUniverses in a bottleā and hinted at a potentially radical claim: āP=NP.ā A GitHub link associated with@phantine
ās work was shared, but no specific details regarding the technique were provided: LargeWorldModel/LWM.
Links mentioned:
- Google Colaboratory: no description found
- skg/toxigen-data Ā· Datasets at Hugging Face: no description found
- GitHub - cognitivecomputations/laserRMT: This is our own implementation of āLayer Selective Rank Reductionā: This is our own implementation of āLayer Selective Rank Reductionā - cognitivecomputations/laserRMT
- GitHub - LargeWorldModel/LWM: Contribute to LargeWorldModel/LWM development by creating an account on GitHub.
DiscoResearch ā· #embedding_dev (5 messages):
- BM25 + Query + Rerank Combo Wins: User
huunguyen
highlighted their effective use of BM25 with additional querying and reranking steps for search purposes, and reported that this method āworks pretty good.ā - Wikipedia in a Nutshell:
huunguyen
managed to index the entirety of Wikipedia, excluding non-essential content, and compacted the BM25 index into a sleek size of under 3GB. - In Search of BM25 Tools:
sebastian.bodza
inquired about the specific libraryhuunguyen
is using to implement the BM25 algorithm for their search index.
DiscoResearch ā· #discolm_german (1 messages):
thomasrenkert: Is there an ETA for v2 of the German model? Or for the Mixtral variant?
CUDA MODE ā· #general (1 messages):
- GPU Shuffling for Experimentation:
@joseph_en
reported successful relocation of the Asus WS motherboard to the miner and is awaiting 16x PCI extenders. Theyāve utilized older GPUs for their experiments and have transitioned the minerās motherboard into the case, noting it handles 7B or 13B quantized models with a single 12G NVIDIA 3060 with ease.
CUDA MODE ā· #cuda (9 messagesš„):
- Cross-Compatibility Quest:
@iron_bound
kicks off a discussion about achieving binary compatibility for CUDA to run on HIP/ROCm platforms, referencing a Phoronix article on Radeon CUDA - ZLUDA. - CUDA for AMD GPUs? Meet ZLUDA:
@muhtasham
shares a GitHub link to ZLUDA, a project that aims to make CUDA run on AMD GPUs, sparking interest and a request for user experiences by@marksaroufim
. - Emoji Enthusiasm:
@muhtasham
invokes the spirits of the tech world through well-selected emojis of Jensen Huang and Lisa Su. - Market Monopolies and AGI Speculations:
@andreaskoepf
humorously suggests that Microsoftās purchasing strategy and a borked chip market could leave antitrust agencies unequipped against an AGI future. - Real-World Radeon Trials:
_tvi_
shares their experience with Radeon VII and a Ryzen APU, including struggles with dynamic memory allocation causing kernel crashes when handling large PyTorch data chunks.
Links mentioned:
- Tweet from [Phoronix] AMD Quietly Funded A Drop-In CUDA Implementation Built On ROCm: Itās Now Open-Source Image (Radeon Cuda 1): no description found
- Tweet from AMD Quietly Funded A Drop-In CUDA Implementation Built On ROCm: Itās Now Open-Source - Phoronix: no description found
- GitHub - vosen/ZLUDA: CUDA on AMD GPUs: CUDA on AMD GPUs. Contribute to vosen/ZLUDA development by creating an account on GitHub.
CUDA MODE ā· #algorithms (3 messages):
- Multidimensional Gated Recurrences Have Limitations: User
@euclaise
mentioned a constraint in multidimensional gated recurrences, stating that they require a DxCxN attention matrix which is quite prohibitive in cost, even with a small value for C. - Beyond Simple Linear Recurrences:
@euclaise
pointed out that prefix-sum-like scans have applications beyond computing simple linear recurrences, opening up a broader range of computational possibilities. - Twitter Insights on Computational Techniques:
@euclaise
shared insights on computational methods, including the use of maximal scans for sequences (y[t]=max(y[t-1], x[t])
), by providing links to their Twitter posts: Tweet on computational methods and Tweet on maximal scans.
CUDA MODE ā· #jobs (3 messages):
- Generative AI Startup Hiring in Hyderabad:
@gradman33
shared a job opportunity at an early stage Deep Tech Generative AI startup in Hyderabad, India, seeking talents in ML/Data/Research/SDE. Interested candidates can apply here. - Potential Spam Alert in Jobs Channel:
@pudding0377
flagged a post by@gradman33
as possibly irrelevant or spam, calling for the attention of moderators.
Links mentioned:
no title found: no description found
CUDA MODE ā· #beginner (9 messagesš„):
- New Member Alert:
@cs_os_05101
mentioned that they have a 4060 Ti. - Search for Engaging CUDA Books:
@euclaise
inquired about fun books related to CUDA, sparking a conversation about educational resources. - Shader Book Recommendation:
@marksaroufim
shared The Book of Shaders, a gentle guide to Fragment Shaders, as a possible fun read on a topic adjacent to CUDA. - Understanding User Expertise: After citing familiarity with shader programming,
@euclaise
clarified theyāre looking for materials directly related to compute shaders or CUDA, rather than frag shaders. - Looking for Fun in Learning: Both
@marksaroufim
and@euclaise
concurred that defining literature as āfunā can be subjective, but@marksaroufim
suggested PMPP as the best educational resource on CUDA, albeit not necessarily fitting the āfunā criterion.
Links mentioned:
The Book of Shaders: Gentle step-by-step guide through the abstract and complex universe of Fragment Shaders.
CUDA MODE ā· #pmpp-book (7 messages):
- Matrix Transposition Debate:
@eporat
asked if transposing one matrix in a multiplication could lead to fewer cache misses and thus faster computation.@andreaskoepf
responded, advising that while sequential memory access could be advantageous, the benefits might be negligible compared to tiled access. - Practical Test Yields No Benefits: Responding to the query about transposing matrices to speed up multiplication,
@jeremyhoward
recounted his experience stating that transposing during tile creation had no observable effect on performance. - In-Depth Discussion on Transposition:
@eporat
clarified that an inplace transpose isnāt necessary; sometimes, one only needs to adjust indice ordering in the inner loop, suggesting an alternative to transposition. - Further Clarification Sought:
@andreaskoepf
questioned@eporat
ās suggestion, implying that matrix elements are read transposed by default during multiplication, indicating a misunderstanding or need for further explanation on what@eporat
meant by adjusting loop indices.
CUDA MODE ā· #smol-hw (1 messages):
- Apple Silicon gets its own ātopā: User
@marksaroufim
shared a link to asitop, a performance monitoring CLI tool for Apple Silicon. It was compared to existing tools liketop
ornvtop
, tailored specifically for Appleās custom chips.
Links mentioned:
GitHub - tlkh/asitop: Perf monitoring CLI tool for Apple Silicon: Perf monitoring CLI tool for Apple Silicon. Contribute to tlkh/asitop development by creating an account on GitHub.
Latent Space ā· #ai-general-chat (24 messagesš„):
- Reka Model Announcement:
@swyxio
shared a link to a tweet about a new Reka model, creating a buzz in the community. The tweet can be found here. - Favorite VC Podcast Meets AI:
@swyxio
expressed enthusiasm for a VC podcast discussing AI topics, providing a link to the episode and highlighting its relevance to the community. - Exploring the BUD-E Voice Assistant by LAION:
@swyxio
discussed a new fully open voice assistant named BUD-E, developed by LAION, which is aimed to improve conversational experiences by being empathetic and context-aware. Details are available on the LAION blog. - What is an Agent?: In search of a definition for āagents,ā
@kaycebasques
asked the community for insights.@slono
described them as programs that aim to achieve goals with minimal user input. - Karpathy Leaves OpenAI:
@nembal
spotlighted news from The Information about Andrej Karpathyās departure from OpenAI, stirring curiosity about the implications for the AI field. Background on the development of an AI product for automating tasks mentioned by@slono
vaguely referenced AGI as a possible factor in the context of the departure.
Links mentioned:
- BUD-E: Enhancing AI Voice Assistantsā Conversational Quality, Naturalness and Empathy | LAION: <p>AI voice assistants have revolutionized our interaction with technology, answering queries, performing tasks, and making life easier. However, the stiltedā¦
- OpenAI Researcher Andrej Karpathy Departs: Andrej Karpathy, one of the founding members of OpenAI, has left the company, a spokesperson confirmed. Karpathy, a prominent artificial intelligence researcher, was developing a product he has descriā¦
- President and Co-Founder Anthropic, Daniela Amodei: AI Hurricane ā Grit ā Overcast: no description found
- Memory and new controls for ChatGPT: Weāre testing the ability for ChatGPT to remember things you discuss to make future chats more helpful. Youāre in control of ChatGPTās memory.
- How Graph Neural Networks Are Transforming Industries: š Get your AssemblyAI API key here: https://www.assemblyai.com/?utm_source=youtube&utm_medium=referral&utm_campaign=yt_marco_1Graph Neural Networks (GNN) haā¦
- Tweet from Joanne Jang (@joannejang): š we just launched a small experiment for memory on ChatGPT. how it works - itās quite similar to custom instructions, except chatgpt is the one driving it (like auto vs. stick shift!) - basicalā¦
- GitHub - Stability-AI/StableCascade: Contribute to Stability-AI/StableCascade development by creating an account on GitHub.
- sta - Overview: sta has 2 repositories available. Follow their code on GitHub.
LLM Perf Enthusiasts AI ā· #opensource (2 messages):
- Choosing the Right Mistral Model Size:
@potrock
asked about the appropriate Mistral model size to run locally on an M2 Max with 32GB, seeking community input. - Safe Model Sizing Advice:
@natureplayer
suggested that 4GB is a safe size for local execution on the mentioned hardware, while 8GB will not work, and 5GB might be possible but is not guaranteed.
LLM Perf Enthusiasts AI ā· #openai (4 messages):
- GPT-5 Speculation Quelled: User
@res6969
humorously noted that the rumors of GPT-5 have been greatly exaggerated, indicating skepticism about its existence or imminent release. - Laughter is the Best Medicine?: Both
@res6969
and@potrock
shared lighthearted reactions with custom emoji and laughing-to-tears emoji, respectively, contributing to a jovial environment on the topic at hand. - A Memory Upgrade for ChatGPT:
@potrock
shared a blog post discussing new memory features being tested in ChatGPT that allow the model to remember user preferences and details across conversations, which users can manage conversationally or through settings.
Links mentioned:
Memory and new controls for ChatGPT: Weāre testing the ability for ChatGPT to remember things you discuss to make future chats more helpful. Youāre in control of ChatGPTās memory.
AI Engineer Foundation ā· #events (6 messages):
- Weekly Meeting Kick-Off with a Sense of Humor:
@._z
announced the start of the weekly meeting with a playful note: š DĆ©jĆ vu. - Meeting Attendance Update:
@juanreds
informed they could not attend the weekly meeting, apologizing to the team. - Invitation to Co-host an AI Hackathon:
@caramelchameleon
asked if anyone is interested in co-hosting an AI developers hackathon, hinting at a collaboration opportunity with game developers before GDC this year. - Chance to Join Hackathon Online or Onsite:
@caramelchameleon
mentioned the possibility of attending the hackathon both online and onsite in San Francisco. - Eager Organizer Jumps In:
@yikesawjeez
expressed interest and requested to be contacted as they specialize in organizing hackathons, especially those associated with events in the Bay Area.
Skunkworks AI ā· #general (2 messages):
- Private Message Prompt: User
@bondconnery
requests a direct message with a simple ā<@1117586410774470818> DM sirā. - LLaVA Framework Inquiry:
@CodeMan
is seeking insights or experiences on integrating LLaVA with an SGLang server and SGLang worker, as opposed to using a standard model worker.