We have long contended that the RAG style operations have been used for context (knowledge base, facts about the world) and memory (running list of facts about you) will diverge. The leading implementation was MemGPT and now it seems to have rolled out in both ChatGPT (with a weirdly roon-y tweet. more details from Joanne Jang) and LangChain.

OpenAI:

LangChain:

In some sense this is just a crossing over of something the LMstudio/Sillytavern roleplay people have had for a while now. Expectation is that it will mildly improve UX but not lead to a big wow moment since the memory modeling is quite crude at the moment, not humanlike, and subject to context limits.

Table of Contents

[TOC]

PART 1: High level Discord summaries

TheBloke Discord Summary

Unbounded Textual Contexts: Engineers are exploring new open-source large language models like the Large World Model, which boasts coherence with contexts up to 1 million tokens. Discussions include language support, as in Cohere's aya model, which covers 101 languages, and challenges working with jax-based tools during model installations.
Nurturing Erotically Programmed Role Play: The community is dissecting performances of re-quantized Miqu models like MiquMaid-v2-70B, attuned for Erotic Role Play (ERP). Emphasis was on the impact of enhanced hardware, with a jump from 0.7t/s to 2.1t/s in tokens per second while using 12GB VRAM GPUs.
Instruct, Optimize, Repeat: Finetuning techniques explained include using Sequential Fine-Tuning (SFT) followed by Direct Preference Optimization (DPO) as improved RLHF/PPO, detailed on page 6 of a paper. Unsloth AI’s apply_chat_template is touted over Alpaca to train LLMs for multi-turn conversations.
JavaScript Meets Python in AI Development: Experimentation with JSPyBridge led to successful bridging of JavaScript and Python in expanding the SillyTavern project. This included addressing Windows-specific errors, like cpu_embedding=True to circumvent access violation issues and integrating Python classes asynchronously into JavaScript code.
Confounding Losses in Model Training: An engineer observed an unexplained variance in training loss when finetuning Mixtral 8x7b qlora, resulting in higher losses compared to Mistral 7b despite similar datasets and hyperparameters. The matter remains open for community input or similar experiences.

LM Studio Discord Summary

Large Models Crying for RAM: Users like @nonm3 and @theoverseerjackbright battled errors loading large models in LM Studio due to limited RAM and VRAM. Suggestions included trying smaller model quants, and some faced GPU detection issues with LM Studio, prompting restart crashes.
MedAlpaca Heads to the Clinic: Discussions on medical LLMs saw medAlpaca, a fine-tuned model for medical question answering, as a promising addition to @pepito92i's medical project. Microsoft's phi-2 model's absence from LM Studio was noted, with the possibility of it being converted to .gguf format by user TheBloke to use with llama.cpp.
GPU Matchmaking and Overclocking: Hardware enthusiasts like @luxaplexx questioned NVLink's role in memory cycling, ultimately suspecting cards like the 960 might not be NVLinked. Users discussed GPU upgrades for better performance with models, considering options like the RTX 3060 12GB. Others like @alastair9776 and @rugg0064 weighed the benefits and risks of overclocking for faster token generation.
Quant Leap Forward: Eager anticipation for IQ3_XSS support in LM Studio, with users like @n8programs and @yagilb expecting it in the next update. A GitHub pull request reflected the community's excitement over upcoming 1.5 bit quantization. Meanwhile, preparations were suggested for downloading forthcoming, as-yet-unsupported models like IQ3_XSS.
Beta Release Relief: @rafalsebastian ran into a stumbling block running LMstudio on CPUs with only AVX support. @heyitsyorkie provided hope by directing to the 0.2.10 AVX beta release for Windows that enables compatibility, while still recommending an upgrade to AVX2 for optimal performance and offered a helpful link.
Multi-Model Management Mystery: @alluring_seahorse_04960 sought advice on running dual models simultaneously on one machine to avoid repetition errors, using a Conda environment but steering clear of VMs. The nature of the repetition errors in question was humorously prodded by @wolfspyre, awaiting further details.

LAION Discord Summary

Magvit V2 Sparks Interest and Debate: Engineers delved into the technicalities of reproducing the Magvit V2 model, with discussions focusing on appropriate datasets, parameters for video compression and understanding, and the mention of experiments on the lfq side of Magvit2. The community also saw a surge in interest around MAGVIT, likely due to influencer mentions.
Scrutinizing Stable Cascade's Efficacy: Stability AI's Stable Cascade model spurred intense conversations regarding its high VRAM requirements, optimization issues, and erroneous inference time graphs. Technical issues reported included challenges with text clarity in images and the inability to run models in float16, alongside performance evaluations on GPUs like the 3090.
Legal Frays in AI-Generated Imagery: The community engaged in a heated discussion about copyrights and the legality of AI-generated images, highlighting a TorrentFreak article about a court dismissing authors’ copyright infringement claims against OpenAI.
Ethical Conundrums with AI and Adult Content: The conversation shifted to the role of adult content in driving technological progress, with some participants recognizing the historical pattern while others doubted its constructive impact on AI. Topics included the rise of non-consensual deepfake pornography, its market dynamics, and the potential ethical pitfalls plaguing the AI community.
Calls for Higher AI Image Standards: Discussions included technical insights into improving AI image generation, such as the viability of VAE encoder training. Members also reflected on the community's photorealism standards, expressing the need for better quality in AI-generated images.

Eleuther Discord Summary

Checksum Hunting Season Open: @paganpegasus provided checksums for The Pile zst shards and pointed to EleutherAI’s hashes and the Discord pins.
Image Content Classification Tools Discussed: OwlViT and CLIP models were recommended as tools for discerning the content of images and the concept of "nothing" in imagery was discussed due to an inquiry by @everlasting_gomjabbar.
Paper Review in Collaborative Spirit: A user received appreciative feedback on a manuscript titled "Don't think about the paper," with the EleutherAI Discord community being credited in the paper's acknowledgements.
Cloud Computing Resources Examined: GCP and Colab surfaced as favorable cloud resources for NLP classification model training, with discussions encompassing cost-benefit analyses of platforms like runpod and vast.ai.
Research Computing Power Up for Grabs: EleutherAI's computational resources were said to be available for collaboration on a semi-custom LLM project, with the caveat of having a clear research agenda and collaborative value proposition.
Semantic Scholar's Linking Logic Revealed: Arxiv papers are automatically linked to authors on Semantic Scholar, with room for manual corrections to ensure accuracy.
Fractal Fun with Neural Training Parameters: @jbustter shared fractals created from neural network hyperparameters, highlighting a blog by Jascha Sohl-Dickstein that correlates fractals with training convergence/divergence.
A Deep Dive into Data Presentation for ML: A discussion was sparked concerning active learning and methods for models to choose their own data presentation sequence.
Enriching Encoder-Decoder Models with Unsupervised Data: Strategies to employ unsupervised datasets effectively in encoder-decoder models were discussed.
New NLP Robustness Method Flies Off the Press: A paper focusing on test-time augmentation (TTA) to enhance text classifiers' robustness was published, with the author thanking the community for support.
The Quest for Interpretability Insight: @jaimerv asked for updated resources on interpretability methods beyond the standard Representation Engineering paper.
Summoning Collaborators for Hallucination Leaderboard: A call for contributions to a hallucinations leaderboard was made, requesting assistance with tasks, datasets, metrics, and result evaluations.
Aligning Pythia with Practice: Concerns were aired about potential misalignments between training batches and checkpoints in the 2.8b size Pythia deduped suite, with follow-up discussions suggesting opportunities for a publication on Pythia's reliability.

LlamaIndex Discord Summary

LlamaIndex v0.10 Marks Major Milestone: LlamaIndex v0.10 has been released, presenting notable advancements including a new llama-index-core package and PyPi packages for every integration/template. Detailed information on migration is accessible through their comprehensive blog post and documentation.

Webinar on No-Code RAG with LlamaIndex: A webinar demonstrating the creation of no-code Retrieve and Generate (RAG) apps using LlamaIndex.TS is set up with Flowise co-founder Henry Heng. Registration for the Friday event is available here.

Troubleshooting LlamaIndex: Engineers faced challenges with migration following LlamaIndex's update and were pointed to a Notion migration guide for assistance. Furthermore, for configuration queries like chunk_size post-ServiceContext depreciation, engineers are advised to refer to the new Settings documentation and relevant LlamaIndex GitHub resources.

RAG App Building with Dewy Tremendously Simplified: A comprehensive guide to building a full-stack RAG app using NextJS, OpenAI, and the open-source knowledge base Dewy has been shared. The tutorial is aimed at grounding language models in precise, reliable data and can be studied in detail here.

Handling Document Complexity and Enhancing Enterprise with LlamaIndex: Users engaged in discussions about filtering complex documents and integrating LlamaIndex to enhance enterprise efficiency with tools such as Slack, Jira, and GDrive. Also, creating multiple agents for merging different document sources was considered, referencing the possibility of using traditional indexing techniques instead of high-cost LLMs for dynamic filtering.

HuggingFace Discord Summary

Hugging Face Accelerates with Message API: Hugging Face launched a new Message API compatible with OpenAI, aimed at streamlining use of inference endpoints and text generation services with client libraries. They've also advanced their offerings with new releases like Datatrove on PyPI, Gradio 4.18.0, and tools like Nanotron and Accelerate 0.27.0 for 3D parallelism training. Additional partnerships and resources, such as a Codecademy AI course and a blog post on SegMoE, support the continuous learning and innovation in their community.
Search Engine Woes and Hosting Queries in Focus: Technical discussions spotlighted the difficulties in creating search engines with mentions of approaches like TF-IDF and BM25, and the use of spaCy for Part of Speech tagging. Other conversations pivoted to queries about hosting custom models and serverless inferencing solutions, as well as the practicality of running 100B+ parameter models on enthusiast-level hardware.
Template Talk and Model Deployment Discussions: Users addressed the need for a simple chatbot development prototype capable of database interaction and email API integration, featuring resources like Microsoft's AutoGen on GitHub and the potential of AutoGen Studio. Challenges around deploying finetuned machine learning models such as mistarl_7b_gptq for fast inferences were raised, with emphasis on choosing the right platforms or libraries for the task.
Glimpse into Creator Innovations: Members of the community showcased their creative projects, including GIMP 3.0 plugins interfacing with Automatic1111, development of an automated image tagging model for diffusion tasks, and updates to tools like PanoramAI.xyz introducing a "remix" mode for image transformations. Excitement built around AI-applications in fashion design as well, demonstrating the breadth of applications being pursued.
Analyzing S4 and Advancing NLP: The community shared their insights into the S4 architecture ideal for long-range sequences and sought clarity on its implementation. The paper on LangTest got introduced, which offers testing and augmenting capabilities for NLP models. Topics extended to extracting language identifiers from models like XLM-RoBERTa and converting natural language into formal algebraic expressions.
Enthusiasm for Diffusion and Emerging Models: Conversations sparked around facilitating multi-GPU training for diffusion model fine-tuning, with mentions of scripts such as train_text_to_image.py. The successful deployment of models like mistarl_7b_gptq for fast inference, and effective text generation with stable cascade were discussed. The buzz was palpable around the teased development of a new terminus model.
Complications in Computer Vision Explored: The channel delved into challenges like hierarchical image classification, with resource suggestions including an ECCV22 paper on the same. Members discussed requirements for Gaussian splats, industry-grade image retrieval systems and sought collaboration on multimodal projects.

Nous Research AI Discord Summary

LongCorpus Dataset Unveiled for Pre-Training: The new LongCorpus-2.5B dataset is released, featuring 2.5 billion tokens from various domains, specifically curated for long-context continual pre-training and designed to minimize n-gram similarity with training sets.
Coherence Preserved in Scaling Models: Scaling with 'self-extend' is considered superior over 'rope scaling' for maintaining coherence, as indicated by the implementation in llama.cpp, and offers the benefit of requiring no setup, fine-tuning, or additional parameters.
Persistence and Resistance in AI Agents and Models: LangGraph agents can persist their state across interactions, as shown in a YouTube demonstration, while the Gemini model shows resistance, with its refusal tendencies prompting comparisons unfavorable to GPT-4.
Multimodal AI Breakthrough with Reka Flash: Reka Flash, a new 21B fast multimodal language model, is introduced and now available in public beta, promising to measure up to major models like Gemini Pro and GPT-3.5. The initiative can be followed on Reka AI's announcement page.
CUDA Pains and WAVeform Gains in AI Research: The ZLUDA project aimed to run CUDA on AMD GPUs can no longer be considered active, and a fresh perspective in AI research proposed in an arXiv paper, suggests wavelet transforms could enhance Transformers by addressing both positional and frequency details efficiently.

Mistral Discord Summary

Newbies Get Model Recommendations: Participants recommended instruct models for chat-GPT-like interactions to newcomer @nana.wav, with the clearer instruction-following focus as opposed to the more general autocompletion capabilities of other models.
RAG Setup and Model Debates Heat Up: A guide on integrating Mistral with RAG was shared, while the effectiveness of LangChain vs. LlamaIndex was debated; separately, DSPy was touted for leveraging LLMs for programming rather than chatting, adorned with a supportive Twitter link.
Deployment Dilemmas and Solutions: Docker deployment via ollama or vllm projects was suggested, while others discussed API alternatives and faced cloud quota barriers; meanwhile, success stories involved deploying Mixtral on HuggingFace despite the hiccups with AWQ quantization.
Fine-Tuning Finesse and RAG Revelations: Users discussed fine-tuning vs. RAG with insights into LLM base knowledge importance; guidance was given on input data structuring for LLM output enhancement and queries about prompt versioning tools surfaced.
Humans in Tech and AI Seek Touchpoints: French librarian (@maeelk) sought internship opportunities in psychology and AI; the cost of innovatively building audio-inclusive S2S models sparked discussions around budget constraints and investment needs.
Technical Troubles and Support Suggestions: @ingohamm faces hurdles with TypingMind's API key and a suggestion was made to contact [email protected] for assistance with API and subscription issues.

Perplexity AI Discord Summary

Perplexity AI Outshines Rivals in Complex Query Handling: @tbrams tested Perplexity AI with a difficult question from the "Gemini" paper and found it outperformed Google's Gemini service and OpenAI, answering more quickly. The test results from Perplexity AI are documented here.
Perplexity's Potential in API Customization Highlighted: The PPLX API allows for custom search queries using parameters like "site:reddit.com OR site:youtube.com", as mentioned by @me.lk. However, several users have encountered issues with the API such as performance hiccups (@andrewgazelka) and nonsensical responses (@myadmingushwork_52332).
Perplexity AI Subscription and Renewal Queries Addressed: Users are seeking details on trial subscriptions and renewal processes for Pro subscriptions, with inquiries about token refresh rates also surfacing. There is currently no early access program for new Perplexity features as confirmed by @icelavaman.
Promising Enhancements and Community Collaborations: Perplexity AI is receiving community praise for tools like the pplx shortcut action (@twodogseeds). Meanwhile, @ok.alex is encouraging a community-driven effort to contribute to an alternative feed/newsletter Alt-D-Feed.
Seeking Direct Support Channel for Sensitive Data Issues: A user (@kitsuiwebster) has expressed the need for direct assistance with a sensitive company data issue, avoiding public disclosure while lacking response from support channels.

OpenAI Discord Summary

ChatGPT Remembers Your Favorite Color: OpenAI announced a new memory feature for ChatGPT, rolling out to select users, enabling ChatGPT to remember user preferences and details over conversations for a more personalized experience. Users can control what ChatGPT remembers and can switch off this feature.
AI-Assistants in Creative Process Paid Talks: A UK researcher, @noodles7584, is looking to compensate community members for a 30-minute discussion on AI use in creative workflows.
Performance Quirks in GPT Variants: The community reported fluctuations in GPT-4's task handling, and Abacus.AI's Smaug-72B was noted for outperforming GPT-3.5, while ChatGPT-4 seems hesitant to generate full code snippets.
Fine-Tuning AI to Watch Videos? Not Yet: Discussion in #gpt-4-discussions clarified that while GPT can describe images from a video with its vision capabilities, it cannot yet be fine-tuned for video-specific knowledge or tasks.
Exploring and Perfecting Prompt Engineering: Good prompt engineering was highlighted as involving clear instructions and precision, with a focus on fostering simple storytelling in text-based AI adventures and recognizing differences between prompt engineering and API infrastructure development.

OpenAccess AI Collective (axolotl) Discord Summary

Axolotl Embraces MPS, Thanks to GitHub Heroes: Maxime added MPS support in the axolotl project via pull request #1264, referencing the importance of a PyTorch pull request #99272. Clarification on contributor identities highlighted the importance of collective recognition in open source.
Chat In The Time Of Datasets: The MessagesList standard for chat datasets proposed by @dctanner aims for cross-compatibility and is under discussion. The format might include conversation pairs, greetings, and assistant-initiated closures, with challenges noted in JSON-schema validation.
Axolotl Tokenized Right, Check the Debug Flag: Users are troubleshooting tokenization within axolotl, with advice to inspect the tokenizer configs and a recommendation to use a debug flag for verification.
Model Query Woes and Training Queries Grow: Queries about improving model's multilingual capabilities, LoRA adapter inferencing, and model parallelism were discussed, with solutions ranging from pre-training needs to updates in transformers and DeepSpeed Zero 3 configs for better functionality.
Fine-tune or Re-train? Duplicate Data's Pain: The impact of training data overlap and finetuning practices were questioned, highlighting concerns about reusing text that a model may have encountered during pretraining.
RunPod Image on Vast.AI, A Smooth Sail!: The Axoltl RunPod image was reported by @dreamgen to work seamlessly on Vast.AI, underscoring the inter-operability with cloud infrastructure providers.

LangChain AI Discord Summary

LangChain Unveils Memory Journaling App: @hwchase17 introduced a new journaling app featuring LangChain memory module, inviting feedback for the early version akin to OpenAI's ChatGPT with memory feature. Try and give feedback using this journal app and watch the intro video.
LangChain Community Tackles Diverse Technical Challenges: Topics covered included the possibility of LangChain's Android integration, pre-processing benefits for efficient embeddings, the search for a capable PDF parser, and calls for improved documentation structure. Additionally, a user faced dependency issues while updating Pinecone Database to v2 with LangChain, which was promptly addressed.
Scaling and Integration Enquiries in Langserve Channel: Discussions included questions about scaling Langserve and using Langsmith for deployment. There was a query about exposing a chain from a NodeJS app and an unaddressed issue regarding disabling intermediate steps in the playground. Connection issues with an OpenAI API call from a k8s cluster-based app were also described.
Dewy RAG Application with NextJS and OpenAI Guide Shared: @kerinin contributed a guide exploring a full-stack RAG application, utilizing NextJS, OpenAI API, and Dewy, focusing on reducing hallucinations and improving model response accuracy. The full guide is available here.
Quest for a Functional PDF Parser and Custom Calculator: Within the tutorials channel, the search for a superior contextual PDF parser to Adobe API, and guidance for building a Langchain-based calculator were topics of discussion, aiming for practical integrations and solutions in AI workflows.

DiscoResearch Discord Summary

Seeking Argilla Hosting Solutions: @drxd1000 requested advice for hosting an Argilla server capable of supporting multiple annotators with no clear resolution reached.
Layer Selective Rank Reduction in the Spotlight: @johannhartmann discussed an implementation of 'Layer Selective Rank Reduction' for mitigating continual training forgetting. The method targets statistically less significant layer parts, and a GitHub repository was mentioned.
Overcoming OOM With Mixtral: @philipmay encountered an Out of Memory error with a mixtral model, and @bjoernp suggested using multi-GPU support, mentioning that two A100s might alleviate the issue.
Cross-Language Toxicity Detection Dataset: @sten6633 sought a German toxicity evaluation dataset, considering the translation of ToxiGen from Hugging Face, which requires access agreement.
New Computational Technique Teased: @phantine teased a technique named "Universes in a bottle" with implications for the P=NP problem, linked to a GitHub page, but details were sparse.
BM25 Search Strategy Proves Effective: huunguyen reported success using BM25 with additional querying and reranking to enhance search capabilities, and successfully indexed the entirety of Wikipedia into an index under 3GB.
German AI Model Update Inquiry: thomasrenkert asked about the release timeline for version 2 of the German model or a Mixtral variant, but no additional details were provided.

CUDA MODE Discord Summary

CUDA Compatibility Crusade: Members discussed achieving CUDA binary compatibility on HIP/ROCm platforms, driven by the ZLUDA project on GitHub, which is a CUDA on AMD GPU initiative. Amidst technical emoji enthusiasm, there were musings about market monopolies and AGI, alongside personal experiences with Radeon hardware issues related to dynamic memory allocation.
Generative AI Jobs Galore: A Deep Tech Generative AI startup in Hyderabad is hiring ML, Data, Research, and SDE roles, with applications welcomed here. However, the legitimacy of the job posting was questioned, flagging the need for moderator attention.
Compute Shaders and Matrix Math Musings: Inquiries on educational materials for CUDA led to The Book of Shaders recommendation, while the discussion in the PMPP book channel debated the benefits, or lack thereof, of transposing matrices to reduce cache misses in multiplication, indicating varied opinions but no consensus on observed benefits.
Apple Chips Enter Monitoring Realm: @marksaroufim shared asitop, a CLI performance monitoring tool designed for Apple Silicon, likening it to top or nvtop in utility for engineers leveraging Apple's technology.
GPU Experiments and Job Shuffling: An engineer successfully relocated an Asus WS motherboard to a miner setup, effectively running large quantized models on a NVIDIA 3060 GPU. This indicates a hands-on approach within the community towards custom hardware configurations.

Latent Space Discord Summary

Reka Enters the Model Arena: A new AI entity named the Reka model has sparked interest in the community following a tweet shared by @swyxio. The excitement is palpable with discussions around the tweet found here.
Investor Insights Meet AI: @swyxio spotlighted a VC podcast delving into AI, which could be of significant interest to engineering aficionados. The podcast episode is accessible here.
BUD-E Buzz: BUD-E, an empathetic and context-aware open voice assistant developed by LAION, could signal a new direction in conversational AI. More details are laid out on the LAION blog.
Pondering the Definition of Agents: The community exchanged views on defining "agents," with @slono suggesting that they are goal-oriented programs that require minimal input from users, a concept significant in the realm of AI development.
Karpathy's OpenAI Exit Raises Questions: The AI community is abuzz over the news of Andrej Karpathy leaving OpenAI, with @nembal pointing to an article from The Information and speculation about AGI influences. The article is accessible here.

LLM Perf Enthusiasts AI Discord Summary

Minding the Model Size for M2 Max: @potrock inquired about running Mistral model sizes on an M2 Max with 32GB, and @natureplayer advised that a 4GB model would be the feasible option, cautioning against an 8GB model and noting that 5GB models may be unstable.
GPT-5 Rumor Mill: @res6969 expressed humorous doubt about GPT-5's existence, suggesting that speculation on the model's upcoming release is overstated, with others joining the jest with emojis.
Enhanced Memory in ChatGPT: @potrock highlighted a new feature tested in ChatGPT, based on a blog post, where it can retain user preferences and information across sessions for more personalized interactions.

AI Engineer Foundation Discord Summary

Weekly Sync-Up Teases Déjà vu: @._z playfully announces the start of the weekly team meeting likening it to a recurring Déjà vu experience.
Member Bows Out from Meeting: @juanreds sends regrets for being unable to attend the weekly meeting, offering apologies to the team.
Call for AI Hackathon Co-Hosts: @caramelchameleon seeks collaborators to co-host an AI developers hackathon in tandem with game developers in the lead-up to the GDC.
Hackathon Offers Dual Attendance Modes: The hackathon mentioned by @caramelchameleon has options for attendance both online and onsite in San Francisco.
Hackathon Organizer Steps Up: @yikesawjeez shows eagerness to get involved in organizing the hackathon and highlights their expertise with Bay Area events.

Skunkworks AI Discord Summary

Direct Messaging Initiated: User @bondconnery has put out a request for a private message.
Exploring LLaVA Framework Integration: @CodeMan inquired about integrating the LLaVA framework with an SGLang server and SGLang worker, aiming for a potentially more specialized setup than a conventional model worker.
Off-Topic Video Share Ignored: A non-technical video link was shared, not relevant to the engineering discussions.

The Alignment Lab AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.

The Datasette - LLM (@SimonW) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.

PART 2: Detailed by-Channel summaries and links

TheBloke ▷ #general (1460 messages🔥🔥🔥):

Exploring the Limits of Large Language Models: Users are discussing new open-source large language models capable of handling extremely long contexts, such as the Large World Model which claims to work coherently with contexts up to 1 million tokens. There are also mentions of the Cohere's aya model that supports 101 languages.
The Quest for Efficient Multimodal AIs: Conversations focus on multimodality in AI with references to models handling visual inputs and potential outputs, indicating significant advancements beyond text-based models. The jax-based tools required to run the models are causing installation hiccups for some users.
Models Under Scrutiny: The community is very active in testing released models, mentioning issues such as TUX dependency problems and a ValueError during setup, indicating some challenges in getting the advanced models running smoothly.
Users Share Knowledge: Experienced users offer insights and assistance on how to handle models and UIs for various tasks, including long-context quantization in existing frameworks like ExLlama v2. Discussions also touch on the possibility of banishing stop tokens to encourage longer continuous outputs.
Towards Intelligent Role-Playing: There is a discussion on finding the balance between RP-oriented models and smarter generalized ones, with mentions of a Mixtral variant (BagelMIsteryTour) that might better fulfill user requirements for intelligent and adaptable model behavior.

Links mentioned:

Context – share whatever you see with others in seconds: no description found
Lil Yachty Drake GIF - Lil Yachty Drake - Discover & Share GIFs: Click to view the GIF
Memory and new controls for ChatGPT: We’re testing the ability for ChatGPT to remember things you discuss to make future chats more helpful. You’re in control of ChatGPT’s memory.
brucethemoose/LargeWorldModel_LWM-Text-Chat-128K-55bpw · Hugging Face: no description found
YOLO: Real-Time Object Detection: no description found
Kooten/BagelMIsteryTour-v2-8x7B-5bpw-exl2 · Hugging Face: no description found
no title found: no description found
Think Bigger Skeletor GIF - Think Bigger Skeletor Masters Of The Universe Revelation - Discover & Share GIFs: Click to view the GIF
no title found: no description found
CausalLM/34b-beta · Hugging Face: no description found
SimSim93/CausalLM-34b-beta_q8 · Hugging Face: no description found
GitHub - jy-yuan/KIVI: KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache: KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache - jy-yuan/KIVI
The Verge: How NOT to Build a Computer: SPONSOR: Go to http://expressvpn.com/science to take back your Internet privacy TODAY and find out how you can get 3 months free.Link to the Verge's awful vi...
LWM/lwm/llama.py at main · LargeWorldModel/LWM: Contribute to LargeWorldModel/LWM development by creating an account on GitHub.
https://drive.google.com/drive/folders/1my-8wOIYXmfnlryDbwJ20_y6PFCqRfA-?usp=sharinghttps://drive.google.com/drive/folders/1my-8wOIYXmfnlryDbwJ20_y6PFCqRfA-?usp=sharingData Challenge - Aether 2024: In order to participate in the Data Challenge organised by Enigma as part of Aether, Please fill out this form Event Date & Time: Wednesday, February 14th - 2:30 pm Please double-check your detail...
LargeWorldModel (Large World Model): no description found
GitHub - lhao499/tux: Tools and Utils for Experiments (TUX): Tools and Utils for Experiments (TUX). Contribute to lhao499/tux development by creating an account on GitHub.
GitHub - itsme2417/PolyMind: A multimodal, function calling powered LLM webui.: A multimodal, function calling powered LLM webui. - GitHub - itsme2417/PolyMind: A multimodal, function calling powered LLM webui.
llama.cpp/examples/server at master · ggerganov/llama.cpp: Port of Facebook's LLaMA model in C/C++. Contribute to ggerganov/llama.cpp development by creating an account on GitHub.
Tweet from Cohere For AI (@CohereForAI): Today, we’re launching Aya, a new open-source, massively multilingual LLM & dataset to help support under-represented languages. Aya outperforms existing open-source models and covers 101 different la...
OpenAI Researcher Andrej Karpathy Departs: Andrej Karpathy, one of the founding members of OpenAI, has left the company, a spokesperson confirmed. Karpathy, a prominent artificial intelligence researcher, was developing a product he has descri...
CohereForAI/aya-101 · Hugging Face: no description found
GitHub - valine/NeuralFlow: Contribute to valine/NeuralFlow development by creating an account on GitHub.
ChatGPT but Uncensored and Free! | Oogabooga LLM Tutorial: ChatGPT but uncensored and free, well its now possible thanks to the open source AI community! In this video I show you how to set up the Oogabooga graphical...
LWM/lwm/vision_chat.py at main · LargeWorldModel/LWM: Contribute to LargeWorldModel/LWM development by creating an account on GitHub.
New emails reveal scientists believed COVID-19 was man-made: New emails have revealed scientists got together to discuss the origins of COVID, suspecting it was man-made, before deciding to tell the public it originate...
GitHub - itsme2417/PolyMind: A multimodal, function calling powered LLM webui.: A multimodal, function calling powered LLM webui. - GitHub - itsme2417/PolyMind: A multimodal, function calling powered LLM webui.
GitHub - LargeWorldModel/LWM: Contribute to LargeWorldModel/LWM development by creating an account on GitHub.
GitHub - acorn-io/rubra: AI Assistants, LLMs and tools made easy: AI Assistants, LLMs and tools made easy. Contribute to acorn-io/rubra development by creating an account on GitHub.
unalignment/weeeeee.0 · Hugging Face: no description found
unalignment/weeeeee.1 · Hugging Face: no description found
unalignment/weeeeee.2 · Hugging Face: no description found
CohereForAI/aya_dataset · Datasets at Hugging Face: no description found
CohereForAI/aya_collection · Datasets at Hugging Face: no description found

TheBloke ▷ #characters-roleplay-stories (154 messages🔥🔥):

Exploring Miqu Variants: @superking__ opened a discussion about the performance of the Miqu models, particularly after being re-quants from the original GGUFs (Google's Generative Unsupervised Feature extraction). @soufflespethuman mentioned MiquMaid-v2-70B, a variant specifically fine-tuned for ERP (Erotic Role Play), and provided sensitive content links to various versions on Hugging Face, which have been marked due to their nature.
Performance Gain with Better Hardware: @superking__ shared their experience on performance improvement from "painfully slow" to "almost usable" by upgrading their hardware to 12GB VRAM, which changed the given model's tokens per second from 0.7t/s to 2.1t/s.
Model Comparisons and Recommendations: In the context of roleplay and storytelling, users discussed various models. @spottyluck praised Nous Capybara Limarpv3 34B for its capabilities and provided a link to the model on Hugging Face. @wolfsauge shared a sketch about "The Continental" featuring Christopher Walken and @eqobaba inquired about appropriate models and settings for engaging in NSFW ERP, mentioning a specification of 48GB VRAM and RTX A600.
Discussing Model Output Improvement: @neriss suggested using a higher temperature or lower minimum probability to reduce repetition and improve creativity in AI model outputs. The conversation highlighted variations in temperature settings, with @dercheinz suggesting higher temperatures, while @neriss advised lower ones, each to counteract repetitive or uncreative responses from models.
Dataset Cleaning Challenges and Strategies: @c.gato and @potatooff exchanged thoughts on cleaning datasets manually, with @c.gato seeking advice on how to perform ngram analysis to prevent overtraining on specific ngrams. @mrdragonfox recommended using Python's pandas library for handling tabular or JSON data, sharing a gist for guidance.

Links mentioned:

Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
TheBloke/Nous-Capybara-limarpv3-34B-GGUF · Hugging Face: no description found
Models - Hugging Face: no description found
gist:f786564868357cde5894ef6e2c6f64cf: GitHub Gist: instantly share code, notes, and snippets.
The Continental: Anticipation - Saturday Night Live: Subscribe to SaturdayNightLive: http://j.mp/1bjU39dSEASON 26: http://j.mp/14GYJ6nThe night air is tinged with anticipation. It's time to meet The Continental...
Happy Fun Ball - SNL: Happy Fun Ball seems great until you hear all the potential side effects. [Season 16, 1991]#SNLSubscribe to SNL: https://goo.gl/tUsXwMStream Current Full Epi...
NeverSleep/MiquMaid-v2-70B · Hugging Face: no description found
NeverSleep/MiquMaid-v2-70B-GGUF · Hugging Face: no description found
NeverSleep/MiquMaid-v2-70B-DPO · Hugging Face: no description found
NeverSleep/MiquMaid-v2-70B-DPO-GGUF · Hugging Face: no description found

TheBloke ▷ #training-and-fine-tuning (43 messages🔥):

Understanding Finetuning Techniques: @starsupernova explained that Mixtral – Instruct was trained using SFT on an instruction dataset followed by Direct Preference Optimization (DPO) on a paired feedback dataset, as detailed on page 6 of their paper. DPO is described as an optimized form of RLHF/PPO finetuning.
Unsloth AI's Apply Chat Template: @starsupernova, likely the founder of Unsloth AI, highlighted the use of apply_chat_template instead of Alpaca for training an LLM on multi-turn conversation datasets. They also hinted at uploading a new notebook with all chat templates to simplify the process.
Augmentoolkit for Instruct-Tuning Datasets: In the conversation, @mr.userbox020 shared a link to a GitHub repository offering a toolkit to convert Compute and Books Into Instruct-Tuning Datasets. Although @starsupernova was not familiar with it, they suggested trying it out as it appeared promising.
Anticipation for Updated Training Resources: @avinierdc is awaiting an updated notebook from @starsupernova for fine-tuning Mistral on multi-turn conversation datasets. @starsupernova assured they would ping when it's available on the Unsloth's Discord server.
Unexplained Variation in Training Loss: @dreamgen reported observing a 2x higher training and evaluation loss when fine-tuning Mixtral 8x7b qlora compared to Mistral 7b, despite using the same dataset and similar hyperparameters, and inquired if others had seen something similar.

Links mentioned:

GitHub - e-p-armstrong/augmentoolkit: Convert Compute And Books Into Instruct-Tuning Datasets: Convert Compute And Books Into Instruct-Tuning Datasets - e-p-armstrong/augmentoolkit

TheBloke ▷ #coding (8 messages🔥):

Interest in Collaboration Sparked: @_b_i_s_c_u_i_t_s_ expressed interest in an unspecified topic, potentially around chatbot implementation, which was well received by @mr_pebble, finding it motivating to progress on implementing various chat methods.
Bridging JavaScript and Python: @spottyluck experimented with expanding the SillyTavern project to use a JavaScript-Python bridge, utilizing JSPyBridge to potentially adapt and enhance functionalities. They shared how it enabled testing of Microsoft's LLMLingua, despite some issues with prompt mangling.
Using Python Classes in JavaScript: @spottyluck provided code examples illustrating the ease of creating Python classes within JavaScript using JSPyBridge, along with an asynchronous function, compressPrompt, which demonstrates the interaction between languages to compress prompts.
Modifications to Handle Windows Errors and Devices: In their continued development, @spottyluck modified Intel's BigDL.llm transformer to support specific requirements, like cpu_embedding=True on Windows due to access violation errors, and dealing with model device allocation issues using model.to().
Compression Process Integration into Routing: @spottyluck explained integrating prompt compression into their web service by adding a flag to the /generate router post and using conditional logic to process the prompt through the bridge, demonstrating how Python can operate as if it were a JavaScript class.

Links mentioned:

GitHub - extremeheat/JSPyBridge: 🌉. Bridge to interoperate Node.js and Python: 🌉. Bridge to interoperate Node.js and Python . Contribute to extremeheat/JSPyBridge development by creating an account on GitHub.

LM Studio ▷ #💬-general (202 messages🔥🔥):

Struggles with Large Models: Users like @nonm3 encountered errors while loading large models in LM Studio due to insufficient RAM and VRAM, with suggestions to try smaller model quants. Others like @theoverseerjackbright faced issues with LM Studio not detecting GPUs correctly and crashing post-restart.
Software Seekers and Recommendations: @tvb1199 was in search of client software that can interact with LM Studio for RAG capabilities, and was pointed towards AGiXT, while @pierrunoyt and others discussed Nvidia's 'Chat with RTX' with RAG features as a potential game-changer.
Compatibility Inquiries: Several users such as @wizzy09 had trouble installing or opening LLM Studio on unsupported platforms like a 2014 MacBook Pro, with clarifications from users like @heyitsyorkie explaining that LMStudio does not work on Intel Macs.
Nvidia's Chat with RTX Triggers Interest: The community showed a keen interest in Nvidia's 'Chat with RTX'. Users like @hypocritipus were intrigued by the RAG feature, hoping for a similar easy-install, no-dependency RAG feature in LM Studio.
LM Studio Usage and Model Discussions: Users like @bigboimarkus expressed satisfaction with LM Studio for tasks such as proofreading, whereas @mr.stark_ queried about models that learn on the fly. Conversations included the functionality and integration with other tools like Ollama and Automatic1111.
General Community Assistance and Banter: Throughout, community members engaged in sharing tips, offering troubleshooting advice, including suggestions for alternatives or downgrading versions, and occasionally joked about AI capabilities such as predicting lottery numbers.

Links mentioned:

Stable Cascade - a Hugging Face Space by multimodalart: no description found
cmp-nct/Yi-VL-6B-GGUF at main: no description found
TheBloke/CodeLlama-70B-Instruct-GGUF at main: no description found
Chost Machine GIF - Chost Machine Ai - Discover & Share GIFs: Click to view the GIF
System prompt - Pastebin.com: Pastebin.com is the number one paste tool since 2002. Pastebin is a website where you can store text online for a set period of time.
NVIDIA Chat With RTX: Your Personalized AI Chatbot.
The unofficial LMStudio FAQ!: Welcome to the unofficial LMStudio FAQ. Here you will find answers to the most commonly asked questions that we get on the LMStudio Discord. (This FAQ is community managed). LMStudio is a free closed...
The unofficial LMStudio FAQ!: Welcome to the unofficial LMStudio FAQ. Here you will find answers to the most commonly asked questions that we get on the LMStudio Discord. (This FAQ is community managed). LMStudio is a free closed...

LM Studio ▷ #🤖-models-discussion-chat (75 messages🔥🔥):

MedAlpaca for Medical LLMs: User @heyitsyorkie suggested medAlpaca, a model fine-tuned for medical question answering, for @pepito92i's project on LLMs in the medical field.
Phi-2 Model Discussions: @.dochoss333 inquired about the absence of the official "microsoft/phi-2" model in LM Studio, and @heyitsyorkie clarified that it's not a GGUF model and thus won't show up. @hugocapstagiaire_54167 mentioned user TheBloke might have transformed it into a .gguf for usability with llama.cpp.
LLama.cpp and Model Support: @jedd1 puzzled over why some models wouldn't load, and @heyitsyorkie pointed out that the Yi-VL models are unsupported in the current build of llama.cpp, requiring an update for compatibility.
LM Studio Assistant Functionality Inquiry: User @edu0835 inquired about the possibility of creating an assistant in LM Studio with the ability to utilize PDFs or books for a medical assistant application, without a direct response provided at this time.
Model Performance Comparisons Engage Community: Users like @kujila and @heyitsyorkie engaged in comparisons between different language models, with discussions on model specificity, ethical behavior of AI models, and suggestions to try out models like Deepseek Coder Ins 33B.

Links mentioned:

Nexesenex/Senku-70b-iMat.GGUF at main: no description found
Hi Everybody Simpsons GIF - Hi Everybody Simpsons Wave - Discover & Share GIFs: Click to view the GIF
TheBloke/medicine-chat-GGUF · Hugging Face: no description found
wolfram/miquliz-120b-v2.0-GGUF · Hugging Face: no description found
GitHub - kbressem/medAlpaca: LLM finetuned for medical question answering: LLM finetuned for medical question answering. Contribute to kbressem/medAlpaca development by creating an account on GitHub.
The new Yi-VL-6B and 34B multimodals ( inferenced on llama.cpp, results here ) · ggerganov/llama.cpp · Discussion #5092: Well, their benchmarks claim they are almost at GPT4V level, beating everything else by a mile. They also claim that CovVLM is one of the worst (and it's actually the best next to GPT4, by far) On...

LM Studio ▷ #🎛-hardware-discussion (140 messages🔥🔥):

NVLink and Memory Cycling Queries: @luxaplexx asked if GPUs were NVLinked and how memory cycles through in a multi-GPU setup. The consensus, including from @heyitsyorkie, is that they likely aren't NVLinked due to potential CUDA issues, especially with older cards like the 960. Users are considering whether different generations of NVIDIA GPUs like the 1080 and 1060 6g can work together effectively.
Discussions on Upgrading to Better GPUs: Several users, including @crsongbirb and @heyitsyorkie, discussed upgrading their GPUs for improved performance in LLM tasks, with a suggestion to look at the RTX 3060 12GB as a viable option for running LLMs locally.
Risks and Rewards of Overclocking: In a discussion initiated by @alastair9776 about overclocking for better performance, @rugg0064 and @crsongbirb noted that overclocking VRAM/RAM can lead to a notable increase in token generation speed, although caution is advised due to potential hardware stress.
Combining GPUs and Threadripper Dreams: Conversation ensued about the feasibility and costs of using multiple high-performance GPUs, with users like @nink1 and @quickdive. debating if a beefy CPU is necessary when having multiple powerful GPUs, and the logistics of housing such a setup.
CUDA on AMD and Other Hardware Convos: @666siegfried666 shared news about the ZLUDA project allowing CUDA apps to run on AMD hardware and this sparked a brief discussion on the relevance and future potential of such a feature. Users such as @addressofreturnaddress and @joelthebuilder also discussed their own rig setups and potential upgrades, highlighting personal preferences and value assessments.

Links mentioned:

Doja Cat GIF - Doja Cat Star - Discover & Share GIFs: Click to view the GIF
Brexit British GIF - Brexit British Pirate - Discover & Share GIFs: Click to view the GIF
ATOM Echo Smart Speaker Development Kit: ATOM ECHO is a programmable smart speaker.This eps32 AIoT Development Kit has a microphone and speaker for AI voice interaction light and small. It can be access AWS, Baidu, ESPHome and Home Assistant...
Unmodified NVIDIA CUDA apps can now run on AMD GPUs thanks to ZLUDA - VideoCardz.com: ZLUDA enables CUDA apps on ROCm platform, no code changes required AMD-backed ZLUDA project can now enable code written in NVIDIA CUDA to run natively on AMD hardware. AMD has reportedly taken over t...
Lian-Li O11 Dynamic XL ROG certificated -Black color Tempered Glass: Lian Li O11 Dynamic XL ROG certificated, Front and Left Tempered Glass, E-ATX, ATX Full Tower Gaming Computer Case - Black

LM Studio ▷ #🧪-beta-releases-chat (21 messages🔥):

Awaiting the Next Update for IQ3_XSS Support: @n8programs inquired about IQ3_XSS support in the latest release, to which @yagilb responded that it will be included in the next update.
Elevation of 1bit Quantization on the Horizon: @drawless111 shared excitement about upcoming 1.5 bit quantization, posting a GitHub pull request link indicating progress. This elicits reactions with @heyitsyorkie anticipating a sweet next beta with the new quant sizes.
Model Benchmarking Induces Awe: @drawless111 expressed amazement at the latest benchmarks for 1bit quantization, stating “70B model on 16 GB card. WOOF." and pointing out a '70B' model posted on Hugging Face that can offload on VRAM effectively.
Preparations for Incompatible Model Downloads: Users, including @epicureus, are advised to download models like IQ3_XSS even if they're not supported yet, with @fabguy humorously suggesting "Save the model, save the world!"
Hugging Face Hub Features Multiple New Models: @drawless111 shared an update, revealing the availability of 5 IQ1 models on Hugging Face that work with various VRAM sizes, nonchalantly noting an increase to 10 by the end of the conversation.

Links mentioned:

Nexesenex/NousResearch_Yarn-Llama-2-70b-32k-iMat.GGUF · Hugging Face: no description found
Claire Bennet Heroes GIF - Claire Bennet Heroes Smile - Discover & Share GIFs: Click to view the GIF
1.5 bit quantization by ikawrakow · Pull Request #5453 · ggerganov/llama.cpp: This draft PR is a WIP that demonstrates 1.5 bits-per-weight (bpw) quantization. Only CUDA works, there is no implementation for the other supported back-ends. CUDA, AVX2 and ARM_NEON are implement...

LM Studio ▷ #avx-beta (2 messages):

No AVX2, No Cry: @rafalsebastian expressed concerns about not being able to run LMstudio on CPUs with only AVX (version one) after getting the message that their processor doesn't support AVX2. They wondered if they should switch machines for running local LLMs.
LM Studio Beta for the Rescue: @heyitsyorkie responded with a solution, mentioning that LM Studio can indeed run on CPUs with only AVX support by downloading the 0.2.10 AVX beta release for Windows. They also recommended upgrading to a CPU with AVX2 for optimal results and provided a link to beta releases and terms of use.

Links mentioned:

LM Studio Beta Releases: no description found

LM Studio ▷ #crew-ai (3 messages):

Looking for Dual Model Deployment Tips: @alluring_seahorse_04960 wonders how to run two models on the same machine without facing repetition errors. The user mentions using a Conda environment on Ubuntu and avoids VMs for their slowness.
Humorous Clarification Request on Repetition: In response to @alluring_seahorse_04960, @wolfspyre jokes about the nature of the repetition errors, questioning whether they pertain to looping outputs or tasking issues within worker processes.

LAION ▷ #general (361 messages🔥🔥):

Magvit V2 Reproduction Inquiries: @.lostneko sought technical guidance for reproducing Magvit V2. Discussions circled around the ideal datasets and parameters for video compression and understanding, with @chad_in_the_house mentioning experiments on the lfq side of Magvit2.
Mysterious Buzz around Magvit: @pseudoterminalx and others in the chat noticed sudden interest in MAGVIT, speculating about a recent influencer mention given the two mentions within a short time frame.
Stable Cascade Discussions Heat Up: Focus shifted to Stability AI's Stable Cascade model, with dialogues highlighting its hefty VRAM requirements, misleading inference time graphs, and concerns about the model being poorly optimized and full of bugs. @pseudoterminalx shared examples of its capabilities, including issues with text clarity in image outputs.
Evaluating AI Models and Copyright Concerns: Conversations touched on the usage and legality of AI-generated images. Users @vrus0188 and @kenjiqq debated AI image model copyrights, commercial use, and the implications of research-only model licenses.
Hardware and Performance Perspectives: A technical dialogue ensued over Stable Cascade's heavy VRAM use and optimization problems, as @pseudoterminalx reported issues like inability to run models in float16 and @kenjiqq provided details about inference time on consumer GPUs like the 3090.

Links mentioned:

Stable Cascade - a Hugging Face Space by multimodalart: no description found
Court Dismisses Authors’ Copyright Infringement Claims Against OpenAI * TorrentFreak: no description found
Stable Cascade のご紹介 — Stability AI Japan — Stability AI Japan: Stable Cascade の研究プレビューが開始されました。この革新的なテキストから画像へのモデルは、品質、柔軟性、微調整、効率性の新しいベンチマークを設定し、ハードウェアの障壁をさらに排除することに重点を置いた、興味深い3段階のアプローチを導入しています。
Hey Hindi GIF - Hey Hindi Bollywood - Discover & Share GIFs: Click to view the GIF
Don't ask to ask, just ask: no description found
GitHub - Stability-AI/StableCascade: Contribute to Stability-AI/StableCascade development by creating an account on GitHub.
Crypto Kids Poster | 24posters | Hip Hop & Street Art Prints: Transform your walls with our viral new Crypto Kids Poster. Inspired by street-wear & hip hop culture, enjoy artwork designed to bring you bedroom to life. Fast shipping times (3-5 days) 10,000+ h...

LAION ▷ #research (48 messages🔥):

Discussion on Impact of Adult Content on AI: @vrus0188 and others discuss the historical contributions of adult content to advancing technology, juxtaposing it with AI developments. Some users like @twoabove acknowledge the pattern of adult industries driving tech advancements, while others like @SegmentationFault doubt if the focus on adult content leads to meaningful progress in AI.
Concern Over Explicit AI-Generated Content: @thejonasbrothers shares a news article highlighting the misuse of AI in creating non-consensual pornography, noting the challenges it poses and its high visibility. This leads to a discussion on the broader implications and controversies surrounding AI's use in adult content.
Observations on the Pornography Market and AI: Users like @chad_in_the_house and @freon discuss the profitability and market saturation of NSFW content, contemplating the economical and ethical risks involved in this space.
Debates Over the Merits of AI-Powered Erotic Roleplay: @SegmentationFault expresses frustration over the preference for low-effort erotic content in AI communities, arguing that this hinders meaningful developments in AI models. Others like @mfcool and @.undeleted echo these sentiments, criticizing the quality stagnation in AI-generated adult imagery.
Technical Discussion on AI Image Quality: @drhead delves into technical aspects of AI-generated images, mentioning the NovelAI model and discussing the viability and impact of VAE encoder training for improved image generation. There is a communal reflection on the standards of "photorealism" within the community and how they could be improved.

Links mentioned:

Reddit - Dive into anything: no description found
Reddit - Dive into anything: no description found
AI brings deepfake pornography to the masses, as Canadian laws play catch-up: Underage Canadian high school girls are targeted using AI to create fake explicit photos that spread online. Google searches bring up multiple free websites capable of "undressing" women in ...

Eleuther ▷ #general (179 messages🔥🔥):

Checksums for The Pile Data Located: @paganpegasus provided @hailey_schoelkopf with the checksums for The Pile zst shards, linking both the Discord pins and the EleutherAI's hashes.
Tools to Determine Image Content: @everlasting_gomjabbar inquired about tools to discern if an image is of an object/location versus 'nothing' like a blurry shot. @paganpegasus described the complexity of defining "nothing" in images, while @rallio. recommended using models like OwlViT or CLIP.
Manuscript Review and Editing in Progress: @wonkothesensible, through a series of messages, provided meticulous feedback on a paper draft provisionally titled "Don't think about the paper", focusing on clarifying language and grammar. @hailey_schoelkopf expressed gratitude and indicated credits to the EleutherAI Discord in the paper's acknowledgements.
Cloud Resources for NLP Classification Discussed: In response to @pxxxl seeking advice on cloud resources for training NLP classification models, @ad8e recommended GCP and Colab, with various participants chiming in about the costs and features of various platforms like runpod and vast.ai.
Inquiries About EleutherAI Computing Resources: User @vidava asked about the guidelines and requirements for accessing EleutherAI's computational resources for a semi-custom LLM project featuring architectural adjustments and fine-tuning adapters. @stellaathena indicated openness to collaboration but highlighted the need for clarity on the research agenda and proposed a collaborative value proposition.
Semantic Scholar Paper-Author Linking Mechanism: Regarding whether Semantic Scholar automatically links Arxiv papers to authors, _inox clarified that the process is automatic but allows for manual intervention or suggested changes if errors occur.

Links mentioned:

Overleaf, Online LaTeX Editor: An online LaTeX editor that’s easy to use. No installation, real-time collaboration, version control, hundreds of LaTeX templates, and more.
Research Paper Release Checklist : no description found
lora_example.py: lora_example.py. GitHub Gist: instantly share code, notes, and snippets.
Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
Hashes — EleutherAI: no description found

Eleuther ▷ #research (208 messages🔥🔥):

Fractal Analysis of Neural Network Hyperparameters: @jbustter shared visualizations of fractals generated from neural network hyperparameters, with red indicating diverging training and blue for converging. Jascha Sohl-Dickstein's blog post showcases the concept, correlating fractal patterns with the learning rates of network layers and the network's weight offset.
Discussing Convergence and Divergence in Training: The conversation, involving users like @Hawk, @genetyx8, and @mrgonao, discussed the means for determining if neural network training is converging or diverging, debating the presence of "diverging to infinity" and the nature of boundaries within fractal visualizations, with suggestions that NaNs may denote divergence.
Active Learning and Data Presentation Order in ML: @rybchuk inquired about research on models choosing the order of data presentation, leading to a discussion about active learning. @thatspysaspy mentioned the subfield's existence, noting its lack of success, and @catboy_slim_ added that it could halve training requirements by using smaller models to filter data for larger models' training.
Leveraging Unsupervised Data in Encoder-Decoder Models: The question of how to utilize large unsupervised datasets effectively in encoder-decoder models for tasks such as audio to text was brought up by @loubb. Suggestions and discussions ranged from training components separately to integrating cross-attention during pre-training.
Release of an NLP Robustness Paper and Test-Time Augmentation: @millander announced the publication of their lead author paper on improving text classifiers' robustness through test-time augmentation (TTA) using large language models. They thanked the community for support and shared the arxiv link to their work.

Links mentioned:

Neural network training makes beautiful fractals: This blog is intended to be a place to share ideas and results that are too weird, incomplete, or off-topic to turn into an academic paper, but that I think may be important. Let me know what you thin...
A Poster for Neural Circuit Diagrams: As some of you might know, I have been working on neural circuit diagrams over the past year or so. These diagrams solve a lingering challenge in deep learning research – clearly and accurately commun...
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts: State Space Models (SSMs) have become serious contenders in the field of sequential modeling, challenging the dominance of Transformers. At the same time, Mixture of Experts (MoE) has significantly im...
Scaling Laws for Fine-Grained Mixture of Experts: Mixture of Experts (MoE) models have emerged as a primary solution for reducing the computational cost of Large Language Models. In this work, we analyze their scaling properties, incorporating an exp...
Model Editing with Canonical Examples: We introduce model editing with canonical examples, a setting in which (1) a single learning example is provided per desired behavior, (2) evaluation is performed exclusively out-of-distribution, and ...
An Exponential Learning Rate Schedule for Deep Learning: Intriguing empirical evidence exists that deep learning can work well with exoticschedules for varying the learning rate. This paper suggests that the phenomenon may be due to Batch Normalization or B...
Nonlinear computation in deep linear networks: no description found
Feedback Loops With Language Models Drive In-Context Reward Hacking: Language models influence the external world: they query APIs that read and write to web pages, generate content that shapes human behavior, and run system commands as autonomous agents. These interac...
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks: State-space models (SSMs), such as Mamba Gu & Dao (2034), have been proposed as alternatives to Transformer networks in language modeling, by incorporating gating, convolutions, and input-dependen...
Suppressing Pink Elephants with Direct Principle Feedback: Existing methods for controlling language models, such as RLHF and Constitutional AI, involve determining which LLM behaviors are desirable and training them into a language model. However, in many ca...
Improving Black-box Robustness with In-Context Rewriting: Machine learning models often excel on in-distribution (ID) data but struggle with unseen out-of-distribution (OOD) inputs. Most techniques for improving OOD robustness are not applicable to settings ...
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models: Visually-conditioned language models (VLMs) have seen growing adoption in applications such as visual dialogue, scene understanding, and robotic task planning; adoption that has fueled a wealth of new...
Tweet from Nature Reviews Physics (@NatRevPhys): Perspective: Generative learning for nonlinear dynamics By @wgilpin0 @TexasScience https://rdcu.be/dysiB
Tweet from Hannes Stärk (@HannesStaerk): Diffusion models are dead - long live joint conditional flow matching! 🙃 Tomorrow @AlexanderTong7 presents his "Improving and generalizing flow-based generative models with minibatch optimal tran...
A weight matrix in a neural network tries to break symmetry and fails.: We initialize a neural network so that the weight matrices can be nearly factorized as the Kronecker product of a random matrix and the matrix where all of t...
Mixture of Tokens: Efficient LLMs through Cross-Example Aggregation: Despite the promise of Mixture of Experts (MoE) models in increasing parameter counts of Transformer models while maintaining training and inference costs, their application carries notable drawbacks....
llm-random/research/conditional/moe_layers/expert_choice.py at ad41b940c3fbf004a1230c1686502fd3a3a79032 · llm-random/llm-random: Contribute to llm-random/llm-random development by creating an account on GitHub.
An Emulator for Fine-Tuning Large Language Models using Small Language Models: Widely used language models (LMs) are typically built by scaling up a two-stage training pipeline: a pre-training stage that uses a very large, diverse dataset of text and a fine-tuning (sometimes, &#...
MASS: Masked Sequence to Sequence Pre-training for Language Generation: Pre-training and fine-tuning, e.g., BERT, have achieved great success in language understanding by transferring knowledge from rich-resource pre-training task to the low/zero-resource downstream tasks...
Meta- (out-of-context) learning in neural networks: Brown et al. (2020) famously introduced the phenomenon of in-context learning in large language models (LLMs). We establish the existence of a phenomenon we call meta-out-of-context learning (meta-OCL...
Secret Collusion Among Generative AI Agents: Recent capability increases in large language models (LLMs) open up applications in which teams of communicating generative AI agents solve joint tasks. This poses privacy and security challenges conc...
Portal: Home of the TechBio community. Tune into our weekly reading groups (M2D2, LoGG, CARE), read community blogs, and join the discussion forum.
Generative learning for nonlinear dynamics | Nature Reviews Physics: no description found
Policy Improvement using Language Feedback Models: We introduce Language Feedback Models (LFMs) that identify desirable behaviour - actions that help achieve tasks specified in the instruction - for imitation learning in instruction following. To trai...
To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis: Recent research has highlighted the importance of dataset size in scaling language models. However, large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text...
UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units - Meta Research: We present a novel two-pass direct S2ST architecture, UnitY, which first generates textual representations and predicts discrete acoustic units subsequently.

Eleuther ▷ #interpretability-general (1 messages):

In Search of Interpretability Guidance: @jaimerv reached out to the channel asking for a more current overview of approaches to interpretability than the paper they referenced on Representation Engineering. They are seeking assistance for potentially better or newer resources on the topic.

Eleuther ▷ #lm-thunderdome (4 messages):

Contributors Wanted for Hallucinations Leaderboard: @pminervini shared a call to action for contributions to the hallucinations leaderboard, adding that there are several new hallucination-oriented tasks to work on within the Harness leaderboard space.
Enthusiastic Response to Collaboration: Following the announcement, @baber_ expressed interest and asked what specific help was needed.
Call for Specific Assistance: In response, @pminervini mentioned they need help with task definitions, proposing/adding new datasets and metrics, and assistance in determining which results to re-compute following recent updates to the harness.

Eleuther ▷ #gpt-neox-dev (4 messages):

Potential Misalignment in Pythia Deduped Data: @pietrolesci has raised concerns about a possible misalignment between training data batches and checkpoints specifically for the 2.8b size Pythia deduped suite. Other models, including the smaller versions and 6.9b, seem well-aligned.
Response to Data Alignment Query: @hailey_schoelkopf acknowledged @pietrolesci's query about the alignment issue and stated they will follow up on this matter.
Interest in Pythia Research and Suggestion for Publication: @stellaathena expressed excitement about the potential for a blog post or workshop paper demonstrating the reliability of Pythia, which they would extensively cite.
Openness to Writing About Pythia: In response to @stellaathena, @pietrolesci appreciated the suggestion about creating a post regarding their findings on Pythia, considering it a good short project post-ACL deadline.

LlamaIndex ▷ #announcements (2 messages):

LlamaIndex v0.10 Released: @jerryjliu0 announced the release of LlamaIndex v0.10, which is the most significant update to date, featuring a new llama-index-core package and splitting integrations/templates into separate PyPi packages. The llamahub.ai is also being revamped, they've deprecated ServiceContext for better developer experience, and encourage the community to explore the blog post and documentation for detailed info on migration and contributing.
Celebrating Team Achievement: Big thanks were given to <@334536717648265216> and <@908844510807728140> for leading the effort on the latest LlamaIndex update, which is a step towards making it a production-ready data framework.
Tweet about LlamaIndex v0.10 Launch: LlamaIndex shared a tweet highlighting key updates in LlamaIndex v0.10, including the creation of hundreds of separate PyPi packages, the refactoring of LlamaHub, and the deprecation of ServiceContext.
Webinar Announcement with No-Code RAG Tutorial: Flowise's co-founder, Henry Heng, will feature in a LlamaIndex Webinar to demonstrate building no-code Retrieve and Generate (RAG) applications using their new integration with LlamaIndex.TS. The webinar is scheduled for Friday 9am PT and interested individuals can register here.

Links mentioned:

LlamaIndex Webinar: Build No-Code RAG · Zoom · Luma: Flowise is one of the leading no-code tools for building LLM-powered workflows. Instead of learning how to code in a framework / programming language, users can drag and drop the components...
Tweet from LlamaIndex 🦙 (@llama_index): 💫 LlamaIndex v0.10 💫 - our biggest open-source release to date, and a massive step towards production-readiness. 🚀 ✅ Create a core package, split off every integration/template into separate PyPi ...
LlamaIndex v0.10: Today we’re excited to launch LlamaIndex v0.10.0. It is by far the biggest update to our Python package to date (see this gargantuan PR)…

LlamaIndex ▷ #blog (5 messages):

LlamaIndex Hits v0.10 Milestone: LlamaIndex announces its biggest open-source release, v0.10, signaling a shift towards production-readiness. A core package has been created and hundreds of integrations split off into separate PyPi packages as highlighted in their Twitter post.
Tutorial on Multimodal Apps with LlamaIndex: @ollama and LlamaIndex co-present a tutorial for building context-augmented multimodal applications on a MacBook, including smart receipt reading and product image augmentation, shared via this tweet.
DanswerAI Enhances Enterprise with LlamaIndex: DanswerAI leverages @llama_index to offer ChatGPT functionalities over enterprise knowledge bases, integrating with common workplace tools such as GDrive, Slack, and Jira to boost team efficiency as announced in the Twitter announcement.
Upcoming No-Code RAG Webinar with FlowiseAI: @llama_index teams up with @FlowiseAI for a webinar on building no-code RAG (Retrieval-Augmented Generation) workflows with LlamaIndex.TS and Flowise, details in their recent tweet.
Define Research Workflow with RAG-powered Agent: A notebook by @quantoceanli outlines a process to establish a scientific research workflow, harnessing LlamaIndex to operate with resources like ArXiv and Wikipedia for an innovative RAG-powered agent, showcased in this tweet.

LlamaIndex ▷ #general (303 messages🔥🔥):

LlamaIndex Import Troubles: Users like @ddashed, @bhrdwj, @lhc1921, and @cheesyfishes discuss issues with the latest LlamaIndex update. Users were advised to start with a fresh venv or container and pointed towards a migration guide and package registry for reference.
Complex Document Filtering Challenges: User @_shrigmamale sought assistance in filtering large directories of complex documents based on keywords, dates, and file types. Another user, @qingsongyao, suggested traditional indexing techniques over expensive LLMs like GPT-4 for dynamic file filtering.
Efficient Handling of Multiple Document Sources: Users like @nvmm_, @whitefang_jr, and @.saitej engaged in discussions about handling and merging private user-uploaded documents with public indexed documents using LlamaIndex and the potential for creating multiple agents for individual documents.
Configuring Chunk Sizes and Testing Performance: @sgaseretto asked about where to specify chunk_size now that ServiceContext is deprecated in favor of Settings. @cheesyfishes provided the new way to configure chunk size globally or by passing the node parser/text splitter into the index.
Handling Changes with Chat Memory Buffer: @benzen.vn inquired about experiencing non-relevant responses when using a ChatMemoryBuffer. @whitefang_jr suggested that off-topic conversations might degrade the relevancy of queries and pointed to parts of the LlamaIndex source code for explanation.

Links mentioned:

Notion – The all-in-one workspace for your notes, tasks, wikis, and databases.: A new tool that blends your everyday work apps into one. It's the all-in-one workspace for you and your team
Response Modes - LlamaIndex 🦙 v0.10.3: no description found
Notion – The all-in-one workspace for your notes, tasks, wikis, and databases.: A new tool that blends your everyday work apps into one. It's the all-in-one workspace for you and your team
Google Colaboratory: no description found
Build a chatbot with custom data sources, powered by LlamaIndex: Augment any LLM with your own data in 43 lines of code!
Router Query Engine - LlamaIndex 🦙 v0.10.3: no description found
Elasticsearch Vector Store - LlamaIndex 🦙 v0.10.3: no description found
llama_index/llama-index-legacy/llama_index/legacy/vector_stores/mongodb.py at main · run-llama/llama_index: LlamaIndex (formerly GPT Index) is a data framework for your LLM applications - run-llama/llama_index
llama_index/llama-index-core/llama_index/core/chat_engine/condense_question.py at 3823389e3f91cab47b72e2cc2814826db9f98e32 · run-llama/llama_index: LlamaIndex (formerly GPT Index) is a data framework for your LLM applications - run-llama/llama_index
Usage Pattern - LlamaIndex 🦙 v0.10.3: no description found
Node Postprocessor Modules - LlamaIndex 🦙 v0.10.3: no description found
llama_index/llama-index-core/llama_index/core/indices/base.py at 5d557cb2fe48b90e4056ecae25b9371681752a3c · run-llama/llama_index: LlamaIndex (formerly GPT Index) is a data framework for your LLM applications - run-llama/llama_index
Configuring Settings - LlamaIndex 🦙 v0.10.3: no description found
Migrating from ServiceContext to Settings - LlamaIndex 🦙 v0.10.3: no description found

LlamaIndex ▷ #ai-discussion (1 messages):

Super-Easy Full Stack RAG App Building Guide Released: @kerinin has shared an article about building a Retrieval-Augmented Generation (RAG) application using Dewy, a new open-source knowledge base. The guide entails using NextJS, OpenAI API, and Dewy to create a RAG application that improves the accuracy of language model responses by grounding them in specific, reliable information. Read the guide.

Links mentioned:

Building a RAG chatbot with NextJS, OpenAI & Dewy | Dewy: This guide will walk you through building a RAG application using NextJS for the web framework, the OpenAI API for the language model, and Dewy as your knowledge base.

HuggingFace ▷ #announcements (1 messages):

Hugging Face Launches Message API: 🚀 Hugging Face introduces a new Message API compatible with OpenAI, enabling the use of OpenAI client libraries or third-party tools directly with Hugging Face Inference Endpoints and Text Generation Inference. Learn more from their announcement here.
New Open Source Releases and Features: 🤗 Datatrove goes live on PyPI, Gradio updates to 4.18.0 with an improved ChatInterface and more, and there's a launch of Remove Background Web for in-browser background removal. Additionally, Nanotron for 3D parallelism training and new features in Hugging Face Competitions were announced. Accelerate 0.27.0 was released, boasting a PyTorch-native pipeline-parallel inference framework.
Product Innovations at Hugging Face: HF introduces LoRA Studio with a dedicated UI on the Hub, incorporates 2FA support, releases a Mask Generation task page, and announces the arrival of models trained with Axolotl.
Partnerships and Learning Resources Expansion: Hugging Face announces a partnership with Codecademy for a new free AI course on transformers and publishes a blog post about SegMoE, which enables model merging on text-to-image models.
Optimizing Model Performance: There's a technique to load pre-trained PyTorch models approximately 2x faster using Accelerate, detailed in a user guide by @RisingSayak.

Links mentioned:

Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
Releases · gradio-app/gradio: Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! - gradio-app/gradio
Tweet from Nouamane Tazi (@Nouamanetazi): Super happy to see https://github.com/huggingface/nanotron released today! ❤️ It's been a fun and insightful ride building a library for 3D parallelism training from scratch, and it's crazy t...
Tweet from Zach Mueller (@TheZachMueller): Today is an extra-special release of @huggingface Accelerate! Among other features, this latest version (with collaboration from @PyTorch) integrates a PyTorch-native pipeline-parallel inference fram...
Tweet from Omar Sanseviero (@osanseviero): Over 300 models have been trained with axolotl and shared on the Hub! It's also the cutest icon ever. https://huggingface.co/models?other=axolotl&sort=trending
Tweet from Sayak Paul (@RisingSayak): Why should LLM kids have all the fun from model merging? Why not us, the diffusion kids? Friends from @_segmind open-sourced SegMoE to reduce this gap 🔥 Do MoE style merging on text-to-image model...
Tweet from Sayak Paul (@RisingSayak): 🤗 Accelerate power-user chronicles 👨‍🏫 Here, I show you how to load a pre-trained PyTorch model ~2x faster with Accelerate. The comments in the code snippet should be self-explanatory. But if yo...

HuggingFace ▷ #general (192 messages🔥🔥):

<ul>
  <li><strong>Search Engine Development Struggles</strong>: <code>@spidy___</code> discussed challenges in developing a search engine and extracting keywords with <code>@vipitis</code>, <code>@cubietom</code>, and others. The conversation explored the limitations of NER and alternatives like keyword extraction, TF-IDF, BM25, and the use of spaCy for Part of Speech tagging.</li>
  <li><strong>Hosting and Inferencing Challenges</strong>: Users like <code>@sullynaj</code> and <code>@ram1428</code> enquired about hosting custom models and whether serverless inferencing is available, with pointers to server-less or affordable solutions discussed.</li>
  <li><strong>Tackling Model Scale</strong>: Conversations with users like <code>@zorian_93363</code> and <code>@xacer_</code> revolved around the feasibility and usefulness of running very large models (100B+ parameters) on typical "open source enthusiast" hardware.</li>
  <li><strong>Valentine's Day Vibes</strong>: <code>@not_lain</code> spread love and joy on Valentine's Day, encouraging the community to hug their loved ones.</li>
  <li><strong>Discussion on Running Models Locally</strong>: <code>@aj_0003</code> asked about running machine learning models locally while <code>@pierrunoyt</code> discussed using Hugging Face to clone and run a model.</li>
</ul>

Links mentioned:

Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
Stable Cascade - a Hugging Face Space by multimodalart: no description found
Custom architectures with HuggingFace 🤗: no description found
lamm-mit/x-lora · Hugging Face: no description found
jinaai/jina-embeddings-v2-base-code · Hugging Face: no description found
Norm/nougat-latex-base · Hugging Face: no description found
NVIDIA Chat With RTX: Your Personalized AI Chatbot.
Models - Hugging Face: no description found
Hugtrip GIF - Hugtrip - Discover & Share GIFs: Click to view the GIF
Hands-on - Hugging Face Deep RL Course: no description found
Linguistic Features · spaCy Usage Documentation: spaCy is a free open-source library for Natural Language Processing in Python. It features NER, POS tagging, dependency parsing, word vectors and more.
Linguistic Features · spaCy Usage Documentation: spaCy is a free open-source library for Natural Language Processing in Python. It features NER, POS tagging, dependency parsing, word vectors and more.
Hugging Face status ): no description found

HuggingFace ▷ #today-im-learning (9 messages🔥):

Simple Chatbot Development Blueprint: @wilbert.comicho is looking to create a simple chatbot to gather five specific details from a user and send them via email. They are seeking a template to handle database querying, user prompting/saving data and calling an API for email sending.
AutoGen as a Starting Point: @dwb7737 suggested using Microsoft's AutoGen for chatbot development and pointed to GitHub for detailed use cases and Jupyter Notebooks. Additionally, highlighted that OpenAI is preferable to open-source LLMs when utilizing AutoGen.
Starting Small with AutoGen Studio: In a follow-up, @dwb7737 recommends getting to grips with the basics before diving into AutoGen Studio due to possible behavioral discrepancies and bugs, advocating for an understanding of the underlying processes. They provided a link to AutoGen Studio samples.
wilbert.comicho: Confirms they will be checking out the recommended resources.
Video Guide for Ollama Models: @dwb7737 shared a YouTube video as an excellent resource for learning how to use Ollama open source models in conjunction with LangChain and Autogen.
Google Sheets Merge Pitfalls: @lunarflu is engaged in merging two Google Sheets and cautions the importance of handling duplicate records and maintaining unique records to prevent issues.
Creating Transformers with FP8: @neuralink has progressed to mastering 99% of doremi reproduction and have advanced their training with end-to-end FP8 in 3D parallelism.
Switching from AI to Academia: @sardarkhan_ shares their shift from reading about diffusors and transformers to focusing on their upcoming mid-semester exams.

Links mentioned:

autogen/samples/apps/autogen-studio at main · microsoft/autogen: Enable Next-Gen Large Language Model Applications. Join our Discord: https://discord.gg/pAbnFJrkgZ - microsoft/autogen
autogen/notebook at main · microsoft/autogen: Enable Next-Gen Large Language Model Applications. Join our Discord: https://discord.gg/pAbnFJrkgZ - microsoft/autogen
Ollama - Libraries, Vision and Updates: Ollama Libraries: https://ollama.com/blog/python-javascript-librariesOllama Vision models: https://ollama.com/blog/vision-modelsOllama OpenAI API: https://ol...

HuggingFace ▷ #cool-finds (8 messages🔥):

Back to ML After a Hiatus: @charlweed is diving back into Machine Learning by working on a GIMP 3.0 plugin that connects to Automatic1111, currently facing challenges with posting image data for Image2Image functionality via API.
Digging into the Dirt for Energy: @Gordo Stoli shared a research study on Soil Battery, a potential advancement in energy technology.
MoE Security Vulnerabilities Exposed: @osanseviero introduced a paper demonstrating how Mixture of Experts (MoE) models are susceptible to adversarial attacks affecting the outputs of benign queries.
Understanding MoE Risks and Mitigations: @osanseviero also wrote detailed notes on potential mitigation strategies for the vulnerabilities described in the DeepMind paper, suggesting batch order randomization among other methods, available here.
Questions about MoE's Future Stability: @meatfucker highlighted the potential future threat of the reported MoE attack strategy and considered the implications for systems using large batches, which may inadvertently affect output quality.

Links mentioned:

[@osanseviero on Hugging Face: "Mixture of experts: beware 🛡️⚔️

New paper by DeepMind: Buffer Overflow in…"](https://huggingface.co/posts/osanseviero/980907000007376): no description found

Paper page - Buffer Overflow in Mixture of Experts: no description found

HuggingFace ▷ #i-made-this (10 messages🔥):

Quiz Generation Anticipation: @lunarflu suggested the addition of a loading screen or bar for the quiz generation process, mentioning an issue with just waiting for the quiz to appear without any indication.
Automated Image Tagging Model Deployed: @not_lain announced an automated model for tagging images pertinent to diffusion tasks and gave instructions for use, along with a link to their discussion. They also mentioned the model's implementation improvements in refs/pr/2.
Model Supports Various Image Formats: @not_lain highlighted that their tagging model accepts input as a string (path), a PIL image, or a numpy array, showcasing flexibility in handling images.
AI for Anime Data Set: @not_lain expressed intentions to use their image tagging model to annotate an anime dataset, while @imcoza1915 commented on the coolness of the tool.
New "Remix" Mode for Image Transformation: @matthieulc shared an update to PanoramAI.xyz, introducing a "remix" mode with ControlNet technology for better structure preservation in image transformations. Users are reminded they can navigate the tool using arrow keys.
From Sketch to Fashion with AI: @tony_assi unveiled a project Sketch to Fashion Design with great pride, which has received positive feedback as an AI able to understand designs, as @chad_in_the_house implied.

Links mentioned:

panoramai: what's in your world?
Sketch To Fashion Design - a Hugging Face Space by tonyassi: no description found
p1atdev/siglip-tagger-test-3 · Upload folder using huggingface_hub: no description found

HuggingFace ▷ #reading-group (32 messages🔥):

S4 Architecture Gets Annotated: @ericauld shared a resource on "The Annotated S4" asking for feedback and pointing out its usefulness for understanding the S4 architecture, which excels in modeling very long-range sequence tasks. They indicated that reading it may help clarify the model before their upcoming talk on Mamba/S4.
Seeking Clarity on S4 Implementation: @austintb. expressed desire for clarification on the S4 architecture's implementation and computational complexity details. @chad_in_the_house echoed the sentiment, requesting intuitive explanation of concepts and prior work such as the hippo codebase, later suggesting a focus on intuition and coding for ericauld's main talk.
Mamba/S4 Talk Schedule and Content Preferences: @ericauld proposed scheduling the Mamba/S4 talk for Friday at 10am California time and suggested potential content for the primary and secondary (math-focused) sessions based on community feedback.
LangTest Paper Makes Its Debut: @prikfy announced the publication of their LangTest paper in the Software Impacts journal, a tool for testing and augmenting NLP models. The paper and the GitHub repository for LangTest were shared, with @ryzxl contributing further context on its comprehensive testing capabilities and how to get started using the library.

Links mentioned:

Structured State Space Models for Deep Sequence Modeling (Albert Gu, CMU): Date: May 26, 2023(Sorry that the first 2 slides are not recorded, those are motivation slides though.)Abstract: This talk will cover recent deep neural netw...
The Annotated S4: no description found
GitHub - JohnSnowLabs/langtest: Deliver safe & effective language models: Deliver safe & effective language models. Contribute to JohnSnowLabs/langtest development by creating an account on GitHub.
LangTest | Deliver Safe & Effective Models | John Snow Labs: no description found

HuggingFace ▷ #diffusion-discussions (10 messages🔥):

Multi-GPU Training Inquiry: George is looking for advice on adapting the train_text_to_image.py script for multi-GPU usage, mentioning previous experience with nn.DataParallel.
Deployment Options for finetuned models: @lokendra_71926 finetuned the mistarl_7b_gptq model and is seeking recommendations for a library or platform suitable for fast inference deployment.
Success with Stable Cascade: @isidentical asked if anyone achieved good text generation with stable cascade, similar to the examples in the readme and confirmed getting 50% success on arbitrary words with the right prompting strategy.
HuggingFace's Inference Engine Suggestion: @chad_in_the_house suggested that HuggingFace has an inference engine that could potentially serve for llms deployment and also mentioned that the discussion might be more appropriate in another channel.
Terminus Model Anticipation: @pseudoterminalx teased that a new terminus model is still in the development phase.

HuggingFace ▷ #computer-vision (7 messages):

Hierarchical Image Classification Challenge: @cropinky described the issue of hierarchical image classification and advised that the complication level depends on the quality and amount of data. They suggested checking out an ECCV22 paper and related datasets on paperswithcode for further research.
In Search of Gaussian Splats: @aeros93 inquired about resources or pre-trained models for creating Gaussian splats from point clouds or images. No specific resources were provided, but @johko990 redirected the query to another channel that could potentially help.
Quest for Multimodal Project Insights: @joee2711 is working on a multimodal project and sought clarification on the difference between Q-former / MLP connector and if MLP connectors and adapters are the same. They also expressed an interest in connecting with others working on similar projects.
Enhancing Image Retrieval Systems: User @femiloye is developing an image retrieval system akin to person reidentification and is looking for methods to improve match accuracy beyond using model embeddings. They are currently utilizing a custom deit transformer trained with reid loss for this purpose.

HuggingFace ▷ #NLP (4 messages):

Fine-tuning Mistral for Deployment: @lokendra_71926 fine-tuned mistarl_7b_gptq model on custom data and is seeking recommendations for a library or platform for deployment to achieve faster inference.
Language Identification with XLM-R: @_michaelsh inquired about how to extract the language from xlmr after reading a HuggingFace post which explains that XLM-RoBERTa does not require language tensors to understand the language being used.
From Natural Language to Algebraic Representations: @_david_valente_ is looking for research or work that has focused on translating natural language into algebraic representations such as LEAN.
Voice Simulation and Language Transformation with Transformers: @mentrass asked about methods to simulate one's voice and alter the language using transformer models.

HuggingFace ▷ #diffusion-discussions (10 messages🔥):

Multi-GPU Adaptation Inquiry: @George is looking for an easy way to adapt the train_text_to_image.py script for multi-GPU usage, noting past experience with nn.DataParallel.
Deployment Platform for finetuned model: @lokendra_71926 has finetuned the mistarl_7b_gptq model and is inquiring about a library or platform for fast inference deployment. @chad_in_the_house suggests looking at Hugging Face inference engine for LLMs.
Text Generation with Stable Cascade: @isidentical questions whether anyone has been able to achieve text generation with stable cascade as showcased in the model's readme, later confirming a 50% success rate with good prompting.
Inference Optimization Discussion Redirected: @chad_in_the_house points out that discussions regarding inference optimization should move to a different channel titled <#1019883044724822016>.
Anticipation for New Terminus Model: @pseudoterminalx indicates that a new terminus model is currently being developed.

Nous Research AI ▷ #ctx-length-research (3 messages):

DAMO-NLP-SG Releases Vast Long-Context Dataset: @giftedgummybee shared the LongCorpus-2.5B dataset which contains 2.5B tokens collected from various domains for long-context continual pre-training. The dataset's composition is inspired by Long-Data-Collections, and its selection criteria ensures a low n-gram similarity with the training set to exclude QA and Summarization data.
Scaling Models with 'rope' vs 'self-extend': @blackl1ght highlighted that scaling models with 'self-extend' can preserve coherence better than 'rope scaling', even at larger scaling factors, referring to the implementation in llama.cpp.
Ease of 'self-extend' Implementation: @blackl1ght noted the benefits of 'self-extend' including no need for setup, fine-tuning, or extra parameters like those required in the 'gguf configurations' for quants.

Links mentioned:

DAMO-NLP-SG/LongCorpus-2.5B · Datasets at Hugging Face: no description found

Nous Research AI ▷ #off-topic (8 messages🔥):

Discussing LangGraph Agents' Perseverance: @pradeep1148 shared a YouTube video titled "LangGraph Agents Persistence," highlighting that LangGraph agents can be set up to retain their state across interactions.
Gemini's Resistance Frustrates Users: @llmaniac1000 expressed disappointment with Gemini's frequent refusal tendencies, seeking others' experiences with it. @n8programs chimed in, stating it's not amazing and implying GPT-4 outperforms Gemini.
Mark Zuckerberg's Image Transformation: @nonameusr shared a Twitter post suggesting that Zuckerberg has transitioned from villain to savior in the context of AI and VR.
A Touch of Humor with GIFs: @error.pdf reacted to previous discussions using humor by sharing a GIF from Tenor, without providing further commentary or context.

Links mentioned:

Rock Cat Eyebrow Cat GIF - Rock cat Eyebrow cat Meme - Discover & Share GIFs: Click to view the GIF
LangGraph Agents Persistence: When creating LangGraph agents, you can also set them up so that they persist their state. This allows you to do things like interact with an agent multiple ...

Nous Research AI ▷ #interesting-links (17 messages🔥):

Mesmerizing Mandelbrot Beauty Shared: @gabriel_syme posted a stunning visualization of the Mandelbrot set. @_3sphere added that the set's focus on divergence contributes to its sense of complexity and order.
Crowdsourcing AI with 'Marv' Chatbot: @.dvs13 praised a crowdsourcing project and noted ambiguity in the term "prompt." The project involves a chatbot named Marv, which answers questions with sarcasm.
Reka Introduces Multi-Modal AI Models: @metaldragon01 highlighted the launch of Reka Flash, a 21B fast multimodal language model, alongside its smaller counterpart Reka Edge. Reka Flash boasts competitive performance to major models like Gemini Pro and GPT-3.5 and is available in public beta.
Pursuing CUDA Compatibility with AMD: @leontello shared a GitHub project, ZLUDA, which aims to run CUDA on AMD GPUs. Unfortunately, the project is no longer actively pursued as detailed by @adjectiveallison, who quoted the project's lead expressing it's effectively abandoned.
Wavelets Meets Transformers in AI Research: An arXiv paper shared by @euclaise suggests that wavelet transforms could enhance Transformers by capturing both positional and frequency information with linear complexity. The paper details Wavelet Space Attention (WavSpA) and has been tested on the Long Range Arena. Find the paper here.

Links mentioned:

@dvilasuero on Hugging Face: "🤗 Data is better together!Data is essential for training good AI systems.…": no description found
Reka Flash: An Efficient and Capable Multimodal Language Model - Reka AI: Reka Flash is a state-of-the-art 21B model trained entirely from scratch and pushed to its absolute limits. It serves as the “turbo-class” offering in our lineup of models.
WavSpA: Wavelet Space Attention for Boosting Transformers' Long Sequence Learning Ability: Transformer and its variants are fundamental neural architectures in deep learning. Recent works show that learning attention in the Fourier space can improve the long sequence learning capability of ...
GitHub - vosen/ZLUDA: CUDA on AMD GPUs: CUDA on AMD GPUs. Contribute to vosen/ZLUDA development by creating an account on GitHub.
GitHub - acorn-io/rubra: AI Assistants, LLMs and tools made easy: AI Assistants, LLMs and tools made easy. Contribute to acorn-io/rubra development by creating an account on GitHub.

Nous Research AI ▷ #general (180 messages🔥🔥):

New Model Training Begins: @n8programs excitedly shares the start of training a new model, mentioning terms like dachshund, neuralbeagle-dpo, and expressing the process as randomly throwing stuff together genetic algorithm-style.
Playful Banter About Model Merging: @teknium humorously notes the metaphorical alignment between dog breeds and model merging, while @leontello likens the mixing methods to evolutionary strategies, and @n8programs reports a horrifying outcome of his merging experiment.
Typo Alert in Model Card: @everyoneisgross reports a typo in Hugging Face's model card for 70B llama, which was swiftly corrected by @teknium, leading to expressions of congratulations on the model launch.
Quantization Quest: Discussion about post-training quantization methods, with @stellaathena sharing a link to a new quantization method, and @nruaif jokingly looking forward to even lower bit-precision.
AI Activation Additions: A deep dive into activation hacking is mentioned, with @filipvv referencing an external article and @mihai4256 discussing their plans to refine their approach, while @proprietary voices interest in the work.

Links mentioned:

QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks: Post-training quantization (PTQ) reduces the memory footprint of LLMs by quantizing their weights to low-precision. In this work, we introduce QuIP#, a weight-only PTQ method that achieves state-of-th...
Tweet from jf (@fejo_11): Mixtral 8x7B: Routing Analysis based on POS tags I conducted a routing analysis using @MistralAI's Mixtral 8x7B model, focusing on Part-of-Speech (POS) tags, diverging from the original methodolo...
NousResearch/Nous-Hermes-2-Llama-2-70B · Hugging Face: no description found
Representation Engineering Mistral-7B an Acid Trip: no description found
OpenAI Researcher Andrej Karpathy Departs: Andrej Karpathy, one of the founding members of OpenAI, has left the company, a spokesperson confirmed. Karpathy, a prominent artificial intelligence researcher, was developing a product he has descri...
Xigmoid: An Approach to Improve the Gating Mechanism of RNN: This work proposes an innovative approach for the gating mechanism of RNN class models. A transfer function is embedded into the original sigmoid to form a new gate function called xigmoid. The purpos...
[missing post]: no description found
Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning: The recent surge of generative AI has been fueled by the generative power of diffusion probabilistic models and the scalable capabilities of large language models. Despite their potential, it remains ...
‎Practical AI: Machine Learning, Data Science on Apple Podcasts: ‎Technology · 2024
Steering GPT-2-XL by adding an activation vector: Summary: We demonstrate a new scalable way of interacting with language models: adding certain activation vectors into forward passes.[2] Essentially, we add together combinations of forward passes in...

Nous Research AI ▷ #ask-about-llms (38 messages🔥):

DeepSeekMath Merges into the Conversation: User @yxzwayne inquired about the integration of newly introduced deepseekMath in merging strategies, indicating interest in its application.
Finetuning for Dummies Guide Discovered: @nemoia was searching for straightforward instructions on how to finetune Mistral 7B and create their own datasets and later shared a helpful Medium guide that provides detailed examples and explanations on the process.
Forced FA2 Line Causes Memory Issues: In response to a question about FA2 not being enabled, @bloc97 clarified that the problem was related to an attempt to create a large attn_weights matrix, indicating the line of code causing memory issues can be seen here.
Secondary Options for Coding Models: @natefyi_30842 was looking for a less expensive alternative to GPT-4 for a coding model, and @teknium suggested trying out the deepseek coder, which is hosted by "together."
MIQU Model's Pretraining and SFT Clarified: @teknium explained to @yxzwayne that the MIQU model was first pretrained on the Llama-2 70b and then underwent SFT (Supervised Fine Tuning), focusing specifically on instruction-focused data.

Links mentioned:

Tweet from AMD Quietly Funded A Drop-In CUDA Implementation Built On ROCm: It's Now Open-Source - Phoronix: no description found
Finetuning Llama 2 and Mistral: A beginner’s guide to finetuning LLMs with QLoRA
Training a causal language model from scratch - Hugging Face NLP Course: no description found
modeling_mistral_yarn.py · NousResearch/Yarn-Mistral-7b-128k at main: no description found

Nous Research AI ▷ #collective-cognition (3 messages):

Project Downfall Due to Chat GPT Update: @adjectiveallison inquired if a project was still active after encountering issues accessing the site. @teknium responded, clarifying that the website broke due to the new Chat GPT update with various modes, leading to the original team being unable to maintain it.
Sympathies for the Broken Project: @adjectiveallison expressed disappointment upon learning that the project was no longer maintained following the complications with the new Chat GPT update.

Mistral ▷ #general (43 messages🔥):

Model Selection Advice for Beginners: Newcomer @nana.wav inquired about the best models to use, and @afriendofmaurice recommended instruct models for chat-GPT-like interactions. @mrdragonfox clarified that instruct models are more focused on instruction following, whereas others are akin to raw autocomplete.
Integration with Visualization Libraries: @carnivore5 asked if anyone had experience integrating Mistral functionalities with GraphViz or similar visualization libraries, leading to a clarification by @mrdragonfox about Mistral's lack of inherent function-calling ability.
Chat vs. Completion Endpoints: @i_am_dom and @mrdragonfox discussed the difference between Mistral's /chat/completion and a wished-for raw /completion endpoint, with most usage currently gravitating towards the chat endpoint.
Internship Struggles with Mistral: @nana.wav shared struggles with learning how to use downloaded models and intentions to fine-tune them, leading @mrdragonfox to advise starting with simpler steps. The conversation included sympathy and reminiscence from others, highlighting the common intern experience with overwhelming tasks.
Mistral API Latency Issues: @justinmann. reported inconsistent latencies when using the Mistral API, with response times varying drastically from under a second to over a minute. @sublimatorniq suggested contacting support for assistance.

Mistral ▷ #models (20 messages🔥):

RAG Guide for Mistral: @ethux shared a helpful guide explaining how Mistral works with RAG (Retrieval-Augmented Generation), including steps on retrieval and generation with examples from Mistral, LangChain, and LlamaIndex.
Debate on LangChain vs. LlamaIndex: @sublimatorniq sparked a discussion on the effectiveness of LangChain vs. LlamaIndex, with @rabdullin expressing skepticism about their use in serious LLM-driven products.
DSPy Advocacy: @mrdragonfox advocated for DSPy as a powerful framework, citing that it uses LLM as a "device" and not a "chat" interface and linked to a Twitter post exemplifying its strength.
Mistral-7b Training Dataset Inquiry: @kushagra_67246 inquired about the datasets on which Mistral-7b is trained, receiving humorous and vague responses indicating a mixed variety of internet sources — from @tom_lrd describing it as "Top secret magic soup" to @gamerboi0129 listing textbooks and Wikipedia among other comprehensive sources.
Clarification on Raw Pretraining Checkpoint: @nofreewill42 asked for an open-sourced checkpoint of the Mistral model right after raw text pretraining, expressing that mistralai/Mistral-7B-v0.1 seemed too interactive to be raw.

Links mentioned:

Basic RAG | Mistral AI Large Language Models: Retrieval-augmented generation (RAG) is an AI framework that synergizes the capabilities of LLMs and information retrieval systems. It's useful to answer questions or generate content leveraging ...
GitHub - stanfordnlp/dspy: DSPy: The framework for programming—not prompting—foundation models: DSPy: The framework for programming—not prompting—foundation models - stanfordnlp/dspy

Mistral ▷ #deployment (46 messages🔥):

Docker Deployment Recommendations: @rusenask suggested checking out the ollama or vllm projects for APIs that can be run through Docker for different use cases.
Quota Troubles in the Cloud: @gridinoc experienced difficulties deploying Mixtral with SkyPilot as AWS, Google Cloud, and Azure either denied quota increases or did not respond to requests.
Alternatives to Self-Hosting: @mrdragonfox discussed options for deployment, suggesting cheaper API offerings such as direct mistral or together.ai, despite the current GPU shortages and quota issues faced by @gridinoc.
AWQ Quantization Hitches with MoE: Multiple users, including @mrdragonfox and @casper_ai, discussed issues with the AWQ quantization method and Mixtral models, with @casper_ai recommending an alternative working repository hosted on Hugging Face.
Success with HuggingFace Deployment: @ethux pointed to an instance of Mixtral deployed on HuggingFace.co/chat, offering an alternate route to those facing cloud service barriers.

Links mentioned:

Deploy with SkyPilot | Mistral AI Large Language Models: SkyPilot is a framework for running LLMs, AI, and batch jobs on any cloud, offering maximum cost savings, highest GPU availability, and managed execution.
casperhansen/mixtral-instruct-awq · Hugging Face: no description found
TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ · Hugging Face: no description found
HuggingChat: Making the community's best AI chat models available to everyone.
TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ · always getting 0 in output: no description found

Mistral ▷ #finetuning (76 messages🔥🔥):

Fine-Tuning vs RAG Explained: Users in the channel debated the merits of fine-tuning versus using Retrieval-Augmented Generation (RAG), with @rabdullin advising to focus on prompt engineering and @mrdragonfox highlighting the importance of base knowledge in a Large Language Model (LLM) when using RAG. @tom_lrd and @mrdragonfox outlined that RAG acts as middleware to provide relevant context for the LLM and has its own complex underlying processes.
Onboarding the New AI Enthusiast: In response to @1mbc seeking resources for understanding AI core concepts, @mrdragonfox and @tom_lrd provided insights into how RAG and GPTs work and suggested platforms like Medium for further learning. No specific resources were linked.
Chatbot Integration Strategies Shared: The conversation delved into the technicalities of feeding an LLM with personalized data, with @mrdragonfox and @tom_lrd describing how data can be pre-processed and turned into a structured format that enriches the LLM's output, specially when using an LLM as a middleware to process user input.
Clarifying Misconceptions on LLM Data Storage: @mrdragonfox corrected some misconceptions about how an LLM 'learns' from new data, such as the functions of GPTs and the significant complexity behind embedding and search before data becomes usable context for an LLM.
Prompt Versioning Tools Inquiry: @khandelwaal.ankit inquired about tools for prompt versioning during fine-tuning experiments, noting a lack of support for Mistral models in some existing tools like PromptLayer; however, no solutions were specifically endorsed or detailed in the discussion.

Mistral ▷ #showcase (2 messages):

Limits on Code Modification: @ethux expressed skepticism about the possibility of making a certain change, suggesting that it might not be possible without altering some code.

Mistral ▷ #random (15 messages🔥):

French Librarian Seeks Internship Opportunities for Student: User @maeelk, a French librarian, is promoting AI use and looking for an internship for a student studying psychology and AI, referring to the Master's program at the University of Savoie Mont Blanc. Interested parties can reach out for collaboration via [email protected].
Mistral's Fan Quiz: User @akshay_1 challenges @maeelk's Mistral fandom by asking them to list the weights of the 7b model. Another user, @ethux, responds humorously, implying the difficulty of listing such technical details.
Building Audio-Inclusive S2S Models on a Shoestring Budget: @akshay_1 shares a client's request to build an S2S model with a persona, fine-tuned with an audio dataset on a budget of $1,000. Several users, like @ethux and @mrdragonfox, react to the insufficient budget, implying that much more would be required.
The Price of Innovation: @skadeskoten inquires about the competitive budget for creating a specialized S2S model, to which @mrdragonfox responds that the cost greatly depends on the extent of architecture needed.

Links mentioned:

Ergonomie socio-cognitive des systèmes intelligents - Classique et alternance - Ametys Campus - Université Savoie Mont Blanc: no description found

Mistral ▷ #la-plateforme (2 messages):

API Key Confusion for TypingMind: @ingohamm reported issues with using the API key for TypingMind, despite having a subscription and payment method in place. He mentioned that trying after a wait or deleting the API key prompted a message about no active subscription, and questioned the status of his account or subscription.
Seek Support from Mistral: In response to the issue, @sublimatorniq suggested that @ingohamm reach out to [email protected] for assistance with his API key and subscription concerns.

Perplexity AI ▷ #general (149 messages🔥🔥):

Seeking Support for Company Data Issues: User @kitsuiwebster expressed that sending emails for support got no response and preferred not to disclose the data-related issue publicly. Instead, they wished to contact directly for help with a company-related problem.
Debating the Merits of Perplexity vs. Phind: User @ludwig_von_mises_fan opened a discussion about the effectiveness of Phind over Perplexity for coding and general search, while @gooddawg10 and @brknclock1215 defended Perplexity's search capabilities, with no preference for coding.
Experiencing Technical Difficulties with Perplexity: Users @yellephen, @luke_____________, and @chenlieong reported issues with the Perplexity chatbot, such as endless loading for answers and service unavailability; @dima_shliugaev from the team acknowledged the issue and it was confirmed to be back online by @vova_at_pplx_ai.
Model Performance and Usage Discussions: Users shared their experiences with different AI models for tasks such as code debugging (@matheusgnhr), tic-tac-toe (@noremac258), and PDF reading (@reader7904); queries regarding specific model details (@hzpd and @unknownuser787) and API usage (@pilotgfx) were also seen.
Subscription Details and Model Information Inquiry: Users @stocktown and @ewaathescientist sought clarification on trial subscriptions and the renewal of Pro subscriptions, while @voidfulness inquired about token refresh rates and was informed by @me.lk that tokens refresh 24 hours after use.

Links mentioned:

Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
‎What Gemini Apps can do and other frequently asked questions: Learn what Gemini can do, how it works, and different ways to get access to it.
More than an OpenAI Wrapper: Perplexity Pivots to Open Source: Perplexity CEO Aravind Srinivas is a big Larry Page fan. However, he thinks he's found a way to compete not only with Google search, but with OpenAI's GPT too.
What is a Google dork query and how to protect yourself?: A Google dork query is a search string using advanced search operators. See how hackers get website data not readily available with it and how to protect from it.
Coupons and discounts: Explore Perplexity's blog for articles, announcements, product updates, and tips to optimize your experience. Stay informed and make the most of Perplexity.
Introducing PPLX Online LLMs : The first-of-its-kind Online LLM API

Perplexity AI ▷ #sharing (13 messages🔥):

Perplexity AI Tackles Tough Test Question: @tbrams was impressed with how quickly Perplexity AI handled a complex question from the "Gemini" paper, a task that Google's Gemini service and OpenAI took longer to address. Details of this successful test are available on the Perplexity AI platform.
Community Contributions and Creations: @twodogseeds gave a shoutout to Perplexity for the pplx shortcut action, which supports their Farm Friend research agent. No further details were shared in the message.
Exploring Diverse Perspectives with Bryan Johnson: @ok.alex shared a link to a summary of Bryan Johnson's perspectives via Perplexity AI, while @brknclock1215 offered an alternative angle for scientific summarization. Links to these summaries are found at Bryan Johnson Summary and Scientific Summary respectively.
Engage with the Alt-D-Feed: @ok.alex invited the community to contribute to an alternative feed/newsletter, suggesting it as a collaborative project to curate together. Interested individuals can like and share this initiative.
Summarizing Documents in Seconds!: @aykbl expressed enthusiasm for Perplexity AI's capability to summarize documents swiftly, emphasizing its speed with a smiley face. The content linked or specificity of documents was not mentioned.

Links mentioned:

Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.

Perplexity AI ▷ #pplx-api (24 messages🔥):

Custom Search Queries Available: User @me.lk clarified that by using search parameters such as "site:reddit.com OR site:youtube.com" in prompts, one can specify multiple content sources when using the API.
Performance Issues with Online API: @andrewgazelka reported performance problems with pplx-70b-online, but noted that removing the system message in the code seemed to fix the issue.
PPLX API Fails with Nonsensical Responses: @myadmingushwork_52332 raised a concern with the API returning random and nonsensical replies involving a mix of numbers and characters when online searching is required.
Reference Provision Under Development: @dvrshil expressed a desire for Perplexity to provide references in API responses, to which @mares1317 responded, stating that the development team is working on this feature.
No Early Access Program Yet: @icelavaman indicated that early access to new Perplexity features is not available at this moment; announcements for new features will come at a later date.

Links mentioned:

Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
Pricing: no description found

OpenAI ▷ #annnouncements (1 messages):

ChatGPT Gets a Memory Boost: @abdubs announced a new memory feature for ChatGPT which allows it to remember user preferences and details across conversations, thereby enhancing future interactions. This feature is rolling out to select Free and Plus users, with control options available at ChatGPT Memory Features.
You're the Boss of ChatGPT's Memory: Users have the power to tell ChatGPT what to remember, ask it to recall information, and instruct it to forget things conversationally or through settings. The memory feature can also be turned off completely if preferred.
Memory Feature Rolling Out Gradually: OpenAI is currently deploying the memory upgrade to a limited user base and plans to gather feedback to gauge its usefulness. Further announcements regarding a broader rollout will be made soon.

Links mentioned:

Memory and new controls for ChatGPT: We’re testing the ability for ChatGPT to remember things you discuss to make future chats more helpful. You’re in control of ChatGPT’s memory.

OpenAI ▷ #ai-discussions (83 messages🔥🔥):

Discovering the Secrets of SEO: @spidy___ sought insights on how to autonomously tag webpages with relevant keywords like web crawlers do, finding limitations in NER for keyword extraction. @light.grey.labs advised examining SEO files, as web builders often embed a variety of keywords into these for search relevance.
Seeking Creative Minds for AI Research: @noodles7584, a UK researcher, invited community members to discuss how AI is used in creative processes, offering compensation for the 30-minute discussions.
The Quest for the Ultimate Chatbot: Chat explored the challenges with current chatbots, including the inability of GPT models to meet all individual needs, voiced by @jaicraft. @lumirix and others discussed workarounds, like combining bots or leveraging chatbot integrations with services like Google Docs.
ChatGPT Accused of Laziness: @pigondrugs and others commented on GPT's difficulty retaining context, with growing complaints after context capacity increased. In contrast, @drinkoblog.weebly.com argued that higher context limits reduce perplexity, leading to better performance.
AI Model Rivalry Heats Up: @cassofthenight spotlighted Abacus.AI’s Smaug-72B model outperforming GPT-3.5 and expressed concerns over ChatGPT-4's reluctance to produce complete code snippets, suggesting that the AI dodges detailed scripting in favor of pseudo code.

Links mentioned:

Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.

OpenAI ▷ #gpt-4-discussions (49 messages🔥):

GPT 4 Turbo Cost Queries Clarified: @jeremy.o directed @koukyo to the OpenAI pricing page for GPT 4 cost details and noted that GPT 4 Turbo is the top/cheapest model, costing 2 cents less than other versions, with similar or slightly worse quality depending on use.
GPT 4 Sometimes Slacks Off?: @rodney.leonardo reported a decrease in GPT 4's intelligence in basic tasks, like summarizing a PDF. Community members including @blckreaper confirmed observations of performance issues, and discussions on the topic are collected in a separate channel: <#1047565374645870743>.
Still Waiting for @mentions: Users including @pax0086 and @ancryp discussed the gradual rollout of the @mention feature in GPT, with @darkninjaforever reminding that OpenAI often does gradual feature rollouts, indicating some users are still awaiting access.
Trying to Push the Boundaries of GPT's Vision: @flokyhuan inquired about using videos for fine-tuning language models and was informed by @solbus that fine-tuning is currently only available for text models, and while the GPT vision feature can describe images from a video, it can't be fine-tuned for specific knowledge like sports rules.
ChatGPT Memory Feature Rollout Progresses: @lumirix confirmed that the ChatGPT's feature for remembering past conversation details is being rolled out to both free and Plus users but noted that it's only available to a small portion of users at this time.

Links mentioned:

Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
Processing and narrating a video with GPT's visual capabilities and the TTS API | OpenAI Cookbook: no description found

OpenAI ▷ #prompt-engineering (23 messages🔥):

Prompt Engineering Basics Explained: @eskcanta outlined that good prompt engineering involves using precise language, giving clear instructions to the AI, and careful review of the AI's output. When instructing the AI, focus on what to do instead of what not to do, avoiding conflicting instructions.
AI Text Adventures Streamlined: @drinkoblog.weebly.com advised @stealth2077 to use custom instructions like "Focus on simple storytelling and character dynamics" to keep narratives straightforward and avoid complexity, which the AI tends to default to in text adventures.
Navigating Platform Confusion: @beanz_and_rice humorously attempted to engage with ChatGPT on the Discord server, prompting @toror to respond with amusement at the unsuccessful effort.
API Infrastructure vs. Prompt Engineering: @darthgustav. clarified the difference between prompt engineering and API infrastructure to @kate.yanchenka, suggesting that the latter's queries about automated budget calculations and dynamic data handling were related to software development rather than prompt engineering.

Links mentioned:

no title found: no description found
Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.

OpenAI ▷ #api-discussions (23 messages🔥):

Newbie Seeks Prompt Engineering Wisdom: @zzavior sought advice for getting started with prompt engineering. @eskcanta provided an extensive guide focusing on using clear and precise language, checking the output, and ensuring not to trigger conflicts with the AI's capabilities or training.
Library Queries for Prompt Engineering: @kate.yanchenka inquired about libraries for prompt engineering to manage budgets, fit dynamic data, and handle AI model fallbacks. @darthgustav. clarified that the topic was more about AI software development than prompt engineering.
Conversation Assistance Request Goes Unnoticed: @beanz_and_rice attempted to initiate an interaction using Discord slash commands but failed, followed by a comedic outcry that prompted a reaction from @toror.
Crafting Lightweight Text Adventures: @stealth2077 asked for tips on making a text adventure less deep and thematic. @drinkoblog.weebly.com suggested using custom instructions to guide the AI towards simpler storytelling.
Joke Generation Confusion: @lisabkk45_48614 requested a joke, but @solbus directed them to use the official ChatGPT website instead of the Discord channel.

Links mentioned:

no title found: no description found
Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.

OpenAccess AI Collective (axolotl) ▷ #general (66 messages🔥🔥):

MPS Support Acknowledgment and Clarification: @caseus_ expressed gratitude for MPS support in the axolotl project thanks to a GitHub pull request #1264 by Maxime. Confusion arose about the contributor's Discord identity, and @yamashi (the right Maxime) clarified their involvement, noting the dependency on transformers merging changes and the PyTorch pull request #99272 as crucial for further development.
Tinkering with Yi-34b Training and eBay Finds: Users @le_mess, @cookiesowns, and @c.gato discussed various AI and non-AI topics ranging from slow loss decrease during Yi-34b training to an eBay link for an old tech product.
Exploring Model Adaptation and Enhancements: @yamashi suggested the potential benefits of porting models to Keras for wider hardware support, and @dreamgen and @c.gato discussed error handling and fixes related to Hugging Face checkpoint saving, in light of a pull request #1414 and a related issue #1452.
Queries on Cheapest LLM Endpoint Services: @le_mess inquired about affordable LLM endpoint services with responses pointing to local options like llamacpp, external services such as Together AI, and OpenRouter’s cost-effectiveness. Users mentioned JSON serialization issues with Basten and the need for custom configurations.
Discussion of Various Challenges Using LLMs: Issues like JSON serialization (@dangfutures), challenges with FP32 slowness (@yamashi), and need for additional documentation were discussed providing snapshots of technical hurdles and collaborative problem-solving occurring in the AI community.

Links mentioned:

Together AI: Build gen AI models with Together AI. Benefit from the fastest and most cost-efficient tools and infra. Collaborate with our expert AI team that’s dedicated to your success.
peft/utils/save_and_load.py try to connect to the hub even when HF_HUB_OFFLINE=1 · Issue #1452 · huggingface/peft: System Info peft 0.8.2 axolotl v0.4.0 export HF_DATASETS_OFFLINE=1 export TRANSFORMERS_OFFLINE=1 export HF_HUB_OFFLINE=1 Who can help? No response Information The official example scripts My own mo...
Intel® Optane™ Persistent Memory 300 Series (128GB PMem Module) NMC2XXD128GPS | eBay: no description found
GitHub - triton-inference-server/tensorrtllm_backend: The Triton TensorRT-LLM Backend: The Triton TensorRT-LLM Backend. Contribute to triton-inference-server/tensorrtllm_backend development by creating an account on GitHub.
Add MPS support by maximegmd · Pull Request #1264 · OpenAccess-AI-Collective/axolotl: Description Supports basic training on Mac M series. Motivation and Context It partially solves Mac support. How has this been tested? Ran a train job with lora-mps.yml from start to finish.
Fix breaking change by younesbelkada · Pull Request #1414 · huggingface/peft: Fix a breaking change in the recent release, I made a new PR as I messed up the commit history on the previous PR cc @sayakpaul @pacman100

OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (8 messages🔥):

Converging on Chat Dataset Formats: @dctanner is coordinating with Hugging Face to standardize a chat dataset format named MessagesList, to streamline the various dataset formats emerging for fine-tuning chat models. They shared a link to the MessagesList proposal discussion and suggested creating a GitHub org and dedicated page for documentation.
Naming Conventions Matter: @dctanner emphasized the importance of a universal format name like MessagesList that’s not tied to a specific app like ShareGPT or ChatML, which can be confused with the template rather than the JSON format itself.
Validation Challenges for MessagesList: @faldore acknowledged that although they like the idea of MessagesList, it poses challenges in validation because the concept of a "conversation pair" is not easily described by JSON-schema.
The Ideal MessagesList Schema: @faldore proposed an ideal schema for the MessagesList format that includes optional system messages, tools/functions, source metadata, and a greeting message, ensuring user and assistant messages are paired, and the last message is from the assistant.
Benefits of the Suggested Schema: @faldore advocates for the proposed schema, arguing that it is more manageable, verifiable, and space-efficient, and enforces structured message pairing in datasets.

Links mentioned:

@dctanner on Hugging Face: "As the amount of datasets for fine tuning chat models has grown, there's been…": no description found

OpenAccess AI Collective (axolotl) ▷ #general-help (26 messages🔥):

Tokenization Troubles in Axolotl: User @nafnlaus00 enquired about a method to verify that axolotl is tokenizing as expected. @dreamgen recommended inspecting the tokenizer config in the output directory, while @nanobitz pointed to a debug flag in the axolotl repository.
Transformers Update Might Fix Inferencing Issue: @thierry_lama reported a device error while trying to infer on a trained model using runpod's GPU. @nanobitz suggested that it could be due to an issue with transformers and recommended updating.
Multilingual Capabilities Enhancement Attempt: @sadaisystems asked about improving a model's capabilities in a language other than English, receiving a response from @le_mess that pre-training is necessary for significant improvement beyond what LoRA can offer.
Inferencing with LoRA on the Fly: @wizmak sought a way to add LoRA adapters to a base model in real-time during inferencing, and @nanobitz confirmed that with Hugging Face, you can load the peft model, but was unsure of the command to unload it.
Model Parallelism with DeepSpeed Zero 3: User @mihai4256 sought assistance for a working deepspeed zero 3 config for model parallelism, noting that existing ones from the repo weren't functioning as expected for this particular use case.

Links mentioned:

GitHub - OpenAccess-AI-Collective/axolotl: Go ahead and axolotl questions: Go ahead and axolotl questions. Contribute to OpenAccess-AI-Collective/axolotl development by creating an account on GitHub.

OpenAccess AI Collective (axolotl) ▷ #datasets (1 messages):

Duplicate Dilemma in Dataset Finetuning: @_rxavier_ inquired about identifying if a text has been previously used to train a model. They asked about techniques for determining model familiarity with a text, possibly by examining the model's response to an article's introduction.
The Impact of Training Data Overlap: Additionally, @_rxavier_ questioned the implications of finetuning a model using data that may overlap with its pretraining dataset. They pondered the potential negative effects of such overlap on the finetuning process.

OpenAccess AI Collective (axolotl) ▷ #runpod-help (1 messages):

Axoltl RunPod Compatibility with Vast.AI: User @dreamgen successfully used the Axoltl RunPod image on Vast.AI, reporting that it worked out of the box.

LangChain AI ▷ #announcements (1 messages):

LangChain Introduces Journaling App with Memory: @hwchase17 shared an early version of a journaling app that incorporates memory, using the LangChain memory module. The app is in an early stage and feedback is welcomed; it remembers information about users for future interactions, akin to the memory feature announced by OpenAI for ChatGPT today. Test the app here and check out the introductory video.

Links mentioned:

Loom | Free Screen & Video Recording Software: Use Loom to record quick videos of your screen and cam. Explain anything clearly and easily – and skip the meeting. An essential tool for hybrid workplaces.
LangChain Companion - Journal: no description found

LangChain AI ▷ #general (59 messages🔥🔥):

Seeking Assistance on LangChain with Android: User @grindmaster2512 inquired about the integration of LangChain with an Android application, and followed up with @812757029260230658 to seek solutions to this query.
Efficient Chunk Pre-processing for Embeddings: @swastikk asked whether chunk pre-processing (like removing white spaces) is necessary before creating embeddings. @johnny2x2 confirmed that removing superfluous text aids the process, especially with email data.
PDF Parser Search, Alternatives to Adobe API: @dejoma requested recommendations for a PDF parser that can split contextually, expressing dissatisfaction with Adobe API's limitations and seeking effective PDF API alternatives.
Calls to Improve LangChain's Documentation Structure: @b0otable provided feedback to the LangChain team suggesting the improvement in documentation structure by reducing example redundancies and updating syntax to avoid inefficient navigation for users.
Dependency Issues with Pinecone and LangChain: User @segmentationfault. experienced dependency resolution errors when trying to update Pinecone Database to v2 with a LangChain dependency, prompt response and solutions were provided by @jacoblee93, a maintainer of LangChain.

Links mentioned:

How to use function calling with Azure OpenAI Service - Azure OpenAI Service: Learn how to use function calling with the GPT-35-Turbo and GPT-4 models
Pinecone | 🦜️🔗 Langchain: You can use Pinecone vectorstores with LangChain.

LangChain AI ▷ #langserve (8 messages🔥):

Clarification on Langserve Scaling: @kjoth_00356 inquired about scaling Langserve to multiple instances and asked about the difference between hosted Langserve and Langserve. @veryboldbagel hinted at a deployment via hosted Langserve, leading to further clarification from @dachsteinhustler who pointed to Langsmith as part of the solution hosted at Langchain Platform, which is in early testing and might require an invite code.
In Search of NodeJS and Chain Integration: @_mauricemoss is looking for a way to expose a chain from a NodeJS app for use in a RemoteRunnable, but no solution has been provided within these messages.
Disabling Intermediate Steps in Playground: @dachsteinhustler expressed a need to disable the intermediate steps in the Langchain playground to prevent browser crashes caused by large base64 strings, resulting in a workaround that involves using RunnableLambda.
Connection Issues with k8s Cluster App: @ezelanza. described an issue where a connection is refused when attempting to invoke the OpenAI API through a k8s cluster-based application, mentioning that direct invocations to the back end work, but requests from the front end (React) fail, even with curl.

Links mentioned:

LangSmith: no description found

LangChain AI ▷ #share-your-work (1 messages):

Introducing Dewy and RAG with NextJS and OpenAI: @kerinin shared their contribution towards Dewy, an OSS knowledge base, along with a post detailing how to build a full-stack RAG application. The guide includes using NextJS, OpenAI API, and Dewy, aimed to minimize hallucinations and ensure accurate language model responses. Check out the blog post here.

Links mentioned:

LangChain AI ▷ #tutorials (2 messages):

Seeking a Superior PDF Parser: @dejoma is looking for a PDF parser that can split contextually. Expresses discontent with Adobe API due to its low usage cap and lack of 'pay-as-you-go' option; is open to suggestions for robust PDF APIs.
Langchain Calculator Quest: @sougata is building a calculator using Langchain that interprets multiplicative operations as mul(a,b). Requests guidance on how to integrate a custom Python library for calculation with the model's augment function.

DiscoResearch ▷ #general (29 messages🔥):

Inquiry on Argilla Hosting Experience: User @drxd1000 is seeking advice on hosting a server for Argilla that can support multiple users for annotation, but there was no resolution provided in the messages.
Layer Selective Rank Reduction Methodology Discussed: @johannhartmann referenced their own implementation of 'Layer Selective Rank Reduction' to address continual training without forgetting, noting that "They basically figure out the statistically less relevant parts of the layers and use them as lora targets," and considering it more efficient than continual approaches. A related GitHub repository was mentioned but not detailed in the conversation: laserRMT.
Out of Memory Issue with lm-evaluation-harness: @philipmay faced an OOM error evaluating a mixtral model and was advised by @bjoernp to utilize multi-GPU support provided by lm-evaluation-harness, indicating two A100s might resolve the issue.
Search for German Toxicity Eval Dataset: User @sten6633 inquired about a German toxicity evaluation dataset and pondered the utility of translating ToxiGen, a dataset available on Hugging Face for implicit hate speech detection. The dataset mentioned can be found on Hugging Face, but requires agreement for access: ToxiGen.
Novel Computational Technique Teased: User @phantine hinted at a new method excluding MoE, briefly titled "Universes in a bottle" and hinted at a potentially radical claim: "P=NP." A GitHub link associated with @phantine's work was shared, but no specific details regarding the technique were provided: LargeWorldModel/LWM.

Links mentioned:

Google Colaboratory: no description found
skg/toxigen-data · Datasets at Hugging Face: no description found
GitHub - cognitivecomputations/laserRMT: This is our own implementation of 'Layer Selective Rank Reduction': This is our own implementation of 'Layer Selective Rank Reduction' - cognitivecomputations/laserRMT
GitHub - LargeWorldModel/LWM: Contribute to LargeWorldModel/LWM development by creating an account on GitHub.

DiscoResearch ▷ #embedding_dev (5 messages):

BM25 + Query + Rerank Combo Wins: User huunguyen highlighted their effective use of BM25 with additional querying and reranking steps for search purposes, and reported that this method "works pretty good."
Wikipedia in a Nutshell: huunguyen managed to index the entirety of Wikipedia, excluding non-essential content, and compacted the BM25 index into a sleek size of under 3GB.
In Search of BM25 Tools: sebastian.bodza inquired about the specific library huunguyen is using to implement the BM25 algorithm for their search index.

DiscoResearch ▷ #discolm_german (1 messages):

thomasrenkert: Is there an ETA for v2 of the German model? Or for the Mixtral variant?

CUDA MODE ▷ #general (1 messages):

GPU Shuffling for Experimentation: @joseph_en reported successful relocation of the Asus WS motherboard to the miner and is awaiting 16x PCI extenders. They've utilized older GPUs for their experiments and have transitioned the miner's motherboard into the case, noting it handles 7B or 13B quantized models with a single 12G NVIDIA 3060 with ease.

CUDA MODE ▷ #cuda (9 messages🔥):

Cross-Compatibility Quest: @iron_bound kicks off a discussion about achieving binary compatibility for CUDA to run on HIP/ROCm platforms, referencing a Phoronix article on Radeon CUDA - ZLUDA.
CUDA for AMD GPUs? Meet ZLUDA: @muhtasham shares a GitHub link to ZLUDA, a project that aims to make CUDA run on AMD GPUs, sparking interest and a request for user experiences by @marksaroufim.
Emoji Enthusiasm: @muhtasham invokes the spirits of the tech world through well-selected emojis of Jensen Huang and Lisa Su.
Market Monopolies and AGI Speculations: @andreaskoepf humorously suggests that Microsoft's purchasing strategy and a borked chip market could leave antitrust agencies unequipped against an AGI future.
Real-World Radeon Trials: _tvi_ shares their experience with Radeon VII and a Ryzen APU, including struggles with dynamic memory allocation causing kernel crashes when handling large PyTorch data chunks.

Links mentioned:

Tweet from [Phoronix] AMD Quietly Funded A Drop-In CUDA Implementation Built On ROCm: It's Now Open-Source Image (Radeon Cuda 1): no description found
Tweet from AMD Quietly Funded A Drop-In CUDA Implementation Built On ROCm: It's Now Open-Source - Phoronix: no description found
GitHub - vosen/ZLUDA: CUDA on AMD GPUs: CUDA on AMD GPUs. Contribute to vosen/ZLUDA development by creating an account on GitHub.

CUDA MODE ▷ #algorithms (3 messages):

Multidimensional Gated Recurrences Have Limitations: User @euclaise mentioned a constraint in multidimensional gated recurrences, stating that they require a DxCxN attention matrix which is quite prohibitive in cost, even with a small value for C.
Beyond Simple Linear Recurrences: @euclaise pointed out that prefix-sum-like scans have applications beyond computing simple linear recurrences, opening up a broader range of computational possibilities.
Twitter Insights on Computational Techniques: @euclaise shared insights on computational methods, including the use of maximal scans for sequences (y[t]=max(y[t-1], x[t])), by providing links to their Twitter posts: Tweet on computational methods and Tweet on maximal scans.

CUDA MODE ▷ #jobs (3 messages):

Generative AI Startup Hiring in Hyderabad: @gradman33 shared a job opportunity at an early stage Deep Tech Generative AI startup in Hyderabad, India, seeking talents in ML/Data/Research/SDE. Interested candidates can apply here.
Potential Spam Alert in Jobs Channel: @pudding0377 flagged a post by @gradman33 as possibly irrelevant or spam, calling for the attention of moderators.

Links mentioned:

no title found: no description found

CUDA MODE ▷ #beginner (9 messages🔥):

New Member Alert: @cs_os_05101 mentioned that they have a 4060 Ti.
Search for Engaging CUDA Books: @euclaise inquired about fun books related to CUDA, sparking a conversation about educational resources.
Shader Book Recommendation: @marksaroufim shared The Book of Shaders, a gentle guide to Fragment Shaders, as a possible fun read on a topic adjacent to CUDA.
Understanding User Expertise: After citing familiarity with shader programming, @euclaise clarified they're looking for materials directly related to compute shaders or CUDA, rather than frag shaders.
Looking for Fun in Learning: Both @marksaroufim and @euclaise concurred that defining literature as "fun" can be subjective, but @marksaroufim suggested PMPP as the best educational resource on CUDA, albeit not necessarily fitting the "fun" criterion.

Links mentioned:

The Book of Shaders: Gentle step-by-step guide through the abstract and complex universe of Fragment Shaders.

CUDA MODE ▷ #pmpp-book (7 messages):

Matrix Transposition Debate: @eporat asked if transposing one matrix in a multiplication could lead to fewer cache misses and thus faster computation. @andreaskoepf responded, advising that while sequential memory access could be advantageous, the benefits might be negligible compared to tiled access.
Practical Test Yields No Benefits: Responding to the query about transposing matrices to speed up multiplication, @jeremyhoward recounted his experience stating that transposing during tile creation had no observable effect on performance.
In-Depth Discussion on Transposition: @eporat clarified that an inplace transpose isn’t necessary; sometimes, one only needs to adjust indice ordering in the inner loop, suggesting an alternative to transposition.
Further Clarification Sought: @andreaskoepf questioned @eporat's suggestion, implying that matrix elements are read transposed by default during multiplication, indicating a misunderstanding or need for further explanation on what @eporat meant by adjusting loop indices.

CUDA MODE ▷ #smol-hw (1 messages):

Apple Silicon gets its own 'top': User @marksaroufim shared a link to asitop, a performance monitoring CLI tool for Apple Silicon. It was compared to existing tools like top or nvtop, tailored specifically for Apple's custom chips.

Links mentioned:

GitHub - tlkh/asitop: Perf monitoring CLI tool for Apple Silicon: Perf monitoring CLI tool for Apple Silicon. Contribute to tlkh/asitop development by creating an account on GitHub.

Latent Space ▷ #ai-general-chat (24 messages🔥):

Reka Model Announcement: @swyxio shared a link to a tweet about a new Reka model, creating a buzz in the community. The tweet can be found here.
Favorite VC Podcast Meets AI: @swyxio expressed enthusiasm for a VC podcast discussing AI topics, providing a link to the episode and highlighting its relevance to the community.
Exploring the BUD-E Voice Assistant by LAION: @swyxio discussed a new fully open voice assistant named BUD-E, developed by LAION, which is aimed to improve conversational experiences by being empathetic and context-aware. Details are available on the LAION blog.
What is an Agent?: In search of a definition for "agents," @kaycebasques asked the community for insights. @slono described them as programs that aim to achieve goals with minimal user input.
Karpathy Leaves OpenAI: @nembal spotlighted news from The Information about Andrej Karpathy's departure from OpenAI, stirring curiosity about the implications for the AI field. Background on the development of an AI product for automating tasks mentioned by @slono vaguely referenced AGI as a possible factor in the context of the departure.

Links mentioned:

BUD-E: Enhancing AI Voice Assistants’ Conversational Quality, Naturalness and Empathy | LAION: <p>AI voice assistants have revolutionized our interaction with technology, answering queries, performing tasks, and making life easier. However, the stilted...
OpenAI Researcher Andrej Karpathy Departs: Andrej Karpathy, one of the founding members of OpenAI, has left the company, a spokesperson confirmed. Karpathy, a prominent artificial intelligence researcher, was developing a product he has descri...
President and Co-Founder Anthropic, Daniela Amodei: AI Hurricane — Grit — Overcast: no description found
Memory and new controls for ChatGPT: We’re testing the ability for ChatGPT to remember things you discuss to make future chats more helpful. You’re in control of ChatGPT’s memory.
How Graph Neural Networks Are Transforming Industries: 🔑 Get your AssemblyAI API key here: https://www.assemblyai.com/?utm_source=youtube&utm_medium=referral&utm_campaign=yt_marco_1Graph Neural Networks (GNN) ha...
Tweet from Joanne Jang (@joannejang): 📝 we just launched a small experiment for memory on ChatGPT. how it works - it's quite similar to custom instructions, except chatgpt is the one driving it (like auto vs. stick shift!) - basical...
GitHub - Stability-AI/StableCascade: Contribute to Stability-AI/StableCascade development by creating an account on GitHub.
sta - Overview: sta has 2 repositories available. Follow their code on GitHub.

LLM Perf Enthusiasts AI ▷ #opensource (2 messages):

Choosing the Right Mistral Model Size: @potrock asked about the appropriate Mistral model size to run locally on an M2 Max with 32GB, seeking community input.
Safe Model Sizing Advice: @natureplayer suggested that 4GB is a safe size for local execution on the mentioned hardware, while 8GB will not work, and 5GB might be possible but is not guaranteed.

LLM Perf Enthusiasts AI ▷ #openai (4 messages):

GPT-5 Speculation Quelled: User @res6969 humorously noted that the rumors of GPT-5 have been greatly exaggerated, indicating skepticism about its existence or imminent release.
Laughter is the Best Medicine?: Both @res6969 and @potrock shared lighthearted reactions with custom emoji and laughing-to-tears emoji, respectively, contributing to a jovial environment on the topic at hand.
A Memory Upgrade for ChatGPT: @potrock shared a blog post discussing new memory features being tested in ChatGPT that allow the model to remember user preferences and details across conversations, which users can manage conversationally or through settings.

Links mentioned:

Memory and new controls for ChatGPT: We’re testing the ability for ChatGPT to remember things you discuss to make future chats more helpful. You’re in control of ChatGPT’s memory.

AI Engineer Foundation ▷ #events (6 messages):

Weekly Meeting Kick-Off with a Sense of Humor: @._z announced the start of the weekly meeting with a playful note: 😄 Déjà vu.
Meeting Attendance Update: @juanreds informed they could not attend the weekly meeting, apologizing to the team.
Invitation to Co-host an AI Hackathon: @caramelchameleon asked if anyone is interested in co-hosting an AI developers hackathon, hinting at a collaboration opportunity with game developers before GDC this year.
Chance to Join Hackathon Online or Onsite: @caramelchameleon mentioned the possibility of attending the hackathon both online and onsite in San Francisco.
Eager Organizer Jumps In: @yikesawjeez expressed interest and requested to be contacted as they specialize in organizing hackathons, especially those associated with events in the Bay Area.

Skunkworks AI ▷ #general (2 messages):

Private Message Prompt: User @bondconnery requests a direct message with a simple "<@1117586410774470818> DM sir".
LLaVA Framework Inquiry: @CodeMan is seeking insights or experiences on integrating LLaVA with an SGLang server and SGLang worker, as opposed to using a standard model worker.