> We checked **19** guilds, **290** channels, and **4378** messages for you. Estimated reading time saved (at 200wpm): **377 minutes**.

Sama at Davos:

  • Altman said his top priority right now is launching the new model, likely to be called GPT-5.
  • Surprisingly, Altman admitted that he ā€œisn’t sure on the exact statusā€ of Sutskever’s employment.

Separately, Itamar from Codium coined Flow Engineering with AlphaCodium, picked up by Karpathy.

—

Table of Contents

[TOC]

TheBloke Discord Summary

  • Swiss Army AI Dreamed Up: Engineers on the server discussed crafting a multi-specialty MOE model that combines seven distinct 7 billion parameter models, each specializing in areas such as law, finance, and medicine, in response to @cos2722’s proposal.

  • 8-Bit Fine-Tuning Debate: @netrve and @that_one_short_guy deliberated over the necessity of 8-bit optimizers for fine-tuning, with the latter suggesting ensuring bitsandbytes is installed with GPU support for optimal functioning.

  • Addressing Channel Spam Decisively: An abusive spamming user was promptly banned from the community, reflecting swift moderation actions.

  • Model Merging Dialogues: The discussion revolved around merging models with shared architecture, with practical advice provided, such as using Mergekit or ensuring models are in Alpaca format for broader compatibility.

  • AMD Optimization for AI Models: There is interest in testing the performance of AI models on AMD systems, specifically using AMD AOCL blas and lapack libraries with llama.cpp, to improve efficiency via AVX512 registers.

TheBloke Channel Summaries

ā–· #general (1151 messagesšŸ”„šŸ”„šŸ”„):

  • AI for Command Line Tasks: @stoop poops shared an experiment where they gave AI access to non-interactive bash terminal shell commands to observe its actions. The AI, referred to as ā€˜tewi’, was able to perform actions like cat /etc/shadow, nmap the router, and even use ssh-keygen.

  • Mixtral’s Coding Capabilities: @rombodawg has been refining prompts for a Mixtral MoE model with the goal of making it surpass a 33b model in human evaluation, aiming for 13b parameter speed with better coding ability than GPT-3.5.

  • LLM for Autonomous Tasks: @selea asked if anyone has successfully used a coding AI for tasks like writing website parsers or scripting game mob behaviors without human supervision, hinting at the possibility with enough examples and fine tuning.

  • Embedded Systems and Vector Stores in LLMs: @iukea discussed the potential for AI enhanced by vector stores and embeddings, comparing GPT-4’s depth of knowledge to other models and the implications of using big models for practical applications.

  • Performance Comparisons Amongst LLMs: Various users including @giftedgummybee, @iukea, and @natepdx compared LLMs like GPT-3.5, Mixtral, Gemini Pro, and GPT-4, discussing their strengths in depth of knowledge, problem-solving ability, response quality, and speed, especially in context of code-related tasks.

Links mentioned:

ā–· #characters-roleplay-stories (425 messagesšŸ”„šŸ”„šŸ”„):

  • Centralized Settings for LLMs: @firepin123 proposed the creation of a centralized platform for open-source frontends’ settings, similar to Hugging Face, with a voting system to streamline the use of LLMs by standardizing settings, improving user experience, and aiding in the debugging and benchmarking processes.
  • Discussion on Fine-Tuning Techniques: @c.gato and others discussed DPO and fine-tuning techniques for LLMs, especially regarding @c.gato’s model, Thespis-13b. @jondurbin recommended using rmsprop instead of adam for DPO and watching for signs of overly aggressive learning rates.
  • RP Character Cards by Model Creators: @stoop poops and @c.gato discussed the potential benefits of model creators including default character cards, with the former expressing a preference for ā€œnormal-ishā€ content, excluding ERP cards due to content sensitivity.
  • Exploring LLMs for Roleplay: @netrve shared positive experiences using Doctor’s Nous-Capybara LimaRP based on Yi-32B and expressed curiosity about using DPO on it, while lamenting the high cost of fine-tuning models like WinterGoddess.
  • Settings Importance and Documentation: Several users, including @theyallchoppable, @doctorshotgun, and @keyboardking, discussed the importance of correct settings to obtain optimal performance from models and the need for better documentation and community-driven recommendations.

Links mentioned:

ā–· #training-and-fine-tuning (29 messagesšŸ”„):

  • The Vision of a Super-Swiss-AI-Knife: @cos2722 proposed the idea of creating a multi-specialty MOE model that acts like a Swiss Army knife by combining the best 7 billion parameter specialized models. This model would address various complex requests by incorporating models such as DeepSeek7b, Open Chat 0106, Medicine Chat, Finance Chat, Law Chat, and three others of choice.

  • 8-Bit Optimizers for Fine-Tuning: @netrve received a warning about bitsandbytes being compiled without GPU support and sought clarification on the importance of 8-bit support for fine-tuning. @that_one_short_guy clarified that 8-bit is rarely used for finetuning and recommended installing bitsandbytes with GPU support.

  • Quick Ban Hammer Strikes: @mrdragonfox swiftly banned an abusive user, as confirmed by @netrve, who noticed the user had spammed other channels as well.

  • The Finetuning Dilemmas of Medically Minded MLX: @cogbuji shared challenges with instruction fine-tuning using MLX on a medical dataset, which resulted in nonsensical outputs. They contemplated switching to a self-supervised approach instead of their current supervised instruction fine-tuning method.

  • Bagel Model Training, Not So Delicious: @jondurbin shared a loss chart of their Bagel-1.1B training, indicating a drop in evaluation loss that was not mirrored by performance, judging that the model was ā€œcompletely braindeadā€ and advising against using tinyllama. @sanjiwatsuki compared their experiment with a TinyMistral model which exhibited a higher loss.

Links mentioned:

jondurbin: Weights & Biases, developer tools for machine learning

ā–· #model-merging (10 messagesšŸ”„):

  • Seeking Model Merging Tools: @givan_002 asked for scripts or resources to merge a fine-tuned 13B model using open source role-play datasets with other 13B models.
  • Advice on Merging Models with Same Architecture: @kquant advised that a 13B model can generally be merged with any model as long as they share the same architecture, such as merging Mistral with Mistral and Llama with Llama.
  • Ensuring Compatible Format for Merging: @kquant also mentioned the importance of ensuring that the model being merged follows the same format.
  • Mergekit As a Solution for Merging Models: @sao10k suggested using Mergekit for model merging needs.
  • Alpaca Format for Broad Compatibility: @sao10k explained that the Alpaca format is a ā€œsafe universal formatā€ and highlighted its popularity for merging 13B models, even if the model hasn’t been trained in Alpaca format.

ā–· #coding (2 messages):

  • AMD Enthusiasts Wanted: @spottyluck is seeking individuals who are running models on AMD systems without GPUs, using llama.cpp, to test the AMD AOCL blas and lapack libraries. This could help leverage AVX512 registers and optimize performance.
  • In Search of Downloads: @apcameron asked where to download the AMD AOCL blas and lapack libraries needed to conduct the tests that @spottyluck mentioned.

Nous Research AI Discord Summary

  • GPT-5 Rumors & Realism: A tweet by Sully Omarr sparked a discussion on GPT-5 with predictions of its impact and the skepticism on the novelty of multimodality. Users also debated the financial sustainability of SAAS startups running on venture capital with no subscription fees, citing a tweet questioning the business model.

  • Code Generation Innovations & AI Model Distribution: The introduction of AlphaCodium generated buzz, an open-source code generation tool surpassing humans in code contests, with its method and Github repository shared. Torrents were discussed as a potential model for AI model distribution, suggesting a decentralized dissemination method.

  • Fine-Tuning Techniques and Self-Rewarding Models: New fine-tuning techniques like SymNoise were highlighted for their ability to improve LLM performance, along with research on models that generate their own rewards, potentially leading to superhuman agents and suggesting a self-sustaining future for AI training.

  • Meta and the AI Space: Conversations about Meta’s LLaMa 3 and comparisons to GPT-4 reflected anticipation of AGI advancements and strategies, including the use of GPUs and a nod to Zuckerberg’s commitment to open source. The discussion touched upon the acquisition of hardware resources and potential impacts on model training capacity.

  • The Squircle Challenge & AI Aspirations: A math-related call-to-action took place around creating a squircle using bezier segments, with a Figma blog detailing the intrigue. Additionally, the personal growth story shared in a tweet from Teknium served as inspiration for AI newcomers seeking to grow their expertise in the field.

Nous Research AI Channel Summaries

ā–· #off-topic (22 messagesšŸ”„):

  • GPT-5 Anticipation Buzz: User @teknium shared a tweet suggesting GPT-5 is OpenAI’s next big launch, while @max_paperclips predicted a cycle of initial hype followed by performance nerfs.
  • Skepticism About Multimodality Hype: @teknium and @max_paperclips conveyed disinterest in multimodality aspects thought to be central to the upcoming GPT-5, with @teknium expressing it as ā€œmehā€ and @giftedgummybee hinting at expecting impressive performance due to available compute resources.
  • VC Funded SAAS Startup Costs Query: User @0xevil shared a tweet questioning the sustainability of a SAAS company offering with no subscription fees, leading @gabriel_syme to comment that the goal is creating ā€œgreat gateways,ā€ not necessarily products.
  • Proposing Torrents as a Distribution Model for AI Models: @everyoneisgross highlighted the potential of using torrents, as exemplified by Mistral, for distributing models, data, and instructions for machine learning applications.
  • Frustrated Over Misunderstandings of Model Fine-Tuning: In response to a tweet shared by @youngphlo expressing a claim that finetuning cannot add new knowledge to LLMs, @teknium showed clear frustration, asserting that finetuning does indeed add knowledge, to which @youngphlo sympathized as a justified reaction.

Links mentioned:

  • AGI depicted as ā€˜Samantha’ from ā€˜Her’: @burnytech shared a tweet by @Schindler___ proposing an AGI architecture modeled after Samantha from the movie ā€œHer,ā€ capable of dynamic speech, evolving personality traits, and external memory interaction.

  • Exploring AlphaCodium’s Capabilities: @metaldragon01 highlighted the introduction of AlphaCodium, an open-source code generation tool said to surpass most human competitors in code contests, and @teknium inquired about whether it functions as a general coding model or an applied layer over an existing model.

  • GitHub Project AlphaCodium Revealed: @adjectiveallison discovered AlphaCodium on GitHub, a method that enhances code generation accuracy by LLMs through a multi-stage, test-based iterative process, sparking a discussion on the use of iterative approaches in real-world applications.

  • Revolutionizing LLM Fine-tuning with SymNoise: @euclaise and @teknium discussed a new fine-tuning technique involving symmetric noise that improves LLMs, showing superior performance compared to prior methods on various models and datasets.

  • Self-Rewarding Language Models for Superhuman Agents: @metaldragon01 found research on Self-Rewarding Language Models where the models generate their own rewards, leading to improvements in instruction following and providing high-quality self-assessments during training.

Links mentioned:

ā–· #general (338 messagesšŸ”„šŸ”„):

  • Social Media Bot Skepticism: In a series of messages, @gabriel_syme and others discussed concerns about the utility of AI in social media, suggesting that while it may work for low-quality text outputs, it isn’t the best application of AI and may lack imagination in use cases. Users also joked about Twitter botting being the only valid use case.

  • AI Agents’ Future Discussed: The conversation switched to potential uses for agentic AI, including predictions of future customer service systems (@leontello). @.benxh added that less human involvement in social media management could be beneficial for humanity overall but expressed reservations about effects on marketing professionals.

  • Envisioning Code-Oriented AI Models: Chat participants discussed hopes for agentic AI use in code testing and development (@_3sphere), and the possibility of integration with multimodal models. They also commented on the challenges with certain models stopping mid-coding, highlighting a need for longer token sequences or step-by-step processing (@teknium).

  • Alignment Algorithms Compared: @osanseviero shared links to articles comparing DPO, IPO, and KTO alignment algorithms, concluding that DPO appears to be the best option overall but acknowledging the ease of scaling KTO due to its simpler data needs. Users discussed the correlation between various evaluations, with benchmarks and Elo scores being mentioned.

  • Meta’s LLaMa 3 and the Race for AGI: Meta’s training of LLaMa 3 sparked a discussion on potential advancements and how it might compare to OpenAI’s GPT-4. The conversation touched on the strategic use of resources like GPUs and the fascinating position of Meta’s CEO as a proponent of open-source developments (@gezegen, @_3sphere, @teknium).

Links mentioned:

ā–· #ask-about-llms (44 messagesšŸ”„):

  • OCR for Embeddings? A Brave New World: @_3sphere expresses a novel concept, suggesting the ability to OCR an embedding and pondering when a neural JPG could become reality, considering embeddings as a form of codec.
  • Math Collaboration to Solve Geometrical Challenges: @bernaferrari seeks a math enthusiast to tackle the problem of representing a squircle using bezier segments, as explained in a Figma blog post. They believe a proper mathematical representation could lead to fame on Hacker News and improve the field, as current approaches lack elegance.
  • LLMs in Geometry Generation: @gabriel_syme recalls past successes in generating geometry with LLMs, noting potential for iterative generation had the models been better at the time. Meanwhile, @mr.userbox020 discusses the depth of geometry and the applicability of LLMs to mathematical problems, suggesting a simple 2D vector approach could suffice.
  • The Squircle Quest: @mr.userbox020 skeptically addresses the use of LLMs for solving @bernaferrari’s squircle problem, urging a more traditional mathematical path over complex LLMs due to nature of the problem involving irrational numbers and infinite precision.
  • A Journey from Novice to Pro in AI: In a shared tweet from @Teknium1, a remarkable one-year transformation is celebrated, inspiring @quilalove to question where to begin their own journey into the AI field and collaborate with others on AI technical knowledge and implementation.

Links mentioned:


LM Studio Discord Summary

GPU Tango: VRAM and Resource Management in Focus: Engineers in the guild discussed GPU offload settings in LM Studio, noting that setting the GPU offload to -1 utilizes all layers but may show low GPU utilization. Recommendations were made for using Nvidia P40 GPUs as a cost-effective performance solution, and concerns were raised about potential VRAM allocation conflicts when running AI models alongside intensive applications like gaming.

LM Studio Beta V4 Debuts: Beta V4 (0.2.11 release candidate) of LM Studio has been released, featuring a model search page with VRAM fit estimates and support for new 2bit quants. Download links were provided, and it was stated that plans for open sourcing or adding a plugin system are in development, assuring LM Studio will remain free for personal use.

Dispatches from the Hardware Front: Relevant hardware discussions included power supply considerations for dual RTX 3090 setups, where a 1200W+ PSU was advised. Creative solutions for fitting large GPUs into small cases were exchanged, emphasizing the ingenuity of the engineers in optimizing their AI computing rigs.

CrewAI: Framework and Performance Insights: The CrewAI Multi-Agent Framework and its integration with the LM Studio API were highlighted, with a mention of leveraging specific agents for dedicated tasks like internet search. Benchmarks for multiple models using CrewAI were promised, along with sample code once the user’s work is completed.

Model Performance and Usage: It was reported that local models, albeit operational for repeated function calls, are not as impressive as the 3.5T model. The Skyrim ChatGPT mod’s image recognition was spotlighted as a parallel task that competes for GPU resources with other processes. LM Studio installation issues and an unspecified model error on a 24G RAM laptop also emerged, with the latter redirected to technical support channels for further assistance.

LM Studio Channel Summaries

ā–· #šŸ’¬-general (200 messagesšŸ”„šŸ”„):

  • GPU Offload and VRAM Utilization: @heyitsyorkie explained that setting GPU offload to -1 in LM Studio assigns all layers for GPU usage, though users like @4b0d3 reported seeing low GPU utilization. @senecalouck shared that ROCm beta could offer significant speed improvements for AMD cards.

  • Running LM Studio on Various Systems: @heyitsyorkie and @dagbs discussed running LM Studio on hardware like Macbook M1/2/3 chips and compared model performances between devices, noting that LM Studio was originally designed for MacOS M1/2/3.

  • Model Comparisons and Preferences: Users like @dagbs and @4b0d3 compared various models including Dolphin 2.6 DPO and Laserxtral, discussing preferences based on response quality and speed. @dagbs further noted that large models like Mixtral at Q6 can experience hallucinations at higher context sizes.

  • Remote Model Usage and Inference Server: @dagbs clarified that while LM Studio is not headless, it does have an Inference Server for remote model running. However, users like @leamac51_62244 sought discussions on using models remotely due to high hardware requirements.

  • LM Studio Installation Issues: @surrender encountered issues with LM Studio not launching post-installation. @dagbs suggested deleting the .cache/lm-studio folder and to seek more help in the appropriate support channel.

Links mentioned:

ā–· #šŸ¤–-models-discussion-chat (11 messagesšŸ”„):

  • GPU Concerns with Skyrim ChatGPT Mod: @gamerred inquired if LM Studio needs to run on a GPU due to the Skyrim ChatGPT mod also performing image recognition. @fabguy believes that both processes will compete for GPU resources.
  • LMs to Cause Minor Gameplay Hiccups: @dagbs explained that while compute and 3D are separate, language models (LLMs) may cause brief frame drops during their initial ā€œthinkingā€ stage, but not significantly affect general gameplay.
  • VRAM Allocation Might be an Issue: @fabguy pointed out that the real issue is the allocation of vRAM, even though @dagbs thinks games tend to ask for more resources than necessary.
  • Watch Out for Recommended Graphics Settings: @ben.com cautioned that games might not account for GPU VRAM already used by LLMs and as a result, one should consider reducing texture sizes or other settings accordingly.
  • LLM Background Operation During Gaming: @dagbs shared personal experience about running games with medium graphics requirements while keeping an LLM idle in the background, and @_anarche_ commented on maintaining high FPS in COD while running some 7B models, signaling a CPU bottleneck in their setup.

ā–· #🧠-feedback (6 messages):

  • Query on Beta Release Status: @logandark inquired about a potential delay of the new beta. Although no precise update on the release was issued, conversation suggested that the work might still be in progress.
  • Unspecified Model Error for User: @aindy_niu reported an issue while running lm-studio on a laptop with 24G RAM, facing an exit code and an unknown error. No solution was offered in the given exchanges.
  • Guidance Offered for Technical Support Channels: When @aindy_niu sought help for a model error, @dagbs redirected them to specific Discord channels, implying a better fit for technical support there.

ā–· #šŸŽ›-hardware-discussion (79 messagesšŸ”„šŸ”„):

  • Decent Performance on a Budget: @dagbs and @heyitsyorkie chimed in on the economical viability of using the Nvidia P40 for AI computing, noting it as a recommended cheap option if one has the setup to run them, offering decent performance with 24GB of VRAM, and even achieving ā€œsingle digit tok/s with multiple p40ā€ for large models like the 120b Goliath.
  • Power Supply for Dual 3090s: @rouw3n inquired about the PSU requirements for a setup with a second RTX 3090, to which @heyitsyorkie and others recommended a 1200W+ power supply, with .ben.com suggesting a 1000W might be adequate with some tweaking.
  • Integrating GPUs in Tight Spaces: Users like @dagbs, .ben.com, and @pefortin shared their experiences fitting large GPUs into smaller cases by repurposing space and using PCI extenders or laying hardware against other components, highlighting creative solutions for building compact yet powerful AI rigs.
  • Experimenting for Optimal GPU Load: @ericericericericericericeric engages in a discussion about experimenting with the GPU Offload layers settings for different model sizes, with advice from @heyitsyorkie to play around with layers and monitor VRAM usage, indicating there’s no one-size-fits-all setting.
  • Enhancing AI Performance in All-in-Ones: @jilloschwortz seeks to boost AI performance on an all-in-one PC with an i7 13700 and 16GB of RAM. @heyitsyorkie suggests saving up for a dedicated rig while @dagbs floats the idea of external GPU connections as a workable solution.

ā–· #🧪-beta-releases-chat (37 messagesšŸ”„):

  • Beta V4 Stepping Up: @yagilb announced that Beta V4 (0.2.11 release candidate) is out, featuring a new model search page with VRAM fit estimates, a bug fix for text pasting, and the latest llama.cpp commit. Users are encouraged to provide feedback on the new search page, and download links are available here.

  • 2bit Quant Innovation: In a brief exchange, @n8programs asked and @yagilb confirmed that Beta V4 supports new 2bit quants, showcasing excitement for the update.

  • ROCm Remains Separate for Now: @_anarche_ inquired about ROCm support, to which @yagilb replied that it is not yet integrated and will continue to be shared separately until integration is simplified.

  • Plugin Possibilities on the Horizon: When @n8programs queried about the prospect of open sourcing LM Studio for community contributions or adding a plugin system, @yagilb hinted that plans for this are in development.

  • Always Free for Personal Use: Amidst speculation about future pricing for LM Studio, @yagilb assured @n8programs that it will remain free for personal use, maintaining the current model.

ā–· #autogen (1 messages):

yagilb: https://discord.com/channels/1110598183144399058/1197707651438624849

ā–· #langchain (1 messages):

  • Local models performing well but not ā€œgreatā€: User @anarche_ remarked that they’ve had success with multiple local models in terms of handling function calls repeatedly. However, they noted that these models are not as impressive as the 3.5T model.

ā–· #avx-beta (1 messages):

  • Error: Additional Properties Not Allowed: User @_elchupacabras encountered an error stating ā€œError: must NOT have additional properties. File contains unknown property: ā€˜min_pā€™ā€ and is seeking solutions.

ā–· #crew-ai (10 messagesšŸ”„):

  • CrewAI Multi-Agent Frameworks in Action: @senecalouck discussed utilizing the LM Studio API with @<bot-id> for internet search and summarization within CrewAI. They implemented a strategy of aligning specific agents with individual tools, like search, while the rest of the crew used only LLM access.
  • Benchmarking Multiple Models with CrewAI: User @_anarche_ mentioned conducting benchmarks with CrewAI, testing several models, and promised to share results and sample code for the crew setup used once completed.
  • Question About Dolphin DPO Score: @dagbs inquired about the meaning of an asterisk (*) accompanying the Dolphin DPO score, expressing a specific issue with the Dolphin setup, forgetting to install requirements.
  • Dolphin Model’s Minor Setbacks: In response to @dagbs, @_anarche_ acknowledged that the Dolphin model ā€œdid the job but had a hiccup or two,ā€ hinting at some inconsistencies in performance.

Perplexity AI Discord Summary

  • Perplexity and Rabbit Unite: After a partnership with Rabbit OS, the first 100,000 Rabbit R1 purchases will include a complimentary year of Perplexity Pro, offering real-time, precise answers with the integration of PPLX online LLM API. Rabbit’s tweet also emphasized natural language search enhancements for r1 users.

  • Clarity on AI Models in Use: Perplexity reassured users that Perplexity Pro indeed employs genuine models including GPT-4 and Claude 2.1, with technical specifics detailed in their Technical FAQ. In particular, Copilot uses GPT-4 for Pro users, supported by a fine-tuned version of GPT-3.5.

  • Exclusive Offers Stir Excitement: A partnership reveal has sparked excitement with a $200 free Perplexity Pro credit offered to Rabbit r1’s first 100,000 buyers, confirmed in a tweet from Jesse Lyu, highlighting that Perplexity on Rabbit r1 will be free of subscription fees.

  • Free AI Tools Entice the Community: A shared YouTube video showcases ā€œ23 AI Tools You Won’t Believe are Free,ā€ incentivizing viewers with a one-month free Skillshare trial, while another video backs Perplexity AI as the preferred choice over other tools like Google for content creation, viewable here.

  • Community Help and API Interaction: Positive community interaction is highlighted with a user expressing appreciation for having a payment method issue resolved efficiently. However, users were informed that certain specific information and features are currently not available nor planned in the development roadmap, emphasizing the need for managing expectation with current capabilities.

Perplexity AI Channel Summaries

ā–· #announcements (1 messages):

  • Perplexity Partners with Rabbit: @ok.alex announced a collaborative partnership that integrates PPLX online LLM API with Rabbit R1 for real-time, precise answers. The first 100,000 Rabbit R1 purchases come with a complimentary year of Perplexity Pro.

ā–· #general (186 messagesšŸ”„šŸ”„):

  • Perplexity AI Model Clarifications: Users like @charlesalan sought confirmation on whether Perplexity Pro uses genuine models like GPT-4 and Claude 2.1. @icelavaman provided assurance and a link to clarify these details, affirming the authenticity of the models used.
  • Details on Copilot Model: In response to queries from @gpt_five, @icelavaman shared a Technical FAQ detailing that Copilot uses GPT-4 for Pro users and is routed by a fine-tuned version of GPT-3.5, emphasizing its capabilities for in-depth answers.
  • Exciting Partnership Announcement: @otub revealed a partnership that provides $200 of free Perplexity Pro credit to the first 100,000 buyers of Rabbit r1, a deal confirmed by various users including @glap and @ok.alex, who noted the credit would extend even current Pro subscriptions.
  • Clarifying R1 with Perplexity Pro: @dan9070 cited a Twitter post from @jessechenglyu that confirmed R1 will have Perplexity on rabbit r1 for free without any need for a subscription—a significant boon for early adopters of the device.
  • User Engagement and Support: @lkshrc and @yogable inquired about acquiring Pro Discord access, which was promptly resolved by @icelavaman, showcasing the community support and responsiveness within the Perplexity AI server.

Links mentioned:

  • Chat with Open Large Language Models: no description found
  • Tweet from Jesse Lyu (@jessechenglyu): key msg: 1. Perplexity on rabbit r1 is FREE. 2. Perplexity offers free $200 credit as a FREE GIFT to first 100K r1 orders. 3. rabbit r1 REMAINS free of subscription. conclusion: WHAT A DEAL! ā†˜ļø Quot…
  • What is Perplexity Copilot?: Explore Perplexity’s blog for articles, announcements, product updates, and tips to optimize your experience. Stay informed and make the most of Perplexity.
  • Perplexity Blog: Explore Perplexity’s blog for articles, announcements, product updates, and tips to optimize your experience. Stay informed and make the most of Perplexity.
  • Chat Completions: no description found
  • Perplexity - AI Companion: Ask anything while you browse
  • What models does Copilot use?: Dive deep into Perplexity’s technical details with our comprehensive FAQ page. From the nuances of AI models like GPT-4 and Claude 2 to token limits and AI profiles, get concise answers to optimize yo…

ā–· #sharing (5 messages):

  • Free AI Tools on YouTube: @siddhj shared a YouTube video titled ā€œ23 AI Tools You Won’t Believe are Free,ā€ which showcases a variety of AI tools available at no cost. The video’s description mentions a partnership with Skillshare for a one-month free trial.
  • Commendation for Riley Brown’s Video: @samangel7358 acknowledged the efforts of Riley Brown by applauding another informative AI-related video.
  • Rabbit Partners with Perplexity AI: @br0k3r81 highlighted a new partnership between rabbit OS and Perplexity AI, shared via a tweet from @rabbit_hmi, aimed at improving the natural language search capabilities for r1 users.
  • Perplexity AI Service in Action: @almost.engineering posted a link demonstrating the search capabilities of Perplexity AI for specific content related to Gannon Makerspace.
  • Personal Preference for Perplexity AI: @oneisall_ shared a YouTube video where the creator explains why they favor using Perplexity more than Google, ChatGPT, BARD, and Microsoft Copilots, particularly for content creation.

Links mentioned:

ā–· #pplx-api (6 messages):

  • Gratitude for Problem Resolution: User @rxiiia expresses appreciation towards @830126989687914527 for assistance with a payment method issue which was resolved without the need to recreate the method.
  • Encouraging Community Recognition: @Dyno suggests using the ⭐ emoji to react to helpful messages. Accumulating five stars sends the message to the ⭐│starred channel and earns the author the EXPLORER role.
  • Request for More Specific Instructions: @dvrshil asks for more specific details or instructions, expressing that the current help is inadequate.
  • Limitation on Information: @icelavaman responds to @dvrshil with a straightforward refusal, claiming that providing the specific requested information or details is not possible.
  • Feature Not on the Roadmap: User @icelavaman informs @dvrshil that the feature in question is not currently on the development roadmap.

OpenAccess AI Collective (axolotl) Discord Summary

  • Mistral Model Mayhem: Model performance and model training emerged as focal points, with discussions ranging from the best 7b Mistral models to use, like OpenPipe/mistral-ft-optimized-1227 and Bagel 7B, to the challenges of sample packing in models including LoRA/qLoRA and Axolotl. Users critically explored data quality and dataset effectiveness, proposing RedPajamaV2 and Dolma for model testing, and emphasized Meta’s acquisition of 600,000 Nvidia H100 GPUs to illustrate the growing computational scale in AI like LLaMa 3.

  • Pack and Roll with Axolotl: In Axolotl developments), conversations focused on updating package requirements for flash-attn, the lack of direct configuration in DPOTrainer, and concerns over package dependency management. Users noted ColossalAI’s ShardFormer as a potential step toward simplified tensor parallelism and questioned the veracity of Unsloth’s claims regarding training speed and VRAM efficiency.

  • Plotting with Qlora and LoRA: Inquiries were made about implementing Qlora to replicate specific research results, and there was questioning about a resolved bug regarding 8bit LoRA tuning in Mixtral.

  • Dataset Utilization and Cleanup Convos: Users showed surprise over the underutilization of oasst1/2 datasets and shared effective data cleanup strategies using GPT-4 and mistral-medium. They discussed the strategic selection of training tokens, such as <BAD> vs <GOOD>, emphasizing the impact of token choice on model training outcomes.

  • RLHF Ruminations: Dialogues in rlhf deliberated the potential stability of an input + label + output training method over DPO, considering its utility for improving model stability, with specific mention of its use within FAANG companies.

  • Replicate Hosting and API Considerations: Queries in replicate-help touched on whether the platform supports hosting and pondered setting up API connections to models.

OpenAccess AI Collective (axolotl) Channel Summaries

ā–· #general (140 messagesšŸ”„šŸ”„):

  • Model Merger’s Labyrinth: @le_mess asked for recommendations on the best 7b Mistral model trained with chatml format amid confusion on the leaderboard. @dreamgen and @bozoid. discussed various merged models like OpenPipe/mistral-ft-optimized-1227 and the uniqueness of Bagel 7B, whilst expressing dissatisfaction on mixed prompt format training and data quality issues.
  • Sample Packing Conundrums: @tiendung enquired about the effectiveness of sample packing with different types of models, such as LoRA / qLoRA, while @dreamgen discussed potential issues with Hugging Face’s implementation, particularly with attention mask and positional encoding. @tiendung and @nanobitz explored whether Axolotl correctly implements sample packing compared to Hugging Face’s approach.
  • Datasets Over Models: @bozoid. expressed a desire to see models tested against datasets like RedPajamaV2 and AllenAI’s Dolma. @bozoid. and @nruaif conversed about the challenging nature of training on huge datasets and ambitions to downscale models without compromising performance.
  • The Might of Meta’s Compute Arsenal: @yamashi, @noobmaster29, and @casper_ai discussed Meta’s massive acquisition of 600,000 Nvidia H100 GPUs for training LLaMa 3, highlighting the intense scale of resources involved in state-of-the-art AI training endeavours.
  • DPO Training Trials and Tribulations: @c.gato and @dangfutures encountered obstacles and shared experiences in applying DPO (Decentralized Parallel Optimization). Their dialogue revealed uncertainties and learning moments while attempting to improve their models’ training processes.

Links mentioned:

ā–· #axolotl-dev (26 messagesšŸ”„):

  • Requirement Update for flash-attn: User @louist4455 pointed out that flash-attn==2.3.3 might be outdated, requiring a newer version for LLM FT. @caseus_ acknowledged that upgrading is a manual process due to the lack of automated testing for multi-GPU support.
  • Configuration Query in DPO Cleanup Branch: @filippob82 asked why certain parameters like max_length and max_prompt_length are not directly configurable in the DPOTrainer. @caseus_ indicated that for most architectures in use, these settings are not crucial, but opened to adjustments following an example from a GitHub script.
  • Axolotl Package Dependency Concerns: @faldore raised a question about preventing Axolotl from installing cuda and torch, which they prefer to handle independently. @caseus_ noted the need to reconsider why bert-score was added as a dependency, while @nanobitz advised commenting out undesired installs from the requirements.
  • Interest in Tensor Parallelism with ShardFormer: @caseus_ shared a link to ColossalAI’s ShardFormer, potentially hinting at easier tensor parallelism integration, pointing to the GitHub page of the ShardFormer project.
  • Skepticism About Unsloth Speed Claims: @nanobitz shared a Reddit post about Unsloth’s performance improvements and VRAM reduction for finetuning models. @caseus_ expressed skepticism on the marketing numbers, mentioning the ability to train in under an hour on a 3090 GPU and clarifying that transformers implemented 4d attention masks, not packing support.

Links mentioned:

ā–· #general-help (2 messages):

  • Planning with Qlora: User @jacques_10431 mentioned that their team is planning to utilize Qlora in an effort to replicate the results of a particular article.
  • Inquiring about 8bit LoRA Tuning Bug: @jaredquek asked for updates regarding a bug related to 8bit LoRA tuning within Mixtral as compared to qLoRA or fft, which was initially raised by user Caseus. They are curious if the issue has been successfully fixed.

ā–· #datasets (6 messages):

  • Surprising Underuse of OASST Datasets: @dreamgen expressed surprise over the lack of models utilizing the oasst1/2 datasets and mentioned their potential after some filtering.
  • Inquisitive on Deep Learning Fine-tuning: In a follow-up, @dreamgen asked for details about training with 20 samples, including the use of DPO with QLora, learning rates, and other specifics.
  • Advocating for GPT-4 Data Curation: @dreamgen recommended investing in GPT-4 for data cleanup, highlighting its importance in contrast to the costs of fine-tuning and inference.
  • Clarity Through Examples: @dreamgen asked for examples to better understand the data cleanup goals.
  • Experience in Data Cleanup & Augmentation: In a reflective note, @dreamgen shared that mistral-medium proves to sometimes be enough or even surpass GPT-4 Turbo in certain data cleanup and augmentation tasks, while in others, GPT-3.5 Turbo outperforms mistral-medium.
  • Acknowledgement of GPT-4 Efficiency: .____init___ agreed with @dreamgen on the sensibility of using GPT-4 for a one-time data cleanup process.

ā–· #rlhf (5 messages):

  • Comparison between DPO and Baseline Method: @dreamgen discussed potential benefits of using input + label + output training, contrasting it with Direct Policy Optimization (DPO) and suggesting it as a more stable approach, mentioning its usage at FAANG companies.
  • Labels for Model Training: @dreamgen explained the concept of using tokens such as <BAD> vs <GOOD> to distinguish between types of responses within training data, indicating that natural tokens might be more effective than synthetic tokens in practice.

ā–· #replicate-help (2 messages):

  • Hosting Inquiry: User @dangfutures asked if the platform is meant for hosting purposes.
  • API Setup Possibility: @noobmaster29 believes that setting up an API to the models is possible and considers trying it out later.

OpenAI Discord Summary

  • AI Integration in Smart Ecosystems: Enthusiasm grows as @.pythagoras and others discuss AI integration in smartphone models, like the Samsung S24, anticipating similar AI features in future Pixel phones. Debates unfold around Apple’s ecosystem versus Samsung’s AI capabilities, with predictions of AI becoming a default feature in the tech landscape.

  • Ethical AI Debates and Documents: The AI community engages in discussions about AI ethics, governance, and alignment, referenced through shared links to an arXiv paper and a WHO document on the governance of AI in health.

  • GPT-4 Community Contributions and Concerns: @serenejay reports verification issues with GPT Store and inquires about privacy options while @marcus_73 seeks feedback for their HopeGPT, and @russellsapalmer warns against developers ripping off GPT apps. Suggestions include domain verification to protect privacy and a call to OpenAI to monitor such activities, alongside reminders of OpenAI Status for service updates.

  • Prompt Engineering Strategies and Exchanges: Experiences vary on the use of custom instructions with AI models; @realgavok deems them inconsistent while @darthgustav. suggests XML tagging to improve GPT-4’s selection accuracy, sharing an XML tagging example for clearer model guidance.

  • XML Tagging Takes Center Stage: In prompt engineering discussions, @darthgustav. advises @magiciansinc on using XML tagging for better results over CSV or JSON. An example is provided, showcasing a way to optimize AI’s performance in filtering lists by using well-structured criteria.

OpenAI Channel Summaries

ā–· #ai-discussions (110 messagesšŸ”„šŸ”„):

  • AI-Enhanced Smartphones Stir Excitement: @.pythagoras expresses interest in the AI tools integrated into new smartphone models like Samsung S24 and hopes Google will follow suit with similar features in Pixel phones. Others share their experiences and preferences, with the conversation turning into a general discussion on the merits of Samsung vs. Apple, and the anticipation of AI becoming a staple feature in smartphones.

  • AI Fridge Fantasies Spark Imagination: Chatter @.pythagoras humorously foresees a future where all appliances will boast ā€œAI capabilities,ā€ leading to a series of creative speculations on conversational refrigerators and multi-functional vending machine-like kitchen appliances from other users.

  • AI Ethics & Governance Discussion: @clockrelativity2003 shares a link to an arXiv paper discussing AI and case law, and another link to a WHO document on the ethics and governance of AI in health, eliciting responses and discussion about AI alignment and its implications.

  • Gemini Ultra Release Uncertain: In a discussion on the release of ā€œGemini Ultra,ā€ @la42099 humorously guesses it could be out in the next 30 days, with users expressing hopes for larger prompt limits and other advancements.

  • The Tech Ecosystem Debate: A lively debate unfolds about Apple’s ecosystem and continuity features, with @muyfashionista extolling the benefits of seamless integration across Apple devices. Users @mrcrack_ and @darkangel9365 chime in with opinions on Android and Samsung capabilities, citing options for customization and questioning Apple’s policies on app approvals.

Links mentioned:

ā–· #gpt-4-discussions (38 messagesšŸ”„):

  • Verification Troubles for serenejay: @serenejay reported issues with not being able to complete the builder profile for GPT store publishing, despite trying different browsers and clearing cache. They managed success after subscribing via web with a card, but inquired about the possibility to not use their real name due to privacy concerns.

  • Domain Verification as a Solution: @rjkmelb suggested that @serenejay obtain a domain name to verify their OpenAI account after facing issues with Google Play verification. @7877 added that verifying with a domain can help hide one’s real name, showing the domain instead.

  • HopeGPT Wins a Competition: @marcus_73 shared their GPT model, HopeGPT, which won a competition for instilling hope, and requested feedback for improvement; link to the model was provided and they were guided by @solbus to share in a dedicated channel for visibility.

  • Alert on Developers’ GPT Ripping: User @russellsapalmer raised a serious concern about the developer account tapgpts allegedly copying the work of hundreds of developers, mimicking names, logos, descriptions, and sample prompts without credit, calling for OpenAI to monitor such activities.

  • ChatGPT Downtime and Communication: @c6565 questioned why outages of ChatGPT services are not publicly communicated, to which @7877 responded by providing a link to OpenAI’s status page, where operational updates and past incidents are detailed.

Links mentioned:

OpenAI Status: no description found

ā–· #prompt-engineering (16 messagesšŸ”„):

  • Custom Instructions Yield Mixed Results: User @realgavok observed that disabling custom instructions seems to enhance consistency. This sparked discussions, with @darthgustav. suggesting that the effectiveness of custom instructions varies heavily based on their content and structure.

  • XML Tagging Boosts GPT-4’s Selection Accuracy: In a tip to @magiciansinc, @darthgustav. recommended using XML tagging to improve GPT-4’s performance when sorting lists based on criteria, like picking cities ideal for a tropical vacation. The technique is claimed to be superior to using CSV or JSON formats.

  • Sample XML Tagging Provided by Darthgustav.: Further assisting @magiciansinc, @darthgustav. provided an example of XML tagging, listing various cities and associated activities to demonstrate how tagging could be utilized to enhance GPT-4’s output.

  • Using Discord Links to Share XML Format: In an unconventional move, @darthgustav. directed @magiciansinc to Discord links for examples, specifically through https://discord.com/channels/974519864045756446/1192857176562208798/1192857176562208798, which was part of the assistance provided.

  • Continuing the XML Tagging Exploration: @magiciansinc expressed intent to test the XML tagging method and @darthgustav. wished them luck, indicating a collaborative environment in the prompt-engineering channel.

ā–· #api-discussions (16 messagesšŸ”„):

  • Consistency in Custom Instructions: @realgavok raised a query about the effectiveness of custom instructions, noting that disabling them sometimes results in more consistency. @darthgustav. responded, indicating that consistency varies based on the content and quality of the instructions.
  • Advocating for Custom GPTs: @darthgustav. shared their preference for exclusively using Custom Instructions or Custom GPTs, implying satisfaction with their performance.
  • Enhancing List Filtering with Criteria: @magiciansinc is looking for advice on using GPT-4 to filter lists (e.g., cities or products) based on specific criteria. They reported receiving poor suggestions and explanations from the model so far.
  • XML Tagging for Better Results: @darthgustav. advised @magiciansinc to use XML tagging to note a city’s general properties, which should improve GPT-4’s performance. They also emphasized the importance of guiding the model properly.
  • Example of XML Tagging: When @magiciansinc asked for an XML tagging example, @darthgustav. provided a detailed sample, suggesting it may perform better than CSV or JSON based on their testing. They also referenced an external source for generating such data.

Mistral Discord Summary

Self-hosting and API wonders with Mistral 7B: Discussions across channels showed interest in self-hosting Mistral 7B and utilizing it with Python applications, with various users offering assistance and tool suggestions. Concerns around commercial application data privacy and technical issues with quantization affecting performance were raised.

The Quandary of Long Texts: Users debated on processing long texts with Mistral and the 32K token limit. While documentation mentions this limit, the practical token cap varies based on model size and task-specific conditions.

Frustrations and Recommendations in Fine-Tuning: The community reported challenges when fine-tuning Mistral 7B, such as persistence of old prompt responses and GPU memory difficulties on an RTX 4090. Additionally, the correct implementation for Mistral in the HF trainer and finding a good GGUF format model were subjects of inquiry.

Hearty Discussions on Deployment and Tool Integration: Participants exchanged experiences with integrating tools such as Deep Chat, highlighting its simplicity over more complex setups like Open Copilot. Personal experiences related to open-source projects and international moves in the tech sector were also shared among members.

Guidance for Aspiring Coders and LLaMa Musings: Recommendations for beginner coders pointed towards Harvard’s CS50 course and learning through hands-on experience. Curiosity was piqued by a Reddit discussion about Meta AI’s LLaMa 3 being trained on an impressive array of 600,000 H100s.

Mistral Channel Summaries

ā–· #general (31 messagesšŸ”„):

  • French Flair for Mistral 7B Self-Hosting Inquiry: User @bot7_ asked if it’s possible to self-host Mistral 7B and use it with a Python app. @kerunix confirmed it’s possible, while @tom_lrd noted it depends on the user’s OS and hardware, offering names of several relevant tools.

  • Pondering Long Texts for Mistral Processing: @lukasgutwinski inquired about the best way to process long texts (up to 100 pages) with Mistral, and whether Mistral Medium and Small both have a 32K token window. @i_am_dom suggested that Mixtral works effectively up to 16k tokens, but might not be stable beyond that threshold.

  • Seeking Easy Chatting with Mixtral: User @rod_____ wondered if there’s a way to chat with Mixtral by simply inserting an API key, to which @jortega_17718 responded with a link to the Hugging Face endpoints.

  • Langchain with Mistral API Clarification: @western_01 shared a success in using Mistral API with CrewAI via langchain, correcting an earlier mistake by pointing out the default API endpoint works perfectly.

  • 32K Token Limit Confirmation and Caveats: Both @jortega_17718 and @sublimatorniq addressed the alleged 32K token limit for Mistral’s generative endpoints, noting that while the documentation states this, practical limits often fall short, especially for smaller models or specific tasks.

Links mentioned:

mistralai/Mixtral-8x7B-Instruct-v0.1 Ā· Hugging Face: no description found

ā–· #models (22 messagesšŸ”„):

  • Self-Hosting Mistral Challenges: User @bot7_ inquired about how to self-host Mistral 7B and use it with a Python app, apologizing for their English as they are French.
  • Searching for Mistral 7B API: @rohit3389 started using ā€œMistral-7b-openorca.q4_0.ggufā€ via the GPT4All Python library and wondered if there is an API they could use with Python.
  • Clarification on Mistral’s Models: @tom_lrd responded that third-party servers would be needed to use specific finetunes like Openorca since Mistral’s API only serves models such as mistral7b-instruct.
  • Understanding LLMs and Seeking API Solutions: @rohit3389 seeks a faster API solution to avoid loading a heavy 4GB model and @tom_lrd suggests using tiny, small, and medium models through Mistral’s API, which won’t be exact in style but should be comparable or better.
  • Suggestions for Off-Guardrails Models: @dizzytornado seeks a Mistral model suitable for writing scripts with realistic, conflicted characters rather than happy, harmonious scenarios — @vhariational recommends the Dolphin-Mistral models with a link to their Hugging Face page and a Discord invite for further discussion.

Links mentioned:

cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser Ā· Hugging Face: no description found

ā–· #deployment (8 messagesšŸ”„):

  • Self-Hosting Mistral 7B Inquiry: @bot7_ asked if it’s possible to self-host Mistral 7B for use with a Python app, despite being unsure about their English proficiency. @akshay_1 confirmed that it is possible and reassured @bot7_ about their English.

  • Offering a Helping Hand in Deployment: @akshay_1 acknowledged the complexity of self-hosting Mistral 7B and offered expertise by asking @bot7_ to check their DMs for further assistance.

  • Concerns About Data Privacy with Commercial Application: @xxtinction expressed concerns about the utilization of Mistral 7B in a commercial application with sensitive data, questioning if the data will remain private or used by Mistral for training. They requested documentation clarification due to confusion with Mistral’s Privacy Policy.

  • Technical Issue with Quantization in Mistral 7B: @lauthu mentioned encountering an accuracy drop using TensorRT-LLM W8A8 Smooth Quant on Mistral 7B and is inquiring if others have experienced similar issues.

ā–· #finetuning (7 messages):

  • 7B Mistral in GGUF Format Inquiry: @bot7_ is searching for a good 7B Mistral in GGUF format but did not receive any response to the query.
  • Issue with Mistral in the HF Trainer: @bozoid. shared concerns about an incorrect implementation for Mistral in the HF trainer affecting performance during finetuning, which has yet to be officially addressed, according to @andrewwwwme.
  • Persistent Old Prompt Responses in Mistral: @dizzytornado reported an issue where Mistral keeps returning words from an old prompt, but did not receive a solution.
  • Finetuning Mistral 7B Challenges on RTX 4090: @kaizen0340 inquired about experiences finetuning Mistral 7B with LORA on an RTX 4090, mentioning difficulties with GPU memory. @enzodeg40 responded by asking if the CUDA_VISIBLE_DEVICES was configured correctly.

ā–· #showcase (104 messagesšŸ”„šŸ”„):

  • Deep Chat’s Ease of Integration: @ovi8773 celebrated the simplicity of integrating Deep Chat—with only one line of code and no sign-up, it beats setting up full-stack alternatives like Open Copilot, which requires more extensive configuration.
  • Open Copilot’s Complex Setup: In contrast to Deep Chat’s ease of use, @ovi8773 remarked that Open Copilot’s setup process is cumbersome, despite being an open-source project with customizable options. They considered Deep Chat’s superior in terms of developer convenience and implementation.
  • Accolades for Project Contribution: Deep Chat garnered admiration from @ethux, who appreciated the project enough to give it a star on GitHub, sharing enthusiasm for open-source contributions.
  • Discussing Global Tech Hubs: The conversation extended into a discussion about the cost of living and tech hubs across the globe. @ovi8773 and @ethux exchanged insights on housing prices, the appeal of various countries, and tax incentives such as the 30% tax ruling in the Netherlands.
  • Personal Journeys in Tech: @ovi8773 shared their personal experience of taking a career break from Software Engineering to focus on open-source projects, as well as contemplating a move to a different country. This sparked a discussion with @ethux about the pros and cons of relocating, especially in the context of the tech environment and living standards.

Links mentioned:

ā–· #random (4 messages):

  • Beginner Coding Advice Sought by @fufuespade: User @fufuespade inquired about how to start learning coding and which resources, such as forums or YouTube channels, would be recommended for beginners.
  • Harvard Coding Course Recommended: @jakobdylanc suggested checking out CS50 on YouTube, a free Harvard course with comprehensive lecture videos, for beginner coders.
  • Hands-On Learning Approach by @akshay_1: @akshay_1 advised @fufuespade to learn coding by directly implementing an idea and gaining practical experience.
  • Conversation on Meta’s LLaMa: @yamashi shared a Reddit link discussing Meta AI’s large language model, LLaMa, and Mark Zuckerberg’s comment on training LLaMa 3 on 600,000 H100s.

Links mentioned:

Reddit - Dive into anything: no description found


HuggingFace Discord Discord Summary

  • DeciTech Drops Dual Model Delights: DeciTech released the DeciCoder-6B, supporting eight programming languages, and DeciDiffusion v2.0, an image generation model boasting 2.6x speed over Stable Diffusion v1.5. Explore DeciCoder-6B on Hugging Face and test them on Colab or Hugging Face Space.

  • FABBLER.AI Calls for Creative Testers: FABBLER.AI is seeking beta testers for an innovative tool that crafts narrative stories, convertable to videos. Check out the demo on YouTube and explore the tool on Hugging Face Space for Proteus-V0.1.

  • GPU Hosting for Heavy Models? EU Wants to Know!: A member is compiling a list of GPU hosting providers in the EU capable of supporting 13B to 70B models for tasks such as image-to-text and email triage. The request is for low latency and on-demand use, with no specific providers or solutions provided in the discussion.

  • Phi-2 Model Weights, Beware the Exclamation Invasion: After an update to the Phi-2 model, a user experienced issues with FP16 inference, leading to exclamation mark outputs, resolved by switching to device_map="auto". Details for developers facing similar issues can be found here.

  • Struggle of the Syntax and Model Queries in Computer Vision: Some users encountered syntax errors while training models, resolved by community suggestions, while others sought advice on object tracking without a response documented. A beginner working with Indian food datasets received guidance from peers on how to move forward.

HuggingFace Discord Channel Summaries

ā–· #announcements (1 messages):

  • DeciTech Unveils DeciCoder-6B and DeciDiffusion v2.0: Two new models have been introduced: DeciCoder-6B, which supports eight programming languages and outperforms competitors in HumanEval benchmarks, and DeciDiffusion v2.0, an image generation model that is 2.6 times faster than Stable Diffusion v1.5. Examine the details on DeciCoder-6B and try them out in Colab and Hugging Face Space.

  • Revving Up Vehicle Speed Estimation: @SkalskiP presents a tutorial on real-time vehicle speed estimation, involving vehicle detection using YOLOv8, tracking with ByteTrack, and complexities of distance calculation. Catch the tutorial here.

  • Fighting Hallucinations in Language Models: A new research discusses detecting and editing hallucinations in language model outputs, introducing a retrieval-augmented model (FAVA) that outperforms ChatGPT and LLama2 Chat. Discover the taxonomy, model, and demo on the project website.

  • Art and AI: A Creative Partnership: @fffiloni writes on the critical role of art and design in advancing AI capabilities, encouraging collaboration between artists, designers, and AI researchers. Read the full article on the Hugging Face Blog.

  • Embracing French Text With Lyon NLP Group: lyon-nlp-group extends the Massive Text Embedding Benchmark (MTEB) to French, aiding the evaluation and comparison of text embedding methods in the French language. The detailed analysis is available in the blog post.

Links mentioned:

  • [@harpreetsahota on Hugging Face: ā€œāœŒšŸ¼Two new models dropped today šŸ‘‡šŸ½

  • šŸ‘©šŸ¾ā€šŸ’» šƒšžšœš¢š‚šØššžš«-šŸ”šā€¦ā€](https://huggingface.co/posts/harpreetsahota/814290289723145): no description found

  • [@SkalskiP on Hugging Face: ā€œReal-Time Vehicle Speed Estimation Tutorial šŸš—šŸ’ØšŸ’ØšŸ’Ø

TL;DR: Watch theā€¦ā€](https://huggingface.co/posts/SkalskiP/421333989856413): no description found

  • [@s3nh on Hugging Face: ā€œGPU Poor POV: Building a RAG which solves specific task.

Everyone lovesā€¦ā€](https://huggingface.co/posts/s3nh/683576905550627): no description found

ā–· #general (79 messagesšŸ”„šŸ”„):

  • Phi-2 Model Weights Troubles: User @admin01234 described an issue with the Phi-2 model where after updating files, only exclamation marks were being generated. A solution mentioned was to switch from torch_dtype="auto" to device_map="auto" in the model’s configuration. The problem along with a code snippet was discussed in this forum post.

  • BERT Model Token Limits: @redopan706 inquired about modifying the maximum token limit of the BERT Model, to which @stroggoz suggested that they Read the documentation on huggingface, indicating that model configuration details can be found there. Another user, @vipitis, suggested looking for a different pre-trained model with a larger context size than attempting to retrain or interpolate.

  • Fine-Tuning Model Bit Size Concerns: @samuelcorsan sought advice on converting a model from 4-bit to 8-bit quantization. The discussion with @doctorpangloss revealed that backpropagation with 8-bit might not be practical, and they suggested using LoRA training in bf16 or fp32 instead.

  • AI Generated Portraits on macOS: @itscharliecrown expressed a desire to train an AI with personal images to generate portraits using the Stable Diffusion Web UI-UX. In response, @doctorpangloss noted the feasibility of training on macOS but warned about the significantly reduced speed compared to platforms that support CUDA like Windows or Linux.

  • Hugging Face System Outages: Users @theyruinedelise and @jo_pmt_79880 reported Hugging Face platform outages, experiencing 504 errors and website loading issues, humorously suggesting ā€œhungry hamsters… nibbling on the wiresā€ as a possible cause for the downtime.

Links mentioned:

ā–· #today-im-learning (4 messages):

  • Early Bedtime for a Knowledge Seeker: User @mastermindfill expressed appreciation and mentioned plans to save provided links for future use before heading off to bed. No specific links or topics were discussed in these last messages.

ā–· #cool-finds (4 messages):

  • SDXL V2 Models Released: @_vargol stated that h94 has released version 2 models for SDXL, which show improvements but still require a bias towards photorealism.
  • ZavyChromaXL Recommendations: @meatfucker mentioned having great results with the zavychromaxl models on the previous SDXL version, although they haven’t tried the new one yet.
  • Flexibility of Zavy Models: Continuing the discussion, @meatfucker noted their success in achieving both realistic and cartoony outputs using the zavychromaxl model.

ā–· #i-made-this (7 messages):

  • FABBLER.AI Seeks Beta Testers: @piotr_fabbler.ai is calling for beta testers to try out a new AI tool designed for creating narrative stories that can be exported as videos. Interested users can contact Piotr for a unique storytelling experience and provide feedback, with a brief showcase video available here.

  • Proteus-V0.1 Launched on Hugging Face Spaces: @ehristoforu shared a link to the new Hugging Face Space Proteus-V0.1 that runs on zerogpu. @osanseviero commented, showing interest and inquiring about the zerogpu experience.

  • Curiosity About Model Improvement: User @merve3234 inquired whether there has been an improvement in @ehristoforu’s model compared to the previous version that uses 1.5, indicating interest in the model’s development progress.

  • Suggestion for Displaying Upscaled Images: @lunarflu complimented @ehristoforu on the simple and effective nature of their model and suggested an enhancement to display both original and upscaled images side by side for a better comparison.

  • AI Playgrounds and Model Experiments on GitHub: @vishyouluck shared their GitHub repository vishalmysore/AI which serves as a playground for AI examples using different models. They invited others to explore and share their thoughts on the repository’s content.

Links mentioned:

ā–· #diffusion-discussions (3 messages):

  • Seeking Help for Loading Animated Models: User @latentspace inquired about the possibility of loading animated models from a single .ckpt or .safetensord file for new stable diffusion versions. In response, @sayakpaul suggested opening a discussion on GitHub and promised to involve relevant experts in the query.
  • Exploring GPU Hosting Options for Large Models: @johntdavies is seeking advice and a comprehensive list of GPU hosting providers that support hosting 13B to potentially 70B models, with use-cases spanning image-to-text in messaging and email triage, with a preference for EU-based services. He is currently gathering data to create a proposal.

ā–· #computer-vision (27 messagesšŸ”„):

  • Syntax Slip-Up: User @swetha98 encountered an error when trying to train the donut docvqa model and shared the traceback log. @gugaime pointed out that there might be a typo with an unnecessary backslash (\) in the code string, and suggested adding a space.

  • Object Tracking in CV: @curiousbro inquired about a good Python computer vision model for tracking objects and collecting data, but did not receive a response in the provided message history.

  • Journey Through Notebook Troubles: @xeus69 had issues running a notebook and installing accelerate, a detail highlighted by @meatfucker who suggested reviewing error messages and ensuring the correct version is installed. The issue was eventually resolved after @xeus69 cleared the notebook cache.

  • First-time Dabble in Machine Learning: Newcomer @xeus69 mentioned being a beginner and getting assistance from @meatfucker on initial forays into machine learning using Colab. Discussion indicated @xeus69 is working on something involving Indian food, uncovered by @meatfucker deducing from an output directory.

  • Captioning Models Discussion: @merve3234 questioned @xeus69’s choice of models for captioning over a more grounded model like KOSMOS-2, hinting at a need for accuracy in captions, relevant for document understanding tasks. There was no recorded response from @xeus69 on this query.

ā–· #NLP (2 messages):

  • Cache Configuration for Transformers in Docker: @asprtnl_50418 provided a snippet on how to change the cache directory for Transformers within a Dockerfile by setting the TRANSFORMERS_CACHE environment variable. They also included instructions on how to mount a volume to link the local cache to the container cache when starting a Docker container.

ā–· #diffusion-discussions (3 messages):

  • Inquiry about loading animated .ckpt models: @latentspace asked if it’s possible to load animated models from a single .ckpt or .safetensord file, mentioning versions for SD v15 and SDXL for use with an animatediff pipeline, but did not provide further details on their setup or context.
  • GitHub Discussion Suggestion: @sayakpaul responded to @latentspace, suggesting opening a discussion on GitHub and providing some links so that they could tag relevant contributors to assist with the question.
  • Looking for GPU Hosting Options: @johntdavies sought recommendations for a discussion group or thread regarding GPU hosting services, particularly in the EU, for running 13B and possibly 70B models, with needs varying from low latency for image to text in messaging to on-demand use for email triage and replies. They are also seeking a list of companies to prepare for a proposal.

LAION Discord Summary

  • DougDoug’s AI Comedy Unleashed: Discussion indicated that YouTuber DougDoug has created an AI character with NSFW elements using ChatGPT, alongside ElevenLabs for voice generation. The AI-centric comedy approach is facilitated by his open-sourced project on GitHub.

  • AI Parody Law Panic: A controversial ā€œNo AI FRAUDā€ Act, seen as potentially unconstitutional, spurred discussion about its significant impact on parody and comedic AI content. An informative breakdown on the implications was provided in a Reason.com article.

  • Language Rules Linguistic Rumble: Prescriptive versus descriptive language roles in the construct of dictionaries were debated, concluding that dictionaries are considered historical recordings of language use rather than rule-enforcing entities.

  • Upscaling Video with an AI Eye: Technical discussions arose on the need for temporally-aware models for video upscaling, touching on issues like inconsistent frame details, and referenced OpenModelDB as a resource.

  • WhisperSpeech Flip for TTS: The inversion of OpenAI’s Whisper model to create the Open Source text-to-speech system, WhisperSpeech, was highlighted with a related GitHub repository. Additionally, a discussion on the evaluation of multilingual LLMs and a search for a paper on continuous token embedding methods indicates ongoing research queries and advancements.

  • Unlocking Visual Vim through SSMs: A new arXiv paper introduces Vim, a vision backbone using bidirectional Mamba blocks, found here, while performance analysis of LLMs is explored in another detailed study available here.

LAION Channel Summaries

ā–· #general (78 messagesšŸ”„šŸ”„):

  • DougDoug’s AI Robot Stream - How It’s Done: @ignizherz and others discussed how YouTuber DougDoug managed to create an AI character with NSFW elements which he claims uses ChatGPT. It was mentioned that DougDoug uses the OpenAI API along with ElevenLabs for the voice, and he has open-sourced a similar project on GitHub.
  • Trouble Brewing for Parody and AI: Several users, such as @thejonasbrothers, @chad_in_the_house, and @.undeleted, shared concerns and criticisms over the potentially unconstitutional ā€œNo AI FRAUDā€ Act that could seriously restrict parodies and comedic content based on First Amendment rights. A reason.com article was shared discussing the risks associated with the proposed regulation.
  • Language Rules Debated: A lengthy debate unfolded regarding prescriptive versus descriptive language rules, featuring users like @mkaic, @clock.work_, and @atlasunified. The conversation tackled the fluidity of language and the role of dictionaries, culminating in the recognition that dictionaries are descriptive records, not prescriptive laws.
  • AI Video Upscaling Discussed: @realz inquired about the appropriate tool for upscaling videos without causing inconsistent frame details, which led to a discussion with @pseudoterminalx about the need for temporally-aware upscaling models and the technical aspects of video transcoding. Links and information about available upscaling models were shared, including temporal considerations.
  • Training WhisperSpeech for New Languages: @__._astro_.__ asked about the requirements for training WhisperSpeech on a new language, pointing out issues with current support and higher WER (Word Error Rate) compared to English. No specific details or estimates were provided in the channel regarding the hours of audio needed for such training.

Links mentioned:

ā–· #research (5 messages):

  • Whisper’s Inversion for Text-to-Speech: @helium__ shared a GitHub repository named WhisperSpeech, an Open Source text-to-speech system built by inverting Whisper.

  • New Paper on Vision Backbone with SSMs: @thejonasbrothers provided a link to an arXiv paper discussing a new vision backbone called Vim, which uses bidirectional Mamba blocks for image sequence representation and achieves high performance on various tasks.

  • Authors of LLM Performance Analysis Paper Identified: In another message by @thejonasbrothers, they shared an arXiv paper co-authored by several individuals, showcasing their work related to Long Language Models (LLMs).

  • Inquiry about Continuous Token Embedding Paper: @JH asked for help locating a paper that studies continuous token embedding as opposed to discrete token embedding in LLMs.

  • Evaluation Methods for Multilingual LLMs: @alyosha11 raised a question regarding what evaluation methods make sense for multilingual LLMs in the absence of existing datasets.

Links mentioned:


Eleuther Discord Summary

  • Hypernets, the New Efficiency Frontier?: Discussions ensued about the potential of using hypernets to memorize weight matrices in mixture of experts (MoE) to possibly reduce parameters and boost efficiency, though no conclusive results were shared.

  • Amazon Fuels LLM Research with Resources: Amazon’s call for proposals through the Amazon Research Awards was shared, offering grants and AWS credits to support LLM projects, without being an outright promotion.

  • Evaluating LLMs Across Languages: Conversations highlighted tokenization issues in non-English languages and the lack of datasets for evaluating multilingual LLMs with the dominant use of BLEU for metrics. There was also mention of Self-Rewarding Language Models paper and the Self-Rewarding approach which pushes language models beyond current systems.

  • HELM and Evaluation Harness Differences Explained: The distinction between HELM and evaluation harness was clarified — with evaluation harness dealing with orchestration problems, and HELM outlining methodologies for evaluations. Furthermore, advice was sought on how to organize translated evaluation tasks within the eval-harness framework, which could be placed under a tasks/translations/ directory.

  • Pull Request Alerts for GPT-NeoX Devs: In the gpt-neox-dev channel, a pull request was highlighted for fixing defaults in a Docker container and a unit test for the evaluate function. There are plans to update apex for better Python and PyTorch compatibility, although its build time would require optimization.

  • Robotic Progress and Public Participation Encouraged: Updates on Robot Kyle 2a0a’s training—now at 140 million steps—were shared, and the community was invited to partake by accessing the source code to train their own versions. Participants can view live training sessions of Kyle on YouTube.

Eleuther Channel Summaries

ā–· #general (28 messagesšŸ”„):

  • Exploring Hypernet Efficiency: User @Hawk and @stellaathena engaged in a brief discussion about using hypernets for memorizing weight matrices in mixture of experts scenarios to potentially lower parameters and improve efficiency.
  • Amazon Push for LLM Research: @desik_agi, from Amazon, shared a call for proposals through the Amazon Research Awards, offering grants and AWS promotional credits for LLM projects, and clarified it is not a promotion but an opportunity for those seeking compute resources.
  • Triton Custom Backend Inquiry: User @gabriel_syme is inquiring if anyone has experience with setting up a custom backend server for Triton.
  • LM Evaluation Harness Queries: @hamelh is seeking assistance on utilizing the eval harness to determine which tasks require logprobs and provides a GitHub search link to aid in this understanding.
  • Discussion on Multilingual LLM Evaluation: Users @alyosha11 and @catboy_slim_ are contemplating evaluation metrics and datasets for testing multilingual capabilities in LLMs, with BLEU identified as a standard but datasets being predominantly in English.

Links mentioned:

Build software better, together: GitHub is where people build software. More than 100 million people use GitHub to discover, fork, and contribute to over 420 million projects.

ā–· #research (26 messagesšŸ”„):

  • Tokenization Troubles in Non-English Languages: @xylthixlm highlighted challenges with tokenization for non-English languages, with a particular focus on Chinese, Japanese, and Korean (CJK), which @stellaathena confirmed.

  • Mystery of Time-Aware LLMs: @bluerune referenced an unidentified paper or study that suggested LLMs might generate shorter token outputs when they ā€œthinkā€ it’s December versus May, based on a tweet showing statistically significant results.

  • Alphageometry: AI Surpasses Human Mathematicians: @the_alt_man shared a DeepMind blog post about AlphaGeometry, an AI system that successfully solves difficult geometry problems at the level of a human Olympiad gold-medalist.

  • Self-Rewarding Language Models: @pizza_joe introduced a paper on Self-Rewarding Language Models, outlining an approach where a language model uses LLM-as-a-Judge prompting to reward itself, resulting in performance surpassing many existing systems. The statement prompted a discussion led by @xylthixlm and others on the potential of LLMs having sufficient information to achieve higher performance with the right tuning algorithm.

  • The Paradox of Instruction Tuning: @catboy_slim_ and @fern.bear debated the concept of information retention in LLMs during tuning, with a focus on whether it’s truly a loss of information or a failure to specifically direct the model’s output. @catboy_slim_ mentioned LoRA weights as a technique that might mitigate information loss during fine-tuning.

Links mentioned:

  • Self-Rewarding Language Models: We posit that to achieve superhuman agents, future models require superhuman feedback in order to provide an adequate training signal. Current approaches commonly train reward models from human prefer…
  • AlphaGeometry: An Olympiad-level AI system for geometry: Our AI system surpasses the state-of-the-art approach for geometry problems, advancing AI reasoning in mathematics
  • Tweet from Rob Lynch (@RobLynch99): @ChatGPTapp @OpenAI @tszzl @emollick @voooooogel Wild result. gpt-4-turbo over the API produces (statistically significant) shorter completions when it ā€œthinksā€ its December vs. when it thinks…

ā–· #interpretability-general (1 messages):

jsai_51448: What is mech interp vs. concept interp vs. dev interp?

ā–· #lm-thunderdome (9 messagesšŸ”„):

  • In Search of Clarification on Harness vs. Helm: @aloo_kachalu sparked a conversation regarding the comparison between evaluation harness and HELM (Holistic Evaluation of Language Models), leading to a discussion on their functionalities and the philosophy behind them.

  • HELM Confusion Unraveled: @stellaathena clarified that the evaluation harness focuses on the orchestration problem of running eval tasks on various models, whereas HELM promotes a recommended methodology for carrying out evaluations.

  • Evaluating Models in Greek: @zoulr shared their journey on evaluating models using Greek tasks translated from English ones like ARC and sought advice on the preferred directory format for language-specific tasks within the eval-harness repository.

  • Organizing Translated Evaluation Tasks: @hailey_schoelkopf recommended that translated tasks could be organized under a specific directory for translations in the eval-harness tasks section, with proposals such as tasks/translations/ or arc_multilingual/.

  • Specific GitHub Pull Request Shared: @hailey_schoelkopf posted a link to a particular GitHub pull request regarding pinning the datasets dependency at 2.15 in the eval-harness repository: Pin datasets dependency at 2.15.

Links mentioned:

ā–· #multimodal-general (1 messages):

  • Robot Kyle Takes a Stroll: technosourceressextraordinaire shared an update on Robot Kyle 2a0a undergoing cooldown training runs on flat ground which might improve its motion on slopes. They mentioned the run is 20 million steps, culminating in 140 million steps, and invited others to access the source code and train their own version of Kyle at NekoCatGame/RagdollTrainer.

  • Training Spectators Welcome: A live training session for Robot Kyle 2a0a is available, showcasing how to train robotic walkers using Unity Machine Learning Agents, which can be viewed on YouTube at Live AI Robot Training.

Links mentioned:

ā–· #gpt-neox-dev (2 messages):

  • A Commitment to Fix: @catboy_slim_ acknowledged a needed fix and expressed an intent to address it soon without specifying the issue.
  • Minor Changes and Fixes in Pull Request: @catboy_slim_ highlighted a pull request that includes minor changes, such as the default output for the Docker container, as well as a fix for a unit test for the evaluate function.
  • Attempts to Optimize Apex: @catboy_slim_ is looking to update the apex version to ensure compatibility with newer versions of Python and PyTorch. However, there’s a challenge with the extended build time for apex, which @catboy_slim_ plans to address by stripping it down in a fork.

Links mentioned:

Minor changes by segyges Ā· Pull Request #1125 Ā· EleutherAI/gpt-neox: Changes default output for docker container Renames docker pythia config to indicate it is docker pythia config Fix unit test for evaluate function


LlamaIndex Discord Discord Summary

  • RAG Gets a Turbocharge in LlamaIndex Hackathon: LlamaIndex ignites competition around Retriever-Augmented Generation with a $8,000 prize pool for their RAG-A-THON Hackathon, urging participants to register for the event to be hosted at DataStax HQ in Santa Clara, CA from February 2nd to 4th.

  • New Course Alert! Vote for LlamaIndex Learning: LlamaIndex intends to craft an online course and is conducting a poll to identify the community’s topic of interest. Community members can voice their preferences in the Twitter poll.

  • Unlocking RAG’s Full Potential with Advanced Queries: LlamaIndex proposes enhancing Retriever-Augmented Generation (RAG) utilizing a query understanding layer; advancements suggested include techniques like HyDE and iterative reasoning. Details on improving RAG can be explored further in their Twitter thread.

  • Community Engineers Tackle LlamaIndex’s Technical Challenges: From effective methodologies to handle large PDFs, intricacies of using metadata in fetching nodes, to technical advice on Azure Key/LlamaIndex integrations, and strategies for summarizing lengthy documents—the engineers shared guidance on various topics. Notable contributions include @whitefang_jr’s metadata query approach and @cheesyfishes’s advice on integrating Azure keys with LlamaIndex detailed in their documentation.

  • Navigating Large Documents with AI Models: In the quest to efficiently work with extensive documents for creating tables of contents and summaries, while ensuring privacy, @takuma.fusioncloud.ai sought community assistance. @greyman_007 recommended exploring the Zephyr model, although specific resources were not provided.

LlamaIndex Discord Channel Summaries

ā–· #blog (4 messages):

  • Exploring Composable Retrieval: LlamaIndex discusses the concept of a composable hierarchy in advanced retrieval systems. Tweet explains linking smaller texts to bigger ones as part of the retrieval process.

  • Interest Gauge in LlamaIndex Course: LlamaIndex is considering creating an online course and is polling for the most important topic users want to learn about. Participate in the poll or specify further in the replies.

  • $8,000 RAG-A-THON Hackathon Announcement: LlamaIndex announces doubling the prize to $8,000 for their first in-person hackathon focused on Retriever-Augmented Generation technology. Register for the event and note that at least one team member must be present at DataStax HQ in Santa Clara, CA from February 2nd to 4th.

  • Enhancing RAG with Advanced Query Transformations: LlamaIndex suggests improving Retriever-Augmented Generation (RAG) by incorporating a query understanding layer, mentioning techniques such as HyDE, sub-question decomposition, iterative reasoning, or routing. Learn more about improving RAG.

Links mentioned:

LlamaIndex RAG Hackathon (in-person only): Think Beyond Chatbots: Unleashing the Potential of AI Agents

ā–· #general (34 messagesšŸ”„):

  • PDF Page Source Tracking: @whitefang_jr advised @alvarojauna to locate information about page numbers in a metadata by printing response.source_nodes to handle a large PDF inquiry.

  • Fetching Nodes by Metadata Query: @whitefang_jr responded to @vozervn by suggesting the use of docstore. Later exchanges imply difficulty in retrieving specific nodes by metadata in PGVector, but @whitefang_jr eventually linked to a relevant section of the LlamaIndex GitHub repo for further guidance.

  • Assistance with Advanced QA Tools Over LlamaIndex: @risk_seeking inquired about third-party tools for QA over LlamaIndex documentation and was seeking recommendations from the community.

  • Azure Key Integration with LlamaIndex: @cheesyfishes helped @zubeen_ resolve an issue regarding the integration of Azure provided OpenAI keys with LlamaIndex by referencing documentation and suggesting the use of AzureOpenAI with a potentially custom httpx client for header management.

  • Challenges with Summarizing Lengthy Documents: @ben25635 sought guidance for summarizing a comprehensive 500-page report, to which @nerdai recommended a hierarchical approach of section-wise summarization before crafting a top-level summary.

Links mentioned:

ā–· #ai-discussion (3 messages):

  • Seeking Assistance with ChatGPT for Large Documents: @takuma.fusioncloud.ai is looking for help on how to utilize ChatGPT for working with large documents to create tables of contents and summaries, as well as maintaining privacy for a collection of 10-12 books.
  • Zephyr Model Suggested for Large Document Handling: @greyman_007 suggests using the Zephyr model with LlamaIndex on Google Colab to handle the task mentioned by @takuma.fusioncloud.ai, but no further details or links were provided.

Latent Space Discord Summary

  • AlphaCodium Debuts on GitHub: AlphaCodium, an open-source code generation tool inspired by DeepMind’s AlphaCode, has been announced and released on GitHub, with details on its flow engineering in a dedicated paper.
  • Karpathy’s Acknowledgment and YouTube Insights: Andrej Karpathy has reviewed and acknowledged AlphaCodium’s capabilities, and further insights can be gleaned from an AI Explained YouTube video.
  • Query on AlphaCodium’s IDE Plugin: Discussions include a question regarding the open-source status of the AlphaCodium IDE Plugin, which is noted to be Apache 2.0 licensed.
  • Meta’s Extensive GPU Deployment Plans: Meta has publicized their aim to deploy an equivalent of 600,000 H100 GPUs by the end of the current year; conversation included talk of GPU availability and prompted a reminder to loop in a key participant with a Tweet link.
  • Gradient Dissent Podcast Recommended for LLM Insight: For those interested in LLM training and deployment, @swyxio highlights a recommendation to listen to a podcast episode of Gradient Dissent featuring Stella Biderman of EleutherAI.

Latent Space Channel Summaries

ā–· #ai-general-chat (27 messagesšŸ”„):

  • AlphaCodium Launches: @itamar_mar announced the official launch of AlphaCodium, an open-source code generation tool that competes in code contests, inspired by DeepMind’s AlphaCode, and invited questions from users. The project has been published on GitHub.
  • Paper Discussion and Inquiry: @slono engaged in a discussion on the paper related to AlphaCodium, probing about the extent of prompt engineering and the effort placed on refining agent steps, resulting in a response from @itamar_mar that 85% effort went into flow design.
  • Tech Community Spotlight: @itamar_mar shared the exciting news that Andrej Karpathy has reviewed their work on AlphaCodium, and @swyxio congratulated them, sharing a link to Karpathy’s Twitter and a relevant AI Explained YouTube video.
  • Tools From the Codebase: @lightningralf inquired about the open-source status of the AlphaCodium IDE Plugin, noting the PR-Agent is Apache 2.0 licensed.
  • Meta’s GPU Arsenal Revealed: @guardiang shared a tweet from @soumithchintala disclosing Meta’s aim to deploy the equivalent of 600,000 H100 GPUs by year’s end, prompting a discussion on GPU availability and @swyxio highlighted someone (@194927177265840128) to loop into the conversation.

Links mentioned:

ā–· #llm-paper-club (1 messages):

  • Prepping for Pythia Paper Discussion: @swyxio flagged an old but informative podcast episode from Gradient Dissent for an upcoming discussion on the Pythia paper, featuring an interview with Stella Biderman from EleutherAI circa 2022. Check it out for insights into LLM training and deployment: Gradient Dissent Podcast.

Links mentioned:

How EleutherAI Trains and Releases LLMs: Interview with Stella Biderman — Gradient Dissent: Exploring Machine Learning, AI, Deep Learning, Computer Vision — Overcast: no description found


LangChain AI Discord Summary

  • LangChain Updates Derailed by Outdated Docs: @daslav flagged that the LangChain documentation is outdated, with issues surrounding the from langchain import hub code. This, along with @Sovok facing an unresolved error with their RAG system and @Behlal encountering issues with the quickstart tutorial retrieval chain on NVIDIA 4090 GPU, suggests a need for documentation review and better error diagnostics.

  • Nesting Knowledge for LangServe: @veryboldbagel discussed advanced usage for nested information within LangServe, advocating for TypedDict and pydantic for precise serialization, as seen in server.py example. This advice aligns with their call to adopt the recently merged astream_event for streaming support in UI, opening possibilities for enhanced interactive systems.

  • API and Frontend Synchronization in the Spotlight: LangServe users, as per insights by @veryboldbagel, should be mindful of the openai_assistant API’s requirement for more complex input beyond simple prompts and look into leveraging the Server-sent Events (SSE) web standard for streaming data on the frontend with references to Mozilla’s SSE guide.

  • Need for LCEL in MapReduce and Search for SQL Interface: Participants in the discussion highlighted the absence of LCEL for MapReduce, brought up by @pramodhgopalan_80290, signaling an impending upgrade in chain language flexibility. Concurrently, @meq__ was recommended a tool named vanna for an open-source natural language to SQL query interface, presenting a potential solution for intuitive data querying.

  • Innovations and Queries in AI Design and Productivity: AI is shaping the design world with neThing.xyz (neThing.xyz) and Langsmith powering the intersection of CAD and generative AI, as @rawwerks seeks feedback. @_anubix initiated a conversation about tools boosting productivity, while @esxr_ praised Olama and Langchain for revolutionizing workflows, inviting peers to their AI-centric blog (esxr.io).

LangChain AI Channel Summaries

ā–· #general (14 messagesšŸ”„):

  • No LCEL for MapReduce Yet: @pramodhgopalan_80290 inquired about the lack of LCEL (LangChain Expression Language) versions for MapReduce and Stuff Summarization, pointing to the documentation which only lists legacy chains. They found information about work in progress to create LCEL versions of all chains for easier modification and native support for streams.

  • Image Retrieval from DuckDuckGo: @solononforever3 asked if it is possible to retrieve images using the DuckDuckGo tool, but did not receive a direct response.

  • Embedding Markdown Data for Chatbots: @xery. is planning to embed over 400 markdown files for a YouTube-based repair guide chatbot and is unsure about the optimal chunk size for embedding each markdown file separately.

  • Searching for Open Source SQL Query Language Interface: @meq__ sought an open-source natural language to SQL query interface and recalled seeing one mentioned in the channel previously. @roi_fosca suggested the name vanna in relation to the query.

  • Out-of-Date Documentation for LangChain: @daslav reported that the LangChain documentation appears outdated, citing code specifically involving from langchain import hub that no longer exists.

  • Repeating Answers Puzzle: @seththunder speculated that the reason for repeated answers in a previous user’s query might be due to streaming the response, though this was in the context of embedding data using a markdown text splitter.

  • Looking for LangSmith Hosting and Enterprise Plans: @muthu1823 requested contact information or advice regarding hosting their own LangSmith environment and inquired about the availability of an enterprise version or pricing.

  • RAG System Error Troubles: @Sovok experienced an unspecified error with their RAG (Retrieval-Augmented Generation) system and shared frustration about not understanding the cause, referencing an inability to open a header file.

  • Quickstart Tutorial Retrieval Chain Issue: @Behlal reported an error while attempting to run the retrieval chain in the quickstart tutorial using Ollama and Llama2 on a system equipped with an NVIDIA 4090 GPU and Ubuntu OS.

Links mentioned:

Chains | šŸ¦œļøšŸ”— Langchain: Chains refer to sequences of calls - whether to an LLM, a tool, or a

ā–· #langserve (7 messages):

  • Nesting with TypedDict and pydantic: @veryboldbagel provided an example of nested information usage in LangServe with server.py. They suggest using TypedDict for more precision and pydantic for object serialization, with guidance to inherit from Custom User Types.

  • Detailed API Implementation Referenced: @veryboldbagel highlighted the openai_assistant API’s requirement of additional information beyond a simple prompt, sharing links to specific implementation examples through base.py at L98 and base.py at L79.

  • RemoteRunnable Client for Svelte Custom UIs: @veryboldbagel discussed the use of Langchain-js’s remote runnable client, providing a link to the API, which facilitates the creation of custom UIs with Svelte.

  • Configurable Runnables and Models: In a message by @veryboldbagel, the use of configurable runnables is explained as part of the LangChain Expression Language, with a prompt to discuss further in langserve for community benefit and better discoverability of solutions.

  • Handling Streaming Data in Frontend: @veryboldbagel responded to @hiranga.g’s query about streaming data to the frontend, suggesting starting with server-sent events (SSE) as a web standard and looking into sample applications using SSE before diving into RemoteRunnable. They shared a Mozilla resource for reference.

  • Streaming Support on Langchain-Core: @veryboldbagel pointed to a recently merged RFC on Langchain-core that introduces astream_event for better streaming support in UI, promising to try adding it to langserve within a week. They provided a discussion link for further details.

Links mentioned:

ā–· #share-your-work (4 messages):

  • neThing.xyz takes shape with Langsmith: User @rawwerks is utilizing Langsmith to facilitate tracing and evaluation of neThing.xyz, a text-to-3D generative AI aimed at CAD & engineering applications. They welcome any feedback on the project, which promises a new way to interact with AI in the field of design.

  • Tools that amplify productivity: User @_anubix queried the community about tools that substantially increase daily productivity.

  • Ollama and Langchain Revolutionize Daily Workflows: @esxr_ shared that Ollama and Langchain have dramatically changed how they work, allowing them to build custom solutions. They’ve also customized the Olama WebUI for their use, which significantly benefits their productivity.

  • AI Enthusiast Blogs About AI Explorations: @esxr_ mentioned their blog esxr.io, where they journal their AI findings and experiences, indicating a particular interest in the broader domains of AI and its applications.

Links mentioned:

  • neThing.xyz - AI Text to 3D Model: AI powered text-to-3D models
  • Pranav Dhoolia: I am an AI enthusiast, keen on exploring the vast yet interesting domain of Artificial Intelligence. I use this blog as a collaborative notepad for my findings

DiscoResearch Discord Summary

  • Neue Deutsch-sprachige Modell on the Block: DiscoLM German 7b, trained on 65 billion tokens and equipped for English, German, and translation tasks, supports RAG and function calling. Questions on performance metrics for DiscoLM German 7b were asked yet no specific benchmarking data was provided.

  • Benchmarking Emotions vs Instructions: In the benchmark_dev discussion, a potential addition of a complex reasoning section was considered to measure both emotional intelligence and complex instruction following. The surprise was shown at the high-ranking of 7b models and a discussion ensued regarding benchmarking criteria focusing strictly on emotional intelligence.

  • Longer Code Snippets Wanted: An observation was made in embedding_dev about performance drops in code documentation retrieval beyond a 512 token limit, suggesting a trial with jina encodings and extended chunk sizes.

  • Axolotl Primes for Polish: Upcoming sharing of training code and configurations for the Axolotl model was discussed in discolm_german, with a note on possible training data/code sharing and RAG-focused collaboration. User-reported glitches on a demo page received prompt attention, underscoring active support and operational intent.

DiscoResearch Channel Summaries

ā–· #general (4 messages):

  • Introducing DiscoLM German 7b: @_jp1_ announced the release of DiscoLM German 7b, a model trained on 65b tokens and designed for English, German, and translation purposes. The model uniquely supports RAG applications and experimental function calling abilities.
  • Check out the live demo: A live demo of DiscoLM German 7b was shared by _jp1_, available at demo.discoresearch.org for hands-on experience.
  • The model gets cheeky!: @devnull0 humorously commented that asking the model ā€œWas geht?ā€ might break it, suggesting that the model has been given playful or complex inputs during testing.
  • Performance Benchmark Inquiry: @cryptossssun inquired about the benchmarking data for DiscoLM German 7b, seeking insights into its performance metrics.

Links mentioned:

ā–· #benchmark_dev (4 messages):

  • Mixtral’s Improved Performance in New Version: @.calytrix responds to @_jp1_ highlighting that Mixtral performs more competently in the latest version as opposed to the first version.

  • Surprise at 7b Models Ranking High: @_jp1_ expresses surprise at the high ranking of 7b models like Beagle compared to Mixtral instruct and requests an example of Beagle’s superior performance.

  • Clarification on Benchmarking Criteria: @.calytrix clarifies to @_jp1_ that while individual question analysis might not be entirely indicative of total performance, the critique section can be insightful. The benchmarks are tailored to assess emotional intelligence strictly, not complex instruction following.

  • Potential Enhancement of Benchmarking Methodology: @.calytrix mentions to @_jp1_ the possibility of adding a complex reasoning section to the test to create a combined score that measures both emotional intelligence and complex instruction following.

ā–· #embedding_dev (1 messages):

  • Code Documentation Retrieval Performance Drop: @sebastian.bodza observed that performance in code documentation retrieval declines after aggressive truncation linked to the 512 token limit. An experiment with jina encodings and longer chunk sizes is to be tried next.

ā–· #discolm_german (13 messagesšŸ”„):

  • Intrigued by Open Source Initiative: @philipmay expressed gratitude for sharing open-sourced work and showed interest in several aspects of the project, posing multiple questions.
  • Axolotl’s Training and Code Reveal on Horizon: @_jp1_ confirmed plans to share both the training code/configuration for Axolotl and a repository with advanced usage examples; however, they noted the necessity for more time to present it cleanly.
  • Training Data Sharing Potential: In response to @philipmay, @_jp1_ revealed that sharing the training data and code, especially concerning RAG, is possible, highlighting involvement from <@1048301853806448680> and mentioning ongoing improvements and potential collaboration.
  • Tackling AI’s Rejection Responses: Addressing @maxidl’s experience of Axolotl emitting a rejection response, @_jp1_ acknowledged efforts to filter these out and encouraged reporting them to enhance future iterations.
  • Demo Page Glitches and Recoveries: @devnull0 compliments the demo page, later reports a Cloudflare Origin DNS error, but @_jp1_ swiftly indicates the issue has been resolved, signaling the page is operational again.

Skunkworks AI Discord Summary

  • Demand for a Functional Dataset: Users @interstellarninja and @yikesawjeez discussed the need for a refined function calling dataset to align with OpenAI’s conventions, emphasizing the compatibility requirement with the OpenAI API for an open-source function caller.

  • Probing LLM Inference Cost Dynamics: While @helium0120 sought data on LLM inference cost trends, @nisten provided a cautionary note on the complexities of cost calculations, flagging potential subsidies by API services as confounding factors.

  • Scrutiny of Lookahead Decoding Method: @nisten provided a critical assessment of the lookahead decoding method, recognizing its limitations but noting its efficacy in specific scenarios like code editing. Contributions included a link to a detailed blog post exploring the method’s use for LLM inference acceleration.

  • Off-Topic Exchange: User pradeep1148 shared a non-technical YouTube video link, which didn’t relate to the technical and engineering discussions of the guild.

Skunkworks AI Channel Summaries

ā–· #general (9 messagesšŸ”„):

  • Function Calling Dataset Challenges: @interstellarninja acknowledged the limitations of existing datasets and expressed the need for a diverse function calling dataset that aligns with OpenAI’s function signatures and calling schema. This would facilitate compatibility with the OpenAI API, making the open-source function caller easily swappable.

  • Function Caller Search Continues: @yikesawjeez realized the limitations of the existing dataset and expressed an intention to look for a more suitable one that matches OpenAI’s needs.

  • Seeking LLM Inference Cost Trends: User @helium0120 inquired about any available data on trends or forecasts concerning the decrease in LLM inference costs over time.

  • Skepticism over LLM Inference Cost Reduction: @nisten commented that inference cost calculations are challenging due to API services potentially subsidizing those costs, casting doubt on straightforward cost reduction trends.

  • Lookahead Decoding Method Evaluated: @nisten critically evaluated the lookahead decoding method, finding it not as effective as claimed except in certain scenarios such as code editing where re-outputting the entire code with small edits is required. Accompanying the discussion, a link to a blog post (Lookahead Decoding: Accelerating LLM Inference) was provided, which offers a deep dive into the method’s approach to accelerating LLM inference.

Links mentioned:

Break the Sequential Dependency of LLM Inference Using Lookahead Decoding | LMSYS Org:

TL;DR: We introduce lookahead decoding, a new, exact, and parallel decoding algorithm to accelerate LLM inference. Look…

ā–· #off-topic (1 messages):

pradeep1148: https://www.youtube.com/watch?v=POgLwYxDGYk


LLM Perf Enthusiasts AI Discord Summary

  • Swapping Models for Smooth Sailing: @jeffreyw128 mentioned that they switch between different GPTs to avoid issues, while @thebaghdaddy is considering running processes without advanced analytics as a workaround.
  • Instruct Model Preferred by User: thisisnotawill indicated using the instruct model from anyscale without additional context.
  • Call for Data Synthesis Insights: @ayenem is looking for resources on productionizing a data synthesis model, but the community response was absent.
  • Mulling Over MLOps Channel: @ayenem suggested the creation of a #mlops channel, which @pantsforbirds found potentially useful despite referring to MLOps as the ā€œbane of my existence.ā€ @jeffreyw128 questioned the necessity of a separate MLOps channel.
  • Azure Filter Toggle Trouble: @thisisnotawill sought help regarding how to disable content filters in Azure, noting the seeming restriction of the feature to internal use; subsequent discussion or resolution was not indicated.

LLM Perf Enthusiasts AI Channel Summaries

ā–· #gpt4 (3 messages):

  • Swapping GPTs to Avoid Annoyances: @jeffreyw128 mentioned that to circumvent certain issues, they opt to use different GPTs.
  • Analytic Workaround Strategy: In response to @jeffreyw128, @thebaghdaddy considered the suggestion and decided to run their processes without advanced analytics as a potential solution.

ā–· #opensource (1 messages):

thisisnotawill: yeah im using the instruct model from anyscale

ā–· #resources (1 messages):

  • In Search of Data Synthesis Wisdom: @ayenem is seeking experiences or resources such as blogs, books, or tools for productionizing a data synthesis model. There were no responses provided in the message history.

ā–· #feedback-meta (3 messages):

  • MLOps Channel Proposal: User @ayenem inquired if others would be interested in a #mlops channel, suggesting there might be community demand for such a space.
  • MLOps: A Likely Read: @pantsforbirds humorously referred to MLOps as the ā€œbane of my existenceā€ but expressed interest in reading helpful posts if a #mlops channel was created.
  • Debating MLOps Channel’s Necessity: In response, @jeffreyw128 asked what types of discussions would be had in a #mlops channel that wouldn’t already fit in the current ones.

ā–· #openai (1 messages):

  • Azure Content Filter Confusion: User @thisisnotawill inquired about disabling content filters in Azure, mentioning that the option seemed restricted to internal use only. No solutions or follow-up discussions were provided in the given history.

Alignment Lab AI Discord Summary

Only 1 channel had activity, so no need to summarize…

imonenext: Does anyone have a Gemini Pro key?


Datasette - LLM (@SimonW) Discord Summary

Only 1 channel had activity, so no need to summarize…

  • PR for Offline GPT-4All Model Usage: @cameron_y opened a pull request to enable offline usage for gpt4all models, addressing an issue where the library would attempt to download a model even if it already exists locally. This fix is detailed in PR #18 on GitHub.

Links mentioned:

fix: allow local models to work without internet connection by hydrosquall Ā· Pull Request #18 Ā· simonw/llm-gpt4all: Motivation Currently, the library tries to download the model even if it already exists locally, which prevents offline use. Fixes #10 , applying a code hint and investigation from @rotterb Changes…


The YAIG (a16z Infra) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.