We covered the OpenAI Terms of Service issue in our [most recent podcast](https://twitter.com/latentspacepod/status/1733160841997070683) - but it's usually a nonissue for "open source AI" individual enthusiasts. Different story when it comes to bigcos - here's a remarkably [swift](https://twitter.com/alexeheath/status/1735805122104680675) punishment meted out to ByteDance:


[TOC]

OpenAI Discord Summary

  • Discussion about hardware for AI with mentions of the new Mac racks being favorable for consumers, and the A6000 being a good value for its price. (@exx1 & @birdetta)
  • Talked about the advantages of using the Bard API for Gemini Pro and its free quota of 60 queries per minute. Exploration of costs associated with OpenAI’s API and its merits for experimentation. (@thepitviper & @webhead)
  • Shared a comparison video of Claude 2.1 and GPT-4 Turbo performance on coding tasks. The conversation included judgments on their respective competences, with GPT-4 Turbo outperforming Claude 2.1. (@rrross)
  • User experience discussions regarding ChatGPT Plus, varied perspectives on GPT-4 and conjecture about the upcoming GPT-5. Mention of an alleged GPT-4.5 contradicted by members, and speculation about a gradual progression to GPT-5 based on a podcast reference. (@auracletech, @exx1, @satanhashtag, @moshymello)
  • Numerous user inquiries related to technical issues with OpenAI products: card payment problems, translating a large JSON file with ChatGPT, difficulty upgrading an organization account, accidentally archived chats, and trouble resetting passwords. Clarifications were also sought on whether the ‘ChatGPT 4’ or ‘3.5’ label matters for usage limits and on what counts as one GPT-4 use.
  • The Alpha feature’s status remained hazy, as inquiries about it went unanswered; questions about whether GitHub Copilot uses the GPT-4 model likewise drew no response.
  • Noted problems with broken images, uncertainty surrounding the release of GPT-5, and dissatisfaction over the reduced message limit in the ChatGPT Plus membership.
  • Debates about the policy on AI art generation, the importance of depicting diverse people in AI-generated scenes, and a query on limiting output tokens in a chat completion call. Additionally, one user described notably expanding an organization’s work with new features.

OpenAI Channel Summaries

▷ #ai-discussions (48 messages🔥):

  • Mac Racks and A6000 for AI: @exx1 and @birdetta had a brief discussion about new Mac racks being advantageous for consumers, and the A6000 being decent for its price, respectively.
  • Gemini Pro and Bard API Usage: @thepitviper mentioned that the Bard API for Gemini Pro is free for 60 queries per minute, which encourages usage. He also noted that despite OpenAI’s API costs, it’s fun to experiment with. @webhead countered the cost concern, noting that it’s relatively cheap and that he goes through a million tokens for just under 40 a month.
  • Alpha Feature Confidentiality: @wintre asked about the status of the Alpha feature, whether it’s still confidential or removed.
  • Comparison of Claude 2.1 and GPT 4 Turbo for coding: @rrross shared a brief comparison video of Claude 2.1 and GPT 4 Turbo for coding tasks. The discussion continued with various insights, like Claude 1 being rated higher than Claude 2 and 2.1. He also compared results between Claude 2.1 and GPT4 Turbo where GPT4 Turbo performed better.
  • Functioning of Bard and Claude 2: @eljajasoriginal noted that Bard seems to hallucinate less and performs better than Claude 2, especially for tasks requiring internet search. However, with the reduced number of requests one can make, he hardly uses Claude 2 anymore.

▷ #openai-chatter (117 messages🔥🔥):

  • ChatGPT Plus Membership and Payment Issues: @mrcrack_ experienced issues with his ChatGPT Plus subscription not being recognized. It seemed his card had issues but they were resolved when he changed to a different one. Another user @5sky had problems with his card getting declined for ChatGPT services.
  • Experience and Expectations of GPT-4 and GPT-5: There was a debate among members like @auracletech and @exx1 regarding the performance of GPT-4 and expectations for GPT-5. Some users felt that GPT-4 was underwhelming, whereas others appreciated its feature advancements. There were also rumors about a supposed GPT-4.5, but @satanhashtag and @exx1 clarified that these were based on fake screenshots, while @moshymello mentioned an indirect reference by Sam Altman in a podcast to a gradual progression toward GPT-5.
  • User Experience with ChatGPT: User @yethaplaya sought advice for getting OpenAI’s models to generate more complex code. User @smoosh_laboosh noted that DALL-E 3 only provides one image per prompt. @arrushc inquired about a plugin to better organize and filter conversation history with ChatGPT. @jeweis mentioned an error message while using GPT3.5 despite being a ChatGPT Plus member.
  • OpenAI API and Voiceflow: @zevv_. asked for resources to learn how to connect and use the assistant API with Voiceflow. @Teemu suggested YouTube tutorials by Voiceflow and their pre-built template.
  • Converge 2 Inquiry: @soulztundra asked if anyone knows the age requirement or institutional requirements for Converge 2.

▷ #openai-questions (30 messages🔥):

  • Tier upgrade issue: @superseethat expressed concern that their organizational account had not upgraded to the tier-3 rate limit despite increased usage, a charged account, and submitted support issues, and sought a way around the problem.
  • Unarchiving archived chats: @martinrj sought help to unarchive chats accidentally archived, with @lumirix pointing them to the ‘Archived Chats’ section under ‘General’ in ‘Settings’.
  • Translating a large JSON file with Chat-GPT: @youseif asked if there is a way to translate a big JSON file using Chat-GPT.
  • Password reset problem: @marnold0015 asked for assistance as the reset password email wasn’t appearing in their email box, with @satanhashtag recommending the usage of help.openai.com and sending an email to the support department.
  • Prompts for writing larger codebases: @yethaplaya looked for good custom instructions to deal with longer and more complex code, seeking to bypass the bot’s default reluctance to generate such code.
  • Unusual system activity error: @jeweis mentioned an error message about unusual system activity when using GPT3.5, though ChatGPT4.0 worked fine.
  • ChatGPT 4 or 3.5 usage limit: @kingchengge asked whether a conversation being labeled ‘ChatGPT 4’ or ‘3.5’ matters with regard to the limit of 40 messages/3 hours.
  • GPT usage specification: @solbus clarified that all custom GPTs are GPT-4 and count toward the user’s cap, sharing a link that specifies what counts as one GPT-4 use.
  • Billing inquiries: Both @avnova and @theblack.sage brought up potential billing issues, especially upgrading to the next tier and general billing assistance.

▷ #gpt-4-discussions (28 messages🔥):

  • Broken Images Issue: @wehodl experienced an issue with broken images and shared the forum discussion for the same. They didn’t manage to find a resolution yet.
  • Query About GPT-4 Model in GitHub Copilot: @blueberryberry asked if GitHub Copilot is using the GPT-4 model. A direct answer was not provided in the reported messages.
  • Concern about Alpha Feature: @wintre questioned about the status of the Alpha feature, whether it’s still confidential or removed. This remained unanswered in the given messages.
  • Discussion About GPT-5 Release: Users like @sinize, @pudochu and @sxr_ discussed the possible release of GPT-5. The general consensus was that its release might not happen soon, given issues with current models such as GPT-4 Vision and Turbo and the challenge of providing optimal service to numerous users. No official announcement or link was given in this regard.
  • Concern Over Message Limit in ChatGPT Plus Membership: @pudochu indicated some dissatisfaction over a decrease in the number of messages allotted every 3 hours in ChatGPT Plus Membership, the limit dropping from 50 to 40 messages.

▷ #prompt-engineering (5 messages):

  • Limiting Output Tokens of a Chat Completion Call: @slobby_knobby_corny_cobby asked for suggestions on how to limit the number of output tokens from a chat completion call through an API, without the message being cut off (a sketch of one common workaround appears after this list).

  • AI Art Generation & Attribution Policy: @mysticmarks1 voiced concerns about the policy on generating AI art. They argued that the rule to not create images in the style of artists whose latest work was post-1912 creates bias and suggested a workaround by substituting the artist’s name with three adjectives that capture the key aspects of their style, attaching an associated artistic movement or era for context, and noting their primary medium.

  • Diverse Representation in AI Generated Scenes: @mysticmarks1 critiqued the policy of diversifying depictions of people in AI-generated scenes by including descent and gender for each person. They argued that such a rule could misrepresent reality, questioning, for example, its applicability to a scene in an ancient Aztec kitchen.

  • Addition of New Features to an Organization’s Work: @mysticmarks1 shared that they had significantly expanded the features of an unspecified organization’s work, adding robust capabilities.
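
No answer to the output-token question above was recorded in the channel, but one common workaround is to pair a hard `max_tokens` cap with an explicit length instruction in the prompt, so the model usually finishes before the cap truncates it. Below is a minimal, hedged sketch using the OpenAI Python SDK; the model name and token budget are illustrative assumptions, not anything specified in the discussion.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

TOKEN_BUDGET = 150  # hypothetical output budget

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder model choice
    messages=[
        # Ask the model to finish within the budget so max_tokens rarely truncates.
        {"role": "system",
         "content": f"Answer completely in roughly {TOKEN_BUDGET} tokens or fewer."},
        {"role": "user", "content": "Summarize the trade-offs of model quantization."},
    ],
    max_tokens=TOKEN_BUDGET,  # hard cap; cuts the message off only if exceeded
)

choice = response.choices[0]
# finish_reason == "length" signals that the hard cap truncated the message.
print(choice.finish_reason, choice.message.content)
```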

▷ #api-discussions (5 messages):

  • Limiting Output Tokens in API Call: @slobby_knobby_corny_cobby asked for suggestions on how to limit the number of output tokens from a chat completion call via an API without the message cutting off.

  • Concerns about Image Generation Policy: @mysticmarks1 expressed concerns about current policies in place for image generation, citing potential issues with the guidelines for recreating styles of artists and diversifying depictions of people. A particular point of contention is the policy that restricts recreation of styles of artists whose latest work was post 1912.

  • Description Diversity and Realism: @mysticmarks1 pointed out a perceived contradiction in policy guidelines, noting the challenge in balancing realistic representations (e.g. not having all members of a given occupation be the same gender or race) with the need for diversity in descriptions. It was questioned how this rule would apply to historical or contextual scenarios such as an “ancient Aztec kitchen”.

  • Revised Work: @mysticmarks1 claimed to have rewritten and expanded upon the organization’s work to introduce robust features and capabilities, implying dissatisfaction with the current guidelines. There is no specific information provided about the nature of these improvements.


Mistral Discord Summary

  • Mistral AI and Open Source Models: Active conversations regarding the effectiveness, choices, and usage of various models offered by Mistral AI. References were made to the Medium and Mixtral models, along with supporting links to the discussions, Twitter and GitHub. A specific focus was placed on parameter tuning and its effect on Mistral’s performance.
  • Building a Chatbot UI: Suggestions for open-source solutions for chatbot development were posted, providing links to Chatbot-UI and OpenAgents and their respective GitHub repositories.
  • Comparison of AI Models Performance: Comparisons were drawn between various AI models, including the Medium and Mixtral models. An event, “AI Hack Night”, was organized to explore the application of Mistral’s LLM with the Assistants API. A perceived decline in benchmark performance was attributed to the requirements of an intricately version-controlled API system.
  • Fine-tuning within Mistral AI: Discussions on fine-tuning capabilities explored whether the current API supported fine-tuning. The question was addressed, indicating that the current API does not have fine-tuning offerings.
  • Contributions and Projects: Links to a Nuget package for using the Mistral AI platform in .NET were shared. Calls for collaboration on a project were made, and an open attitude towards ideas and modifications was expressed. Discussions also covered exploration of making calls to the Mistral API.
  • Quantization and Platform Integrations: An explanation of the process of quantization was offered. Some users expressed desire for a feature allowing token usage tracking, with confirmation of such a feature being in the works. Questions were raised about platform integration and the number of experts employed by the Mistral Medium API.

Mistral Channel Summaries

▷ #general (180 messages🔥🔥):

  • Mistral API and Open Source Models: Users discussed the Mistral API and open-source models. @lee0099, @cyborgdream, and @marketingexpert debated their preferences and options for using different models via the API, including the Medium and Mixtral models. Link to discussion
  • Creating a Chatbot UI: @cyborgdream suggested open-source UI solutions for developing a chatbot, providing links to the GitHub repositories of Chatbot-UI and OpenAgents.
  • API Usage and Waitlist: User @marketingexpert expressed eagerness to use the Mistral API and voiced frustration over the waitlist. @tlacroix_ explained the careful process of load management to ensure a good user experience.
  • Mistral Models Performance: @jamsyns shared his experience of finding Mistral-medium more efficient at coding than ChatGPT 4 and provided a Twitter link for reference.
  • AI Hack Night using Mistral LLM with Assistants API: @louis030195 organized an AI hack night event focused on using Mistral’s large language model (LLM) with the Assistants API. A link to the event was shared.

▷ #models (10 messages🔥):

  • Mistral Experts Specialize along Non-human Lines: @reguile commented that Mistral’s experts seem to specialize along non-human lines such as nouns and punctuation.
  • Question about Mistral’s Fine-Tuning: @pdehaye asked if Mistral’s 8 experts are fine-tuned using human-curated datasets, inquiring about the human logic behind this decision.
  • Performance Tuning in Mistral: @svilupp found through experimentation that altering top-p and temperature settings can nearly replicate GPT-4 performance, observing nearly a 30-point increase over default parameter settings on their benchmark (a sketch of these parameters follows this list). However, they noted a major performance drop after the Mistral-medium API was seemingly changed, affecting their expected benefits.
  • Performance comparison: @svilupp further mentioned that while they like Mistral’s responses, there is a noticeable difference in consistency and handling of more complex prompts when compared to [GPT-3.5-turbo-1106](https://github.com/svilupp/Julia-LLM-Leaderboard), and even Mistral-small.
  • Version Control Challenge in LLM: @casper_ai reinforced the difficulty in serving large language models due to the need for an intricately version-controlled API system for reproducible results. This requirement, however, comes at a cost, leading to possible decline in benchmark performance.
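
For readers unfamiliar with the parameters @svilupp tuned above, here is a minimal sketch of setting `temperature` and `top_p` against Mistral’s OpenAI-style chat endpoint. The endpoint path follows Mistral’s public API docs; the specific values are illustrative assumptions, not @svilupp’s benchmark settings.

```python
import os

import requests

# Mistral's chat completions endpoint is OpenAI-compatible.
URL = "https://api.mistral.ai/v1/chat/completions"
HEADERS = {"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"}

payload = {
    "model": "mistral-medium",
    "messages": [{"role": "user", "content": "Write a Python function that reverses a list."}],
    # Illustrative sampling settings; tuning these is what @svilupp experimented with.
    "temperature": 0.7,
    "top_p": 0.9,
}

resp = requests.post(URL, headers=HEADERS, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```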

▷ #ref-implem (1 messages):

  • Integration of Jinja Templates in Projects: @titaux12 expressed an interest in having a “working-out-of-the-box” Jinja template for facilitating projects that run multiple LLMs seamlessly (a toy template is sketched after this list). This request comes in the context of work on supporting multiple models within privateGPT, where Jinja templates are seen as an efficient and elegant way to manage the varying prompt formats required by different LLMs. Relevant work is outlined in a GitHub pull request.
  • Comparison with vLLM Works: @titaux12 mentioned that similar work is being undertaken by the vLLM team, who are perceived as being more advanced in their endeavors as they already employ these files.
  • Model Output Format Compliance: @titaux12 indicated they will adapt their model to deliver the output format required by the other user’s model: keeping the provided token representation intact, with 1 (BOS) as the first token, and not duplicating it or encoding it as a different token.
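
To make the Jinja-template idea concrete, here is a toy sketch that renders a ChatML-style prompt from a message list with jinja2; the template text is a hypothetical example, not the one proposed for privateGPT or the one used by vLLM.

```python
from jinja2 import Template

# Hypothetical out-of-the-box template for a ChatML-style prompt format.
# Each LLM would ship its own template in the same spirit.
CHATML = Template(
    "{% for m in messages %}"
    "<|im_start|>{{ m.role }}\n{{ m.content }}<|im_end|>\n"
    "{% endfor %}"
    "<|im_start|>assistant\n"  # cue the model to answer
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
print(CHATML.render(messages=messages))
```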

▷ #finetuning (3 messages):

  • Fine-tuning Capabilities in Current API: User @jamiecropley raised a question about whether fine-tuning can be done in the current API. @lerela confirmed that the current API does not have fine-tuning capabilities, and encouraged users to share their use cases with the support email for Mistral AI.

▷ #showcase (10 messages🔥):

  • .NET library for the Mistral AI platform: User @vortacs shared a link to the .NET library he developed for the Mistral AI platform, with both the list models and chat endpoints implemented. The package is available on Nuget.
  • Looking for Collaborators: User @tonic_1 expressed interest in getting collaborators for a project, and is also open to others using the idea and making modifications.
  • Contributions on Github: Both @aha20395 and @fayiron shared that they have not done programming or contributed to anything on Github for a long time. Despite that, @aha20395 expressed his support and willingness to help in non-coding aspects.
  • Explored Mistral API: @tonic_1 mentioned that he is at the early stage of exploring how to make calls to the Mistral API.

▷ #la-plateforme (17 messages🔥):

  • Comparison between Mistral-tiny and GGUF Mistral-7B-Instruct-v0.2 performance: User @tarruda finds the performance of Mistral-tiny to be significantly superior to the Q8 GGUF version of Mistral-7B-Instruct-v0.2 on huggingface. The tests were performed using temperature 0 on both and the prompts can be found in this benchmark link.
  • Discussions on Quantization: Users @someone13574 and @titaux12 explained the basic idea of quantization. According to them, llama.cpp’s Q8_K groups parameters into blocks of 256 and finds a weight_scale for each block; weights are stored as 8-bit integers which are multiplied by the scale to approximately recover the original weights (with some error). A toy sketch of this idea follows this list.
  • Request for Usage Tracking: User @phinder.ai requested a feature to track token usage. @tlacroix_ confirmed the feature would be added within the next week.
  • Questions about Platform Integrations: @subham5089 inquired about llamaindex integration with the platform.
  • Question about the number of experts in the Mistral Medium API: User @timotheeee1 asked whether the Mistral Medium API uses 2 or 3 experts, speculating that increasing to 3 experts might improve the model’s performance.
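
As a companion to the quantization discussion above, here is a toy numpy sketch of block quantization with one scale per 256-weight block; it illustrates the idea only and is not llama.cpp’s actual Q8_K implementation.

```python
import numpy as np

BLOCK = 256  # llama.cpp's Q8_K groups parameters into blocks of 256

def quantize_blocks(weights: np.ndarray):
    """Quantize a flat float array to int8 with one scale per block."""
    blocks = weights.reshape(-1, BLOCK)
    # Per-block scale chosen so the largest weight maps to +/-127.
    scales = np.maximum(np.abs(blocks).max(axis=1, keepdims=True), 1e-12) / 127.0
    q = np.round(blocks / scales).astype(np.int8)  # 8-bit integer weights
    return q, scales

def dequantize_blocks(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Multiply int8 weights by their block scale to approximately recover them."""
    return (q.astype(np.float32) * scales).reshape(-1)

w = np.random.randn(1024).astype(np.float32)
q, s = quantize_blocks(w)
print("max reconstruction error:", np.abs(w - dequantize_blocks(q, s)).max())
```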

Nous Research AI Discord Summary

  • Discussions around AI model choice and preference: Users shared their approaches and preferences for AI models and embeddings, such as GTE-small and Jina. An upcoming option to embed the entire Arxiv database was referenced, although concerns about the effectiveness of mixing embeddings were expressed. Fine-tuning and interview models were also discussed, outlining the potentials and challenges of different model training strategies.

  • Release of user-generated AI models: The Metis 0.1 model for reasoning and text comprehension was released on huggingface; notes on downloading, quantizing, and evaluating the model were also shared. The first Phi 2 GGUF also surfaced and was announced.

  • Acquisition and application of GPU resources: The discussion featured interest in acquiring MI300X GPUs, and a blog on how to run Nvidia SXM GPUs in consumer PCs was linked.

  • Interest in Peer-to-peer (P2P) and Distributed Compute Networks: Several resources were shared, including Petals, Bacalhau/Filecoin, Bittensor, Hyperspace, Akash, and Shoggoth Systems.

  • Updates on AI industry news: It was noted that OpenAI suspended ByteDance’s account for violating the developer licenses, and Microsoft’s announcement about the public availability of GPT-4 Turbo with Vision on Azure OpenAI Service was raised for discussion.

  • Inquiries on fine-tuning and operational aspects of open models like Mistral and OpenHermes, with specific questions about recovering logprobs and measuring text coherence. Conversations showcased the dynamic nature of fine-tuning models, considering ideas like a <backspace> token or models that self-correct. The Mamba chat repository and related research papers were shared in the discussion of state-space model architectures.

Nous Research AI Channel Summaries

▷ #off-topic (21 messages🔥):

  • Choice of AI Model and Approach: User @adjectiveallison expressed confusion over the several available AI models and inference options. @lightningralf recommended waiting if possible, mentioning an upcoming option to embed the entire Arxiv database. However, @lightningralf also expressed concerns about the effectiveness of mixing embeddings.
  • Differing Model Uses: @natefyi_30842 shared their preference for using the gte-small model and suggested that larger embeddings like Jina are better for larger projects, such as books.
  • Fine-Tuning and Interview Models: @a.asif is considering fine-tuning a model to act as an interviewer, asking questions based on provided scenario cards. @.beowulfbr suggested using system prompts or the RAG method if the scenarios fit within the context window. However, @a.asif believes that given the large number of scenarios, fine-tuning might be the better approach and would yield a model suited to their specific use case.
  • GPU: @dragan.jovanovich inquired about avenues to purchase MI300X GPUs.
  • Embedding Techniques: @nikhil_thorat shared their approach: using gte-small for sentence/page level and Jina for full document embeddings, highlighting Jina’s effectiveness for clustering.
  • Metis 0.1 Model Fine-tune: User @mihai4256 announced they’ve created a 7b fine-tuned model, Metis 0.1, for reasoning and text comprehension, which is now available on huggingface. The model isn’t aimed at storytelling, but rather at reasoning and text comprehension tasks. It’s trained on a private dataset, not the MetaMath dataset.
  • Downloading & Quantizing Metis 0.1: User @.benxh confirmed downloading and quantizing the Metis 0.1 model for testing. They expressed a preference for Q6 for 7B models due to its high quality and extremely fast performance, while @mihai4256 affirmed using Q8 GGUF for its high sampling speed when running in VRAM.
  • Adopting Nvidia SXM GPUs for Consumer PCs: @giftedgummybee shared a blog post on how to run Nvidia SXM GPUs in consumer PCs and third-party servers as an alternative to consumer-grade GPUs, putting datacenter-class GPU performance within consumers’ reach.
  • Use of ChatML for Inference Libraries: User @.benxh suggested the use of ChatML for inference libraries in future, stating it’s easier to use and makes models more accessible.
  • First Phi-2 GGUF: @tsunemoto announced that the first Phi-2 GGUF has surfaced: a version of Microsoft’s Phi-2 quantized to 4_0 and 8_0 bits and converted to an FP16 GGUF model. However, he clarified that it won’t work unless a specific fork with modifications is used. @nonameusr expressed their intention to test the model.

▷ #general (41 messages🔥):

  • P2P and Distributed Compute Networks: @yikesawjeez shared a variety of resources and information about peer-to-peer (P2P) and distributed compute networks including Petals, Bacalhau/Filecoin, Bittensor, Hyperspace, Akash, and Shoggoth Systems.

  • Issues with ChatGPT: @.interstellarninja mentioned problems with ChatGPT similar to those previously encountered with Llama-2, specifically its refusal to help with terminating processes.

  • Experimentations with Mixtral: @nonameusr reported that they have been testing Mixtral and finding it entertaining.

  • Comparison of Models: @shane_74436 asked @879714655356997692 for comparative metrics on the base Mixtral, Mixtral’s instruct version, and the model they built. They noted mixed performance on TGI (Hugging Face’s Text Generation Inference), but were unsure whether the platform, their prompting, or the model was responsible.

  • News Updates: @atgctg shared an article from The Verge noting that OpenAI has suspended ByteDance’s account for using GPT-generated data in violation of Microsoft and OpenAI’s developer licenses. @lightningralf commented that this might prompt Chinese buyers to turn to Mistral instead. @giftedgummybee shared a tweet by Greg Brockman and expressed their confusion about the paper it discussed, and later flagged a Microsoft announcement about GPT-4 Turbo with Vision being publicly available on Azure OpenAI Service and asked @550656390679494657 if they had seen it.

  • Request for help with Fine-tuning Open Models: @realsedlyf asked if anyone had resources or ideas about fine-tuning open models like Mistral and openhermes, specifically for function calling datasets.

▷ #ask-about-llms (11 messages🔥):

  • State-Space Model Architectures: @vincentweisser initiated a discussion on state-space model architectures like Mamba and shared a link to the GitHub repository of Mamba chat, a chat LLM that uses a state-space model architecture. He also shared a related research paper on the topic.
  • Fine-tuning Models: @jason.today proposed the idea of fine-tuning models to give them the inherent capability of correcting their outputs, such as re-reading and correcting themselves and giving generalized commands like inserting and deleting. However, @atgctg countered that this might make the model always output incorrect tokens at first.
  • <backspace> Token: In the fine-tuning discussion, @atgctg mentioned a related research paper that introduces the use of a <backspace> token (a toy sketch of the idea follows this list).
  • Availability of Compute Resources: @yikesawjeez offered to share computing resources they recently acquired, specifically targeting open-source projects or endeavors contributing to humanity or creating quality memes.
  • Recovering Logprobs from Open Source Models: @adjectiveallison inquired about the standard approach to recover logprobs returned from open source models from specific inference providers or locally, to which @atgctg responded that the platform Together supports this feature.
  • Measuring Text Coherence: @xela_akwa asked if there are ways or benchmarks to measure the coherence of a text.
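
To make the <backspace>-token idea concrete, here is a toy post-processing sketch in which each <backspace> the model emits deletes the previous token, letting the model revise its own output; this illustrates the concept only and is not the referenced paper’s implementation.

```python
BACKSPACE = "<backspace>"

def apply_backspaces(tokens: list[str]) -> list[str]:
    """Resolve <backspace> tokens: each one deletes the token emitted before it."""
    out: list[str] = []
    for tok in tokens:
        if tok == BACKSPACE:
            if out:  # ignore a backspace with nothing left to delete
                out.pop()
        else:
            out.append(tok)
    return out

# The model emits "dog", notices the mistake, backspaces, and corrects itself.
print(apply_backspaces(["The", "dog", BACKSPACE, "cat", "sat"]))
# -> ['The', 'cat', 'sat']
```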

OpenAccess AI Collective (axolotl) Discord Summary

  • Continued dialogue around the fine-tuning of the Phi-2 base, with user @noobmaster29 sharing a tweet suggesting its potential for further development. This resulted in discussions about other preferred large language model (LLM) bases, including Mistral 7B and Hermes.
  • Detailed discussions on the technical aspects and performance of AMD’s new product, the MI300x, and its competitive edge over Nvidia’s H100 in terms of inference speed, shared by user @casper_ai. This sparked further conversation about the potential extension of ROCm support to AMD’s consumer cards.
  • Group troubleshooting of HF Transformers, with solutions shared within the community. Notably, updates to NCCL and Nvidia drivers were reported to resolve some hangups during training. A PR for Transformers was opened and shared within the guild.
  • Active discussions about different training configurations, with insights shared about an FFT (full fine-tune) of Mixtral. Potential implementation of a fused MLP for Mixtral’s experts also surfaced, with the projected memory savings a notable topic of interest.
  • Consultations on issues such as the functioning of a pod and the double-EOS-token issue. Stopping containers from running unnecessarily after a self-contained training task was another pressing topic, with proposed use of Runpod’s native API keys and suggested callback implementations for task termination. A question about viewing images without a base URL was raised but remained unanswered.

OpenAccess AI Collective (axolotl) Channel Summaries

▷ #general (8 messages🔥):

  • Phi-2 Base for Fine-tuning: @noobmaster29 shared a tweet by Sebastien Bubeck, which suggests that Phi-2 is a really good base for further fine-tuning. Bubeck details that they fine-tuned it on 1M math exercises and tested on a recent French nationwide math exam, with encouraging results.
  • Preferred LLM Base Model: @jovial_lynx_74856 surveyed the collective’s preferred large language model (LLM) base, mentioning Mistral 7B and Hermes as options.
  • AMD vs Nvidia Inference Speed: @casper_ai notified the group of AMD’s new product, the MI300x, which allegedly outperforms Nvidia’s H100 on inference speed while costing 50% less. He also included a link to the relevant AMD community post.
  • ROCm Support for AMD Consumer Cards: @le_mess expressed desire for ROCm support to extend to AMD’s consumer cards.
  • Mixtral Medium Benchmarks: @dangfutures inquired if there were any benchmarks available for Mixtral Medium. @le_mess responded that some benchmarks are available on the company’s website.

▷ #axolotl-dev (49 messages🔥):

  • Troubleshooting with HF Transformers: User @hamelh mentioned having an issue with HF Transformers. @caseus_ provided a link to a PR on Github that was reported to fix the issue.
  • Updates on NCCL and Nvidia: User @richbrain shared that after updating NCCL to 2.19.3 and Nvidia to 23-10, they no longer experienced hangups after 1 hour of training time. The update was well-received by the community.
  • PR for Transformers: @caseus_ requested that a PR be opened with the new Transformers version pinned in the requirements files. @richbrain fulfilled the request, offering a link to the PR.
  • Discussing Training Configurations: @yamashi and @casper_ai engaged in discussions with @richbrain on their FFT of Mixtral training configurations. @richbrain revealed they were running on 2 nodes with H100 GPUs and a sequence length of 2048.
  • Potential Implementation of FusedMLP: @caseus_ and @casper_ai discussed the projected memory savings of implementing a fused MLP for Mixtral’s experts. Casper estimated roughly 8GB of VRAM saved, extrapolating the ~1GB saving observed in the Llama 7B MLP across Mixtral’s 8 experts.

▷ #general-help (4 messages):

  • Pod Running Issue: @dangfutures inquired about potential issues with the functioning of a pod, unsure whether it is buggy.
  • Double EOS Token Solution: @noobmaster29 suggested a solution to fix the double EOS token issue. The proposed solution involves making changes in the sharegpt.py file. Specifically, removing the ‘sep style’ and adding a `stop_str=”

▷ #runpod-help (6 messages):

  • Using RUNPOD_API_KEY and RUNPOD_POD_ID to control Runpod: @caseus_ explained that Runpod provides environment variables (RUNPOD_API_KEY and RUNPOD_POD_ID) which can be used to make API calls to control the pod, potentially to shut it down. They also posted a link to Runpod’s “Stop Pod” documentation and shared an example curl command (a Python equivalent is sketched after this list), but noted that “stop” isn’t the same as “terminate”.
  • Configuring Runpod container to exit after task completion: @_jp1_ expressed a desire for a Runpod container to exit as soon as a self-contained training task was complete, as Runpod’s default behavior appears to be to restart the container indefinitely. Because running the container can be costly, they were looking for a solution that wouldn’t require an external server or service to monitor the job and stop the container.
  • Suggestion for Automatically Stopping Runpod After Task Completion: In response to _jp1_, @caseus_ suggested setting up a callback at the end of training that would shut down the container once the task was complete. This could possibly involve a timer to make sure the final checkpoint is uploaded before shutdown.
  • Viewing Images Without a Base URL: @mustapha7150 asked how to view an image when the output did not contain a base URL. This question wasn’t answered in the provided messages.
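
Here is a minimal Python sketch of the pattern @caseus_ described, calling Runpod’s GraphQL API with the injected RUNPOD_API_KEY and RUNPOD_POD_ID variables. The podStop mutation shape follows the “Stop Pod” documentation referenced above, but treat the exact field names as assumptions to verify against those docs; remember that stopping is not terminating.

```python
import os

import requests

def stop_current_pod() -> None:
    """Stop (not terminate) the pod this code runs on, e.g. from an
    end-of-training callback after the final checkpoint has uploaded."""
    api_key = os.environ["RUNPOD_API_KEY"]  # injected by Runpod
    pod_id = os.environ["RUNPOD_POD_ID"]    # injected by Runpod
    # Mutation per Runpod's "Stop Pod" docs; verify the shape before relying on it.
    query = f'mutation {{ podStop(input: {{podId: "{pod_id}"}}) {{ id desiredStatus }} }}'
    resp = requests.post(
        "https://api.runpod.io/graphql",
        params={"api_key": api_key},
        json={"query": query},
        timeout=30,
    )
    resp.raise_for_status()
    print(resp.json())

if __name__ == "__main__":
    stop_current_pod()
```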

HuggingFace Discord Discord Summary

  • Addressing technical issues involving AI model training, Gradio’s API server, and FastAPI execution with KoalaAI/Text-Moderation. Specific references include a HuggingFace’s tutorial and a GitHub issue related to the FastAPI error.
  • Active dialogue about AI model performance and implementations including use cases, performance characteristics and comparisons. Models discussed extensively include the cogvlm model, Phi-2, deepseek-coder, and go-bruins-v2.1.1.
  • Shared projects and demonstrations, like the monetized speech-to-text app using a Web3 API gateway YouTube link, Diffusion-Cocktail’s demo and a link on generating hyper-realistic faces with AI.
  • Solicitation of help and advice on projects, such as converting PHP projects to Laravel using models like CodeGen2.5 and StarChat Beta, converting Animate Diff .ckpt files to .bin files and creating datasets from raw song vocals and timed lyrics.
  • Spotlight on the importance of hardware resources, especially a high VRAM GPU, for successful model inference.

HuggingFace Discord Channel Summaries

▷ #general (26 messages🔥):

  • Diffusion Models Training Issue: User @everythinging reported getting noise images when attempting to train on their own images while following HuggingFace’s official tutorial. They asked whether there is a minimum size requirement for the dataset.
  • API Connection Issue: User @paghkman identified that Gradio client library’s API link is currently down and returning an error.
  • Discussions on Multimodal models: Users @doctorpangloss and @asrielhan praise the effectiveness of the cogvlm model as a multimodal model. Moreover, user @asrielhan also mentioned another new model here.
  • Discussion on Training Models: @vipitis shared their experiences with model Phi-2, stating it performs well only on the domains it’s trained on and shared a comparison here.
  • Code Conversion Project: @pyr0t0n asks about a project aimed at converting complete programming projects from PHP to the Laravel Framework for PHP. User @merve3234 responds by suggesting to check out code instruction models like CodeGen2.5 and StarChat Beta.

▷ #today-im-learning (1 messages):

nixon_88316: Hello! Is there anyone to add any model to stabilityAI?

▷ #cool-finds (1 messages):

  • Generating Hyper-Realistic Faces with AI: User @kingabzpro shared a link to a blog post from KDNuggets detailing three ways to generate hyper-realistic faces using AI, mainly through prompt engineering, the Stable Diffusion XL model, and a custom model from CivitAI. The post offers solutions to those struggling with generating quality AI images that often end up full of glitches and artifacts.

▷ #i-made-this (7 messages):

  • Model Generation Improvements: @vipitis reported improved results after fixing the post-processing for model generation. They praised the performance of deepseek-coder, expressing eagerness to run larger variants of it.
  • Web3 API Application: @dsimmo designed a monetized speech-to-text application using a Web3 API gateway. They shared a YouTube link to inspire others to create similar applications.
  • New Model Release: @rwitz_ announced a new model release, go-bruins-v2.1.1, claiming it to be the top-rated model of any size on the leaderboard as of 12/16/2023. The model was DPO-trained on Intel/orca_dpo_pairs.
  • Image Content/Style Manipulation Method: @.ricercar proposed a training-free method for image content/style manipulation. They shared a demo illustrating the integration of pre-trained DMs and LoRAs.
  • Demo Feedback and Troubleshooting: User @jo_pmt_79880 experienced issues with the image output in .ricercar’s demo, with images coming back empty. @.ricercar asked the user to refresh the browser and try again.

▷ #reading-group (1 messages):

pcuenq: That’d be great <@582573083500478464>!

▷ #diffusion-discussions (2 messages):

  • Convert Animate Diff .ckpt to .bin files: @jfischoff asked if there’s a way to convert the Animate Diff .ckpt file to .bin files that diffusers motion adapters understand.

▷ #computer-vision (1 messages):

vision12: Hey there, does anyone know how can i convert timm model weights to keras weights

▷ #NLP (3 messages):

  • GPU Requirements for Inference: User @nerdimo suggested that for the inference process, a strong GPU with a high amount of VRAM is essential, otherwise inference could be problematic.
  • FastAPI Execution Error with KoalaAI/Text-Moderation: User @marshy.dev reported a RuntimeError when running KoalaAI’s Text-Moderation model through FastAPI. They shared a link to a related GitHub issue and the code for their current approach (a minimal sketch of the usual serving pattern follows this list).
  • Creating Datasets from Song Lyrics with Timestamps: User @sushi057 asked for help in creating datasets from raw song vocals that are to be mapped to timed lyrics to aid their model.
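
No fix for the RuntimeError above was recorded in the channel, but a frequent culprit in this setup is loading the model inside the request handler. Below is a minimal sketch of the usual pattern, loading the pipeline once at startup and reusing it across requests; the route name is a placeholder, and this is a generic serving pattern rather than @marshy.dev’s actual code.

```python
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()

# Load the model once at startup, not inside the request handler, to avoid
# repeated loads and the thread-related runtime errors they can trigger.
moderator = pipeline("text-classification", model="KoalaAI/Text-Moderation")

class ModerationRequest(BaseModel):
    text: str

@app.post("/moderate")  # placeholder route name
def moderate(req: ModerationRequest):
    # The pipeline returns a list of {"label": ..., "score": ...} dicts.
    return moderator(req.text)
```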

▷ #diffusion-discussions (2 messages):

  • Conversion from .ckpt to .bin for Diffusers Motion Adapters: User @jfischoff inquired about possible methods to convert the Animate Diff .ckpt files into .bin files that diffusers motion adapters can understand.

LangChain AI Discord Summary

  • Users raised various requests for technical assistance, including advice on deleting documents from a Faiss store in Node.js, estimating the costs of hosting a RAG application on AWS, and finding older versions of the LangChain API. On the help front, @holymode offered to assist @quantumqueenxox with converting synchronous code to asynchronous.
  • An imminent OCR feature in Vectara, possibly beneficial for LangChain customers was mentioned by @ofermend. Early access to the new feature was offered through a waitlist link.
  • @arborealdaniel_81024 reported a broken link in the LangChain AI GitHub repository’s release notes template.
  • @_johnny1984 sparked a conversation on potential security risks with AI, possibly related to a disclosed association with OpenBSD. They also revealed a willingness to provide Ruby examples if needed.
  • The community saw updates from two projects. @discossi shared a web scraping feature for an AI chatbot on their llama-cpp-chat-memory GitHub repository, and @heydianaa announced a discord bot transforming images to dance videos, open for feedback at kineticpix.ai.

LangChain AI Channel Summaries

▷ #general (11 messages🔥):

  • Deleting Documents from Faiss Store in Node.js: @binodev asked how to delete documents from a Faiss store in Node.js. No solutions or further discussion has been provided yet.
  • RAG Application Costs with AWS: @legendary_pony_33278 sought advice on estimating costs of hosting a RAG application on AWS. They’re currently using Mixtral 8x7B or Llama 2 13b models along with a small vectordb. No suggestions have been provided yet.
  • OCR Capability with Vectara: @ofermend announced an upcoming OCR feature in Vectara that could be useful for customers integrating it with LangChain, especially for processing scanned documents. They shared a waitlist link for those interested in early access.
  • Inquiry about Older LangChain API: @damianj5489 was looking for an older version of the API for LangChain but could only find the newest version (0.0.350) through the provided link. No responses yet.
  • Using AI to Help in Legal Matters: @_johnny1984 proposed using AI to help his girlfriend win a custody battle, though she was sceptical. No feedback or suggestions were given yet.
  • Request for Help with Asynchronous Code: @quantumqueenxox asked for assistance in converting their code to an asynchronous pattern, offering payment for this service. @holymode expressed willingness to help, though the details are being discussed privately.

▷ #langchain-templates (1 messages):

▷ #share-your-work (5 messages):

  • Security Concerns: User @_johnny1984 initiated a discussion about the security of AI, speculating about the risks and implications of an AI capable of hacking systems.
  • OpenBSD AI Development: @_johnny1984 also disclosed his affiliation with OpenBSD, suggesting he is working on creating a hacking-enabled AI.
  • Web Scraping in AI Chatbot: @discossi shared an update about adding basic web scraping functionality for document memory in his AI chatbot. He provided a link to the GitHub repository for the project, llama-cpp-chat-memory.
  • Pic-To-Dance-Video Bot: @heydianaa has developed a free-to-use discord bot that can convert pictures into dancing videos. They encouraged users to register, try it out, and provide feedback at kineticpix.ai.

▷ #tutorials (2 messages):

  • Ruby Examples Offer: User @_johnny1984 offered to provide Ruby examples upon request.

Alignment Lab AI Discord Summary

Only 1 channel had activity, so no need to summarize…

  • Discussion on Phi-2 Training Efficiency: Members were curious about how Microsoft trained its Phi-2 model so efficiently. @autometa claimed the model is heavier on GPUs than Llama 13b and brought up the confusing efficiency claim of using 90-something A100s for 20 days.
  • Speculations about Phi-2’s Efficiency: @benxh speculated that the strategies might include multiple epochs, synthetic data, and a Phi1.5 classifier applied on the entire web.
  • Concerns about certain restrictions: @damiondreggs and @lightningralf had a discussion about some form of restriction being implemented, expressing surprise that it took a long time for such measures to be taken.

DiscoResearch Discord Summary

Only 1 channel had activity, so no need to summarize…

  • Discussion about Training and Quality of Sparsely Activated Models: User @someone13574 suggested that during training perhaps topk doesn’t have a limit, or there might be a small amount of leak. Later, @saunderez theorized that the quality drop in sparsely activated models could be associated with most MLP (Multi-Layer Perceptron) activations being zero in a sparse model.
  • Running Code on Graphics Cards: User @goldkoron stated that they used a combination of a 3090 and a 4060 Ti 16GB, totaling 40GB of VRAM for their tasks. When asked about the codebase by @tcapelle, @goldkoron replied that they use Text Gen WebUI and exllamav2 with an experimental branch.
  • Accuracy of Language Models: Users @alliebot3000, @nyxkrage and @jiha discussed the imperfect accuracy of language models. @alliebot3000 mentioned how some models get extremely close but still miss the correct answer. @jiha pointed out that if a model isn’t correct, it’s simply wrong.
  • Suggestions for Improving Language Model Responses: @nyxkrage proposed a potential improvement, suggesting that asking language models to only give a final answer at the end of their explanation might yield better results. @someone13574 responded by stating this tests a different aspect, not whether the model can provide the correct response immediately.

Latent Space Discord Summary

  • Discussion regarding code interpretation and the unexpected development of Python code generating R code, mentioned by user @slono.
  • User experiences with the Cody chatbot’s context handling, with @slono expressing disappointment over its performance.
  • Technical problems faced by @slono with the chatbot not loading files from the project/ directory as expected, and a humorous solution to filter out unwanted go.sum and package.lock files through a shellscript context search.
  • User @ayenem engaging in a discussion around chatbot guardrails, expressing interest in presenting a paper from arxiv.org and requesting additional resources on the topic.

Latent Space Channel Summaries

▷ #ai-general-chat (5 messages):

  • Code Interpreter R vs Python: @slono humorously noted that their code interpreter did not directly interpret R code, but rather, it created Python code that generates R code.
  • Opinion on Cody’s Context: @slono expressed dissatisfaction with the functionality of the Cody chatbot’s context, suggesting it performed below expectations.
  • Issues with Project/ Directory File Loading: @slono also reported an issue where the chatbot did not appear to load any files from the project/ directory, despite his clear indications.
  • Chatbot Listing of Unwanted Files: @slono further noted that when asked for information about files in the directory, the chatbot listed all the go.sum files which extended over several continuation steps.
  • Filtering Out Unwanted Files: To tackle the listing of unnecessary files like go.sum and package.lock, @slono jestingly mentioned a shellscript context search they created that filters out these files.

▷ #llm-paper-club (1 messages):

  • Guardrails Discussion and Resources: @ayenem is currently reading a paper about guardrails from arxiv.org and offered to present it if there’s interest in the topic. They are also seeking additional resources regarding chatbot guardrails.

LLM Perf Enthusiasts AI Discord Summary

  • Discussion regarding the use of Azure, with @rabiat expressing interest and inquiring about Azure’s rate limits, and @robotums sharing from experience that streaming speed slows after the first tokens.
  • Dialogue on potential community expansion, with @jeffreyw128 suggesting whether to increase the size of the community or maintain the current state, noting the lengthy list of potential members in waiting.
  • Proposal by @jeffreyw128 to incorporate a GPT4-powered mechanism for daily summarizations of community discussions, aiming to streamline the process of keeping members informed on active conversations.

LLM Perf Enthusiasts AI Channel Summaries

▷ #gpt4 (2 messages):

  • Azure Usage and Rate Limits: User @rabiat expressed an intention to use Azure and inquired about its rate limits.
  • Performance of Azure: @robotums responded that the rate limits on Azure are the same for them and shared their experience of streaming speed slowing after the first tokens.

▷ #feedback-meta (3 messages):

  • Discussion on Community Expansion: @jeffreyw128 brought up the question of whether the community should invite more members or keep the current size, with him noting a large backlog of prospective members awaiting admission.
  • Proposal for Using GPT4 for Daily Summarizations: @jeffreyw128 also floated the idea of employing GPT4-powered mechanisms to generate daily summarizations of the community’s activities, to help members keep up with ongoing discussions and gain insights in a more streamlined manner.

Skunkworks AI Discord Summary

Only 1 channel had activity, so no need to summarize…

  • Evaluation of Chinese Chatbots: User @strangeloopcanon asked for insights about a benchmark evaluation done with Chinese Language Models, specifically those going beyond just running against MMLU.
  • Qwen 72B Performance: @sentdex mentioned their experience with Qwen 72B, stating its effectiveness in informational type questions and general Q&A. While it has initial weaknesses in instruction following, @sentdex suspects a properly constructed prompt could improve its performance. Interestingly, @sentdex stated a preference for Qwen 72B over Mixtral.

The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Ontocord (MDEL discord) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI Engineer Foundation Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Perplexity AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The YAIG (a16z Infra) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.