
[TOC]
OpenAI Discord Summary
- Discussion about hardware for AI, with mentions of the new Mac racks being favorable for consumers and the A6000 being a good value for its price. (@exx1 & @birdetta)
- Discussion of the advantages of using the Bard API for Gemini Pro, including its free quota of 60 queries per minute, plus exploration of the costs of OpenAI’s API and its merits for experimentation. (@thepitviper & @webhead)
- A shared comparison video of Claude 2.1 and GPT-4 Turbo performance on coding tasks. The conversation included judgments of their respective competences, with GPT-4 Turbo outperforming Claude 2.1. (@rrross)
- User experience discussions regarding ChatGPT Plus, varied perspectives on GPT-4, and conjecture about the upcoming GPT-5. An alleged GPT-4.5 was contradicted by members, and a podcast reference fueled speculation about a gradual progression to GPT-5. (@auracletech, @exx1, @satanhashtag, @moshymello)
- Numerous user inquiries related to technical issues with OpenAI products: card payment problems, translating a large JSON file with ChatGPT, difficulty upgrading an organization account, accidentally archived chats, and trouble resetting passwords. Clarifications about whether server names affect ChatGPT usage limits, and a query about what constitutes one GPT-4 use, were also brought forward.
- Further haze surrounded the Alpha feature’s status due to unanswered inquiries, and questions about GPT-4 model usage in GitHub Copilot likewise went unanswered.
- Noted problems with broken images, uncertainty surrounding the release of GPT-5, and dissatisfaction with the reduced number of messages in the ChatGPT Plus membership.
- Debates about the policy on AI art generation, the importance of depicting diverse people in AI-generated scenes, and a query on limiting output tokens in a chat completion call. Additionally, one member noted significant feature additions to an organization’s work.
OpenAI Channel Summaries
▷ #ai-discussions (48 messages🔥):
- Mac Racks and A6000 for AI: @exx1 and @birdetta had a brief discussion about new Mac racks being advantageous for consumers and the A6000 being decent for its price, respectively.
- Gemini Pro and Bard API Usage: @thepitviper mentioned that the Bard API for Gemini Pro is free for 60 queries per minute, which encourages usage, and that despite the cost of OpenAI’s API, it’s fun to experiment with. @webhead countered the cost concern, noting that it’s relatively cheap and that they go through a million tokens for just under $40 a month.
- Alpha Feature Confidentiality: @wintre asked about the status of the Alpha feature, whether it’s still confidential or removed.
- Comparison of Claude 2.1 and GPT-4 Turbo for Coding: @rrross shared a brief comparison video of Claude 2.1 and GPT-4 Turbo on coding tasks. The discussion continued with various insights, such as Claude 1 being rated higher than Claude 2 and 2.1; in the shared comparison, GPT-4 Turbo performed better than Claude 2.1.
- Functioning of Bard and Claude 2: @eljajasoriginal noted that Bard seems to hallucinate less and performs better than Claude 2, especially for tasks requiring internet search. However, with the decrease in the number of requests one can make, they hardly use Claude 2 anymore.
▷ #openai-chatter (117 messages🔥🔥):
- ChatGPT Plus Membership and Payment Issues: @mrcrack_ experienced issues with his ChatGPT Plus subscription not being recognized; the problem traced back to his card and was resolved when he switched to a different one. Another user, @5sky, had their card declined for ChatGPT services.
- Experience and Expectations of GPT-4 and GPT-5: Members including @auracletech and @exx1 debated the performance of GPT-4 and expectations for GPT-5. Some users felt GPT-4 was underwhelming, whereas others appreciated its feature advancements. There were also rumors and discussions about a supposed GPT-4.5, but @satanhashtag and @exx1 clarified that these were based on fake screenshots, while @moshymello mentioned an indirect reference by Sam Altman in a podcast to a gradual progression toward GPT-5.
- User Experience with ChatGPT: @yethaplaya sought advice for getting OpenAI’s models to generate more complex code. @smoosh_laboosh noted that DALL-E 3 only provides one image per prompt. @arrushc inquired about a plugin to better organize and filter conversation history with ChatGPT. @jeweis mentioned an error message while using GPT-3.5 despite being a ChatGPT Plus member.
- OpenAI API and Voiceflow: @zevv_. asked for resources on learning how to connect and use the Assistants API with Voiceflow. @Teemu suggested YouTube tutorials by Voiceflow and their pre-built template.
- Converge 2 Inquiry: @soulztundra asked if anyone knows the age or institutional requirements for Converge 2.
▷ #openai-questions (30 messages🔥):
- Tier upgrade issue: @superseethat expressed concern that their organizational account had not been upgraded to the third rate-limit tier despite increased usage, a charged account, and submitted support tickets, and sought a way around the issue.
- Unarchiving archived chats: @martinrj sought help to unarchive accidentally archived chats, with @lumirix pointing them to the ‘Archived Chats’ section under ‘General’ in ‘Settings’.
- Translating a large JSON file with ChatGPT: @youseif asked if there is a way to translate a big JSON file using ChatGPT (a rough sketch of one common approach appears after this list).
- Password reset problem: @marnold0015 asked for assistance because the password-reset email wasn’t arriving in their inbox, with @satanhashtag recommending the use of help.openai.com and emailing the support department.
- Prompts for writing larger codebases: @yethaplaya looked for good custom instructions for longer and more complex code, seeking to bypass the bot’s default reluctance to generate such code.
- Unusual system activity error: @jeweis mentioned an error message about unusual system activity when using GPT-3.5, though GPT-4 worked fine.
- ChatGPT 4 or 3.5 usage limit: @kingchengge asked whether the server names ‘ChatGPT 4’ or ‘3.5’ matter with regard to the limit of 40 messages per 3 hours.
- GPT usage specification: @solbus clarified that all custom GPTs are GPT-4 and count toward the user’s cap, sharing a link that specifies what counts as one GPT-4 use.
- Billing inquiries: Both @avnova and @theblack.sage brought up potential billing issues, particularly upgrading to the next tier and general billing assistance.
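No answer to the JSON-translation question was recorded in the channel, but the usual approach is to walk the JSON structure and translate only the string values via the API, preserving keys and structure. A minimal sketch, assuming the openai v1 Python client; the model name and prompt wording are illustrative, not from the discussion:

```python
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def translate_values(obj, target="French"):
    """Recursively translate string values, leaving keys and structure intact.
    Naive one-call-per-string version; batch strings for large files."""
    if isinstance(obj, dict):
        return {k: translate_values(v, target) for k, v in obj.items()}
    if isinstance(obj, list):
        return [translate_values(v, target) for v in obj]
    if isinstance(obj, str):
        resp = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user",
                       "content": f"Translate to {target}; reply with only "
                                  f"the translation: {obj}"}],
        )
        return resp.choices[0].message.content
    return obj  # numbers, booleans, and null pass through unchanged

data = json.loads('{"greeting": "Hello", "items": ["cat", "dog"]}')
print(json.dumps(translate_values(data), ensure_ascii=False))
```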
▷ #gpt-4-discussions (28 messages🔥):
- Broken Images Issue: @wehodl experienced an issue with broken images and shared the related forum discussion. They hadn’t found a resolution yet.
- Query About GPT-4 Model in GitHub Copilot: @blueberryberry asked if GitHub Copilot uses the GPT-4 model. A direct answer was not provided in the reported messages.
- Concern about Alpha Feature: @wintre asked about the status of the Alpha feature, whether it’s still confidential or removed. This remained unanswered in the given messages.
- Discussion About GPT-5 Release: Users including @sinize, @pudochu, and @sxr_ discussed the possible release of GPT-5. The general consensus suggested its release might not happen soon, given issues with current models such as GPT-4 Vision and Turbo and the challenge of providing optimal service to so many users. No official announcement or link was given in this regard.
- Concern Over Message Limit in ChatGPT Plus Membership: @pudochu expressed dissatisfaction over a decrease in the number of messages allotted every 3 hours under the ChatGPT Plus membership, the limit dropping from 50 to 40.
▷ #prompt-engineering (5 messages):
- Limiting Output Tokens of a Chat Completion Call: @slobby_knobby_corny_cobby asked for suggestions on how to limit the number of output tokens from a chat completion call through the API without the message being cut off (see the sketch after this list).
- AI Art Generation & Attribution Policy: @mysticmarks1 voiced concerns about the policy on generating AI art. They argued that the rule against creating images in the style of artists whose latest work is post-1912 creates bias, and suggested a workaround: substitute the artist’s name with three adjectives that capture key aspects of their style, attach an associated artistic movement or era for context, and note their primary medium.
- Diverse Representation in AI-Generated Scenes: @mysticmarks1 critiqued the policy of diversifying depictions of people in AI-generated scenes by specifying descent and gender for each person. They argued that such a rule could misrepresent reality, questioning, for example, its applicability in a scene from an ancient Aztec kitchen.
- Addition of New Features to an Organization’s Work: @mysticmarks1 shared that they had significantly expanded the features of an unspecified organization’s work, adding robust capabilities.
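No definitive answer to the token-limit question was recorded. A common approach is to cap the completion with max_tokens while also stating the budget in the prompt, since the cap alone simply truncates. A minimal sketch, assuming the openai v1 Python client; the model and prompt wording are illustrative:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# max_tokens hard-caps the reply length but does not make the model plan
# for it, so also ask for brevity in the prompt to avoid mid-sentence cuts.
resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    max_tokens=100,
    messages=[{"role": "user",
               "content": "In at most 60 words, explain what a token is."}],
)
print(resp.choices[0].message.content)
print(resp.choices[0].finish_reason)  # "length" means the cap truncated it
```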
▷ #api-discussions (5 messages):
- Limiting Output Tokens in API Call: @slobby_knobby_corny_cobby asked for suggestions on how to limit the number of output tokens from a chat completion call via the API without the message being cut off.
- Concerns about Image Generation Policy: @mysticmarks1 expressed concerns about current image-generation policies, citing potential issues with the guidelines for recreating artists’ styles and diversifying depictions of people. A particular point of contention is the policy restricting recreation of the styles of artists whose latest work is post-1912.
- Description Diversity and Realism: @mysticmarks1 pointed out a perceived contradiction in the policy guidelines, noting the challenge of balancing realistic representations (e.g., not having all members of a given occupation be the same gender or race) with the requirement for diversity in descriptions. They questioned how the rule would apply to historical or contextual scenarios such as an “ancient Aztec kitchen”.
- Revised Work: @mysticmarks1 claimed to have rewritten and expanded the organization’s work to introduce robust features and capabilities, implying dissatisfaction with the current guidelines. No specific information was provided about the nature of these improvements.
Mistral Discord Summary
- Mistral AI and Open Source Models: Active conversations about the effectiveness, choice, and usage of the various models Mistral AI offers. References were made to the Medium and Mixtral models, along with supporting links to the discussions, Twitter, and GitHub. A specific focus was placed on parameter tuning for Mistral’s performance.
- Building a Chatbot UI: Suggestions for open-source chatbot-development solutions were posted, with links to the Chatbot-UI and OpenAgents GitHub repositories.
- Comparison of AI Model Performance: Comparisons were drawn between various AI models, including the Medium and Mixtral models. An “AI Hack Night” event was organized to explore applying Mistral’s LLM with the Assistants API. A perceived decline in benchmark performance was attributed to the need for an intricately version-controlled API system.
- Fine-tuning within Mistral AI: Discussions explored whether the current API supports fine-tuning; the answer was that it does not.
- Contributions and Projects: A link to a NuGet package for using the Mistral AI platform from .NET was shared. Calls for collaboration on a project were made, with an open attitude toward ideas and modifications. Discussions also covered early exploration of making calls to the Mistral API.
- Quantization and Platform Integrations: An explanation of the quantization process was offered. Some users expressed a desire for a token-usage-tracking feature, with confirmation that such a feature is in the works. Questions were raised about platform integrations and the number of experts used by the Mistral Medium API.
Mistral Channel Summaries
▷ #general (180 messages🔥🔥):
- Mistral API and Open Source Models: @lee0099, @cyborgdream, and @marketingexpert debated their preferences and options for using different models via the API, including the Medium and Mixtral models. Link to discussion
- Creating a Chatbot UI: @cyborgdream suggested open-source UI solutions for developing a chatbot, providing links to the GitHub repositories of Chatbot-UI and OpenAgents.
- API Usage and Waitlist: @marketingexpert expressed eagerness to use the Mistral API and voiced frustration over the waitlist. @tlacroix_ explained the careful load-management process used to ensure a good user experience.
- Mistral Models Performance: @jamsyns shared their experience of finding Mistral-medium more efficient at coding than ChatGPT 4 and provided a Twitter link for reference.
- AI Hack Night using Mistral LLM with Assistants API: @louis030195 organised an AI hack night event focused on using Mistral’s LLM with the Assistants API. A link to the event was shared.
▷ #models (10 messages🔥):
- Mistral Experts Specialize Along Non-human Lines: @reguile commented that Mistral’s experts seem to specialize along non-human lines such as nouns and punctuation.
- Question about Mistral’s Fine-Tuning: @pdehaye asked if Mistral’s 8 experts are fine-tuned using human-curated datasets, inquiring about the human logic behind that decision.
- Performance Tuning in Mistral: @svilupp found through experimentation that altering top-p and temperature settings can nearly replicate GPT-4 performance, observing nearly a 30-point increase over default parameter settings on their benchmark. However, they noted a major performance drop after the Mistral-medium API was seemingly changed, erasing their expected gains.
- Performance Comparison: @svilupp further mentioned that while they like Mistral’s responses, there is a noticeable difference in consistency and handling of more complex prompts compared to [GPT-3.5-turbo-1106](https://github.com/svilupp/Julia-LLM-Leaderboard), and even Mistral-small.
- Version Control Challenge in LLM Serving: @casper_ai reinforced the difficulty of serving large language models, given the need for an intricately version-controlled API system for reproducible results. This requirement comes at a cost, which can lead to a decline in benchmark performance.
▷ #ref-implem (1 message):
- Integration of Jinja Templates in Projects: @titaux12 expressed interest in having a “working-out-of-the-box” Jinja template to help projects run multiple LLMs seamlessly. The request comes in the context of work on supporting multiple models within privateGPT, where Jinja templates are seen as an efficient and elegant way to manage the varying prompt formats different LLMs require (a rough sketch follows this list). Relevant work is outlined in a GitHub pull request.
- Comparison with vLLM Work: @titaux12 mentioned that similar work is being undertaken by the vLLM team, who are seen as further along since they already employ these files.
- Model Output Format Compliance: @titaux12 indicated they will adapt to ensure their model delivers the output format required by the other user’s model, namely maintaining the integrity of the provided token representation, with 1 (BOS) as the first token and no duplication or encoding as a different token.
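To illustrate the idea (a hypothetical sketch, not the template from the pull request), a per-model prompt format can be expressed as a small Jinja template and rendered with jinja2; the template text and special tokens below are assumptions:

```python
from jinja2 import Template

# Hypothetical instruct-style chat template; real templates shipped with a
# model (e.g. in its tokenizer config) may differ in wording and tokens.
CHAT_TEMPLATE = (
    "{{ bos_token }}"
    "{% for m in messages %}"
    "{% if m['role'] == 'user' %}[INST] {{ m['content'] }} [/INST]"
    "{% else %}{{ m['content'] }}{{ eos_token }}{% endif %}"
    "{% endfor %}"
)

prompt = Template(CHAT_TEMPLATE).render(
    bos_token="<s>",
    eos_token="</s>",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(prompt)  # <s>[INST] Hello! [/INST]
```

Swapping the template per model leaves the calling code unchanged, which is the appeal for multi-LLM projects like privateGPT.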
▷ #finetuning (3 messages):
- Fine-tuning Capabilities in the Current API: @jamiecropley asked whether fine-tuning can be done with the current API. @lerela confirmed that the current API does not offer fine-tuning, and encouraged users to share their use cases with Mistral AI’s support email.
▷ #showcase (10 messages🔥):
- .NET Library for the Mistral AI Platform: @vortacs shared a link to the .NET library he developed for the Mistral AI platform, with both the list-models and chat endpoints implemented. The package is available on NuGet.
- Looking for Collaborators: @tonic_1 expressed interest in finding collaborators for a project, and is also open to others using the idea and making modifications.
- Contributions on GitHub: Both @aha20395 and @fayiron shared that they have not programmed or contributed anything on GitHub for a long time. Despite that, @aha20395 expressed support and willingness to help with non-coding aspects.
- Exploring the Mistral API: @tonic_1 mentioned being at the early stage of exploring how to make calls to the Mistral API.
▷ #la-plateforme (17 messages🔥):
- Comparison between Mistral-tiny and GGUF Mistral-7B-Instruct-v0.2 Performance: @tarruda finds the performance of Mistral-tiny significantly superior to the Q8 GGUF version of Mistral-7B-Instruct-v0.2 on Hugging Face. The tests were performed with temperature 0 on both, and the prompts can be found in the linked benchmark.
- Discussions on Quantization: @someone13574 and @titaux12 explained the basic idea of quantization. According to them, in llama.cpp, Q8_K groups parameters into blocks of 256 and finds a weight scale for each block; weights are stored as 8-bit integers that are multiplied by the scale to recover the original weights (with some error). A toy sketch of the idea appears after this list.
- Request for Usage Tracking: @phinder.ai requested a feature to track token usage. @tlacroix_ confirmed the feature will be added in the next week.
- Questions about Platform Integrations: @subham5089 inquired about a LlamaIndex integration with the platform.
- Question about the Number of Experts in the Mistral Medium API: @timotheeee1 asked whether the Mistral Medium API uses 2 or 3 experts, speculating that increasing to 3 experts might improve the model’s performance.
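A toy numpy illustration of that block-quantization description (illustrative only; llama.cpp’s actual Q8_K layout and rounding differ in detail):

```python
import numpy as np

def quantize_q8_blocks(weights: np.ndarray, block_size: int = 256):
    """Group weights into blocks of 256 and store one scale per block,
    keeping the weights themselves as 8-bit integers."""
    blocks = weights.reshape(-1, block_size)
    # Pick each block's scale so its largest magnitude maps to the int8 range.
    scales = np.abs(blocks).max(axis=1, keepdims=True) / 127.0
    q = np.round(blocks / scales).astype(np.int8)
    return q, scales

def dequantize(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    # Multiply the int8 values by their block scale to approximate
    # the original weights (with some error).
    return (q.astype(np.float32) * scales).reshape(-1)

w = np.random.randn(4096).astype(np.float32)
q, s = quantize_q8_blocks(w)
print(np.abs(dequantize(q, s) - w).max())  # small per-weight error
```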
Nous Research AI Discord Summary
- Discussions around AI model choice and preference: Users shared their approaches and preferences for AI models and embeddings, such as GTE-small and Jina. An upcoming option to embed the entire Arxiv database was referenced, although concerns about the effectiveness of mixing embeddings were expressed. Fine-tuning and interview models were also discussed, outlining the potential and challenges of different model training strategies.
- Release of user-generated AI models: The Metis 0.1 model for reasoning and text comprehension was released on Hugging Face, along with notes about downloading, quantizing, and evaluating it. The first Phi-2 GGUF also surfaced and was announced.
- Acquisition and application of GPU resources: The discussion featured interest in acquiring MI300X GPUs, and a blog post on running Nvidia SXM GPUs in consumer PCs was linked.
- Interest in peer-to-peer (P2P) and distributed compute networks: Several resources were shared, including Petals, Bacalhau/Filecoin, Bittensor, Hyperspace, Akash, and Shoggoth Systems.
- Updates on AI industry news: It was noted that OpenAI suspended ByteDance’s account for violating the developer licenses, and Microsoft’s announcement of the public availability of GPT-4 Turbo with Vision on Azure OpenAI Service was raised for discussion.
- Inquiries on fine-tuning and operational aspects of open models like Mistral and OpenHermes, with specific questions about recovering logprobs and measuring text coherence. Conversations showcased the dynamic nature of fine-tuning models, considering ideas like a <backspace> token or models that self-correct. The Mamba chat repository and related research papers were shared in the discourse on state-space model architectures.
Nous Research AI Channel Summaries
▷ #off-topic (21 messages🔥):
- Choice of AI Model and Approach: @adjectiveallison expressed confusion over the many available AI models and inference options. @lightningralf recommended waiting if possible, mentioning an upcoming option to embed the entire Arxiv database, while also expressing concerns about the effectiveness of mixing embeddings.
- Differing Model Uses: @natefyi_30842 shared their preference for the gte-small model and suggested that larger embeddings like Jina are better for larger projects, such as books.
- Fine-Tuning and Interview Models: @a.asif is considering fine-tuning a model to act as an interviewer, asking questions based on provided scenario cards. @.beowulfbr suggested using system prompts or the RAG method if the scenarios fit within the context window. However, @a.asif believes that, given the large number of scenarios, fine-tuning might be the better approach and would yield a model suited to the specific use case.
- GPU: @dragan.jovanovich inquired about avenues to purchase MI300X GPUs.
- Embedding Techniques: @nikhil_thorat shared their approach of using gte-small at the sentence/page level and Jina for full-document embeddings, highlighting Jina’s effectiveness for clustering.
▷ #interesting-links (17 messages🔥):
- Metis 0.1 Model Fine-tune: @mihai4256 announced a 7B fine-tuned model, Metis 0.1, for reasoning and text comprehension, now available on Hugging Face. The model is aimed not at storytelling but at reasoning and text-comprehension tasks, and was trained on a private dataset rather than the MetaMath dataset.
- Downloading & Quantizing Metis 0.1: @.benxh confirmed downloading and quantizing the Metis 0.1 model for testing. They expressed a preference for Q6 for 7B models due to its high quality and very fast performance, while @mihai4256 said they use Q8 GGUF for its high sampling speed when running in VRAM.
- Adapting Nvidia SXM GPUs for Consumer PCs: @giftedgummybee shared a blog post on running Nvidia SXM GPUs in consumer PCs and third-party servers as alternatives to consumer-grade GPUs, making datacenter-class GPU performance attainable.
- Use of ChatML for Inference Libraries: @.benxh suggested that inference libraries adopt ChatML in future, stating it’s easier to use and makes models more accessible.
- First Phi-2 GGUF: @tsunemoto announced the first Phi-2 GGUF, a version of Microsoft Phi-2 quantized to 4_0 and 8_0 bits and converted to a GGUF FP16 model, clarifying that it won’t work unless a specific fork with modifications is used. @nonameusr expressed their intention to test the model.
▷ #general (41 messages🔥):
- P2P and Distributed Compute Networks: @yikesawjeez shared a variety of resources and information about peer-to-peer (P2P) and distributed compute networks, including Petals, Bacalhau/Filecoin, Bittensor, Hyperspace, Akash, and Shoggoth Systems.
- Issues with ChatGPT: @.interstellarninja mentioned problems with ChatGPT similar to those previously encountered with Llama-2, specifically that it failed to help with terminating processes.
- Experimentations with Mixtral: @nonameusr reported that they have been testing Mixtral and finding it entertaining.
- Comparison of Models: @shane_74436 asked @879714655356997692 for comparative metrics on base Mixtral, Mixtral’s instruct version, and the model they built. They noted mixed performance on TGI (Text Generation Inference), but were unsure whether the platform, their prompting, or the model was responsible.
- News Updates: @atgctg shared an article from The Verge noting that OpenAI has suspended ByteDance’s account for using GPT-generated data in violation of Microsoft and OpenAI’s developer licenses. @lightningralf commented that this might prompt Chinese buyers to turn to Mistral instead. @giftedgummybee shared a tweet by Greg Brockman, expressed confusion about the paper it discussed, and later flagged a Microsoft announcement about GPT-4 Turbo with Vision being publicly available on Azure OpenAI Service, asking @550656390679494657 if they had seen it.
- Request for Help with Fine-tuning Open Models: @realsedlyf asked if anyone had resources or ideas about fine-tuning open models like Mistral and OpenHermes, specifically for function-calling datasets.
▷ #ask-about-llms (11 messages🔥):
- State-Space Model Architectures: @vincentweisser initiated a discussion on state-space model architectures like Mamba and shared a link to the GitHub repository of Mamba chat, a chat LLM built on a state-space model architecture, along with a related research paper on the topic.
- Fine-tuning Models: @jason.today proposed fine-tuning models to give them an inherent capability to correct their outputs, such as re-reading and correcting themselves and taking generalized commands like insert and delete. @atgctg countered that this might make the model always output incorrect tokens at first, and mentioned a related research paper that introduces the use of a <backspace> token.
- Availability of Compute Resources: @yikesawjeez offered to share computing resources they recently acquired, specifically targeting open-source projects or endeavors contributing to humanity or creating quality memes.
- Recovering Logprobs from Open Source Models: @adjectiveallison inquired about the standard approach for recovering logprobs from open-source models, via specific inference providers or locally, to which @atgctg responded that the platform Together supports this feature (a sketch appears after this list).
- Measuring Text Coherence: @xela_akwa asked if there are ways or benchmarks to measure the coherence of a text.
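As a rough sketch of what requesting logprobs looks like against an OpenAI-compatible endpoint (the base URL, model name, and exact logprobs parameters below are assumptions; Together exposes such an endpoint, but supported parameters vary by provider, so check their docs):

```python
from openai import OpenAI

# Hypothetical OpenAI-compatible endpoint and model name.
client = OpenAI(base_url="https://api.together.xyz/v1", api_key="...")

resp = client.chat.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    messages=[{"role": "user", "content": "The capital of France is"}],
    max_tokens=5,
    logprobs=True,    # ask for per-token log probabilities
    top_logprobs=3,   # and the top alternatives at each position
)
for tok in resp.choices[0].logprobs.content:
    print(tok.token, tok.logprob)
```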
OpenAccess AI Collective (axolotl) Discord Summary
- Continued dialogue around the fine-tuning of Phi-2 Base, with @noobmaster29 sharing a tweet suggesting its potential for further development. This led to discussion of other preferred LLM bases, including Mistral 7B and Hermes.
- Detailed discussion of the technical aspects and performance of AMD’s new MI300x, and its competitive edge over Nvidia’s H100 in inference speed, shared by @casper_ai. This sparked further conversation about potentially extending ROCm support to AMD’s consumer cards.
- Group troubleshooting of HF Transformers, with solutions shared within the community. Notably, updates to NCCL and Nvidia components were reported to resolve some hangs during training. A Transformers PR was opened and shared within the guild.
- Active discussion of different training configurations, with insights revealed about an FFT (full fine-tune) of Mixtral. Explorations of a potential fused-MLP implementation for Mixtral’s experts surfaced, with the proposed memory savings a noticeable topic of interest.
- Consultations on issues such as the functioning of a pod and the double-EOS-token issue. Stopping containers from running unnecessarily after a self-contained training task was also a pressing topic, with proposed use of Runpod’s native API keys and suggested callback implementations for task termination. A question about viewing images without a base URL was raised but remained unanswered.
OpenAccess AI Collective (axolotl) Channel Summaries
▷ #general (8 messages🔥):
- Phi-2 Base for Fine-tuning: @noobmaster29 shared a tweet by Sebastien Bubeck suggesting that Phi-2 is a really good base for further fine-tuning: they fine-tuned on 1M math exercises and tested on a recent French nationwide math exam with encouraging results.
- Preferred LLM Base Model: @jovial_lynx_74856 surveyed the collective’s preferred LLM base, mentioning Mistral 7B and Hermes as options.
- AMD vs Nvidia Inference Speed: @casper_ai notified the group of AMD’s new MI300x, which allegedly outperforms Nvidia’s H100 on inference speed while costing 50% less, and included a link to the relevant AMD community post.
- ROCm Support for AMD Consumer Cards: @le_mess expressed a desire for ROCm support to extend to AMD’s consumer cards.
- Mistral Medium Benchmarks: @dangfutures inquired whether any benchmarks were available for Mistral Medium. @le_mess responded that some benchmarks are available on the company’s website.
▷ #axolotl-dev (49 messages🔥):
- Troubleshooting with HF Transformers: @hamelh mentioned an issue with HF Transformers. @caseus_ provided a link to a GitHub PR reported to fix the issue.
- Updates on NCCL and Nvidia: @richbrain shared that after updating NCCL to 2.19.3 and Nvidia to 23-10, they no longer experienced hangs after 1 hour of training time. The update was well received by the community.
- PR Request for Transformers: @caseus_ requested that a PR be opened pinning the new Transformers version in the requirements files. @richbrain fulfilled the request, offering a link to the PR.
- Discussing Training Configurations: @yamashi and @casper_ai discussed @richbrain’s FFT-of-Mixtral training configuration. @richbrain revealed they were running on 2 nodes with H100 GPUs and a sequence length of 2048.
- Potential Implementation of FusedMLP: @caseus_ and @casper_ai discussed the projected memory savings of implementing a fused MLP for Mixtral’s experts. @casper_ai estimated 8GB of VRAM saved, based on the 1GB saving observed for the Llama 7B MLP.
▷ #general-help (4 messages):
- Pod Running Issue: @dangfutures inquired about potential issues with the functioning of a pod, unsure whether it was buggy.
- Double EOS Token Solution: @noobmaster29 suggested a fix for the double-EOS-token issue: make changes in the sharegpt.py file, specifically removing the ‘sep style’ and adding a stop_str value.
▷ #runpod-help (6 messages):
- Using RUNPOD_API_KEY and RUNPOD_POD_ID to Control Runpod: @caseus_ explained that Runpod provides environment variables (RUNPOD_API_KEY and RUNPOD_POD_ID) which can be used to make API calls to control the pod, potentially to shut it down. They also posted a link to Runpod’s “Stop Pod” documentation and shared an example curl command, noting that “stop” isn’t the same as “terminate”.
- Configuring a Runpod Container to Exit after Task Completion: @_jp1_ wanted a Runpod container to exit as soon as a self-contained training task completed, as Runpod’s default behavior appears to be to restart the container indefinitely. Because running the container is costly, they were looking for a solution that wouldn’t require an external server or service to monitor the job and stop the container.
- Suggestion for Automatically Stopping Runpod after Task Completion: In response to @_jp1_, @caseus_ suggested setting up a callback at the end of training that shuts down the container once the task is complete, possibly with a timer to ensure the final checkpoint is uploaded before shutdown (a sketch of such a callback follows this list).
- Viewing Images without a Base URL: @mustapha7150 asked how to view an image when the output did not contain a base URL. This question wasn’t answered in the provided messages.
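A sketch of what such a shutdown callback might look like, using the environment variables @caseus_ mentioned and assuming the GraphQL mutation shape from Runpod’s “Stop Pod” documentation (verify against the docs; stopping is, again, not the same as terminating):

```python
import os
import requests

def stop_current_pod() -> None:
    """Best-effort self-shutdown for the pod this script runs on.
    Call it only after the final checkpoint has been uploaded."""
    api_key = os.environ["RUNPOD_API_KEY"]
    pod_id = os.environ["RUNPOD_POD_ID"]
    mutation = (
        'mutation { podStop(input: {podId: "%s"}) { id desiredStatus } }'
        % pod_id
    )
    requests.post(
        f"https://api.runpod.io/graphql?api_key={api_key}",
        json={"query": mutation},
        timeout=30,
    )

# e.g. at the end of a training script:
# train(); upload_final_checkpoint(); stop_current_pod()
```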
HuggingFace Discord Summary
- Technical issues addressed involving AI model training, Gradio’s API server, and FastAPI execution with KoalaAI/Text-Moderation. Specific references include a HuggingFace tutorial and a GitHub issue related to the FastAPI error.
- Active dialogue about AI model performance and implementations, including use cases, performance characteristics, and comparisons. Models discussed extensively include the CogVLM model, Phi-2, deepseek-coder, and go-bruins-v2.1.1.
- Shared projects and demonstrations, like a monetized speech-to-text app using a Web3 API gateway (YouTube link), Diffusion-Cocktail’s demo, and a link on generating hyper-realistic faces with AI.
- Solicitations of help and advice on projects, such as converting PHP projects to Laravel using models like CodeGen2.5 and StarChat Beta, converting Animate Diff .ckpt files to .bin files, and creating datasets from raw song vocals and timed lyrics.
- Spotlight on the importance of hardware resources, especially a high-VRAM GPU, for successful model inference.
HuggingFace Discord Channel Summaries
▷ #general (26 messages🔥):
- Diffusion Models Training Issue: User
@everythingingreports generated noise images when attempting to train on their own images following a tutorial on HuggingFace’s official tutorial. They question if there is a minimum size requirement for the dataset. - API Connection Issue: User
@paghkmanidentified that Gradio client library’s API link is currently down and returning an error. - Discussions on Multimodal models: Users
@doctorpanglossand@asrielhanpraise the effectiveness of the cogvlm model as a multimodal model. Moreover, user@asrielhanalso mentioned another new model here. - Discussion on Training Models:
@vipitisshared their experiences with model Phi-2, stating it performs well only on the domains it’s trained on and shared a comparison here. - Code Conversion Project:
@pyr0t0nasks about a project aimed at converting complete programming projects from PHP to the Laravel Framework for PHP. User@merve3234responds by suggesting to check out code instruction models like CodeGen2.5 and StarChat Beta.
▷ #today-im-learning (1 message):
nixon_88316: Hello! Is there anyone to add any model to stabilityAI?
▷ #cool-finds (1 message):
- Generating Hyper-Realistic Faces with AI: @kingabzpro shared a link to a KDNuggets blog post detailing three ways to generate hyper-realistic faces using AI, mainly through prompt engineering, the Stable Diffusion XL model, and a custom model from CivitAI. The post offers solutions for those struggling to generate quality AI images that often end up full of glitches and artifacts.
▷ #i-made-this (7 messages):
- Model Generation Improvements: @vipitis reported improved results after fixing the post-processing for model generation. They praised the performance of deepseek-coder, expressing eagerness to run its larger variants.
- Web3 API Application: @dsimmo designed a monetized speech-to-text application using a Web3 API gateway and shared a YouTube link to inspire others to create similar applications.
- New Model Release: @rwitz_ announced a new model release, go-bruins-v2.1.1, claiming it to be the top-rated model of any size on the leaderboard as of 12/16/2023. The model was DPO-trained on Intel/orca_dpo_pairs.
- Image Content/Style Manipulation Method: @.ricercar proposed a training-free method for image content/style manipulation and shared a demo illustrating the integration of pre-trained DMs and LoRAs.
- Demo Feedback and Troubleshooting: @jo_pmt_79880 experienced issues with the image output in @.ricercar’s demo, with images coming back empty. @.ricercar asked the user to refresh the browser and try again.
▷ #reading-group (1 message):
pcuenq: That’d be great <@582573083500478464>!
▷ #diffusion-discussions (2 messages):
- Convert Animate Diff .ckpt to .bin Files: @jfischoff asked if there’s a way to convert the Animate Diff .ckpt file to .bin files that diffusers motion adapters understand.
▷ #computer-vision (1 message):
vision12: Hey there, does anyone know how can i convert timm model weights to keras weights
▷ #NLP (3 messages):
- GPU Requirements for Inference: @nerdimo suggested that a strong GPU with a high amount of VRAM is essential for inference; otherwise it can be problematic.
- FastAPI Execution Error with KoalaAI/Text-Moderation: @marshy.dev reported a RuntimeError when running KoalaAI’s Text-Moderation model through FastAPI. They shared a link to a related GitHub issue and provided the code for their current approach (a generic serving sketch appears after this list).
- Creating Datasets from Song Lyrics with Timestamps: @sushi057 asked for help creating datasets from raw song vocals to be mapped to timed lyrics for their model.
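The cause of the specific RuntimeError wasn’t resolved in the channel, but the usual serving pattern loads the model once at startup rather than per request. A generic sketch (not a fix for the linked issue; the pipeline task name is an assumption):

```python
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()

# Load the model once at startup; re-creating it inside the request
# handler is a common source of RuntimeErrors and slow responses.
moderator = pipeline("text-classification", model="KoalaAI/Text-Moderation")

class Item(BaseModel):
    text: str

@app.post("/moderate")
def moderate(item: Item):
    # Returns the predicted moderation labels with scores.
    return moderator(item.text)
```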
▷ #diffusion-discussions (2 messages):
- Conversion from .ckpt to .bin for Diffusers Motion Adapters: @jfischoff inquired about possible methods to convert Animate Diff .ckpt files into .bin files that diffusers motion adapters can understand.
LangChain AI Discord Summary
- Users raised various requests for technical assistance, including advice on deleting documents from a Faiss store in Node.js, estimating the costs of hosting a RAG application on AWS, and inquiring about older versions of the LangChain API. Among current offers of help, @holymode offered to help @quantumqueenxox convert synchronous code to asynchronous code.
- An imminent OCR feature in Vectara, possibly beneficial for LangChain customers, was mentioned by @ofermend. Early access to the new feature was offered through a waitlist link.
- A broken link in langchain-ai’s GitHub release notes template was reported by @arborealdaniel_81024.
- @_johnny1984 sparked a conversation on potential security risks with AI, possibly related to a disclosed association with OpenBSD. They also expressed willingness to provide Ruby examples if needed.
- The community saw updates from two projects: @discossi shared a web-scraping feature for an AI chatbot in their llama-cpp-chat-memory GitHub repository, and @heydianaa announced a Discord bot that transforms images into dance videos, open for feedback at kineticpix.ai.
LangChain AI Channel Summaries
▷ #general (11 messages🔥):
- Deleting Documents from a Faiss Store in Node.js: @binodev asked how to delete documents from a Faiss store in Node.js. No solutions or further discussion have been provided yet.
- RAG Application Costs with AWS: @legendary_pony_33278 sought advice on estimating the cost of hosting a RAG application on AWS. They’re currently using Mixtral 8x7B or Llama 2 13B models along with a small vector DB. No suggestions have been provided yet.
- OCR Capability with Vectara: @ofermend announced an upcoming OCR feature in Vectara that could be useful for customers integrating it with LangChain, especially for processing scanned documents. They shared a waitlist link for those interested in early access.
- Inquiry about an Older LangChain API: @damianj5489 was looking for an older version of the LangChain API but could only find the newest version (0.0.350) through the provided link. No responses yet.
- Using AI to Help in Legal Matters: @_johnny1984 proposed using AI to help his girlfriend win a custody battle, though she was sceptical. No feedback or suggestions were given yet.
- Request for Help with Asynchronous Code: @quantumqueenxox asked for assistance converting their code to an asynchronous pattern, offering payment for the service. @holymode expressed willingness to help, with the details being discussed privately (a generic illustration of the sync-to-async pattern follows this list).
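The specifics were discussed privately, but for readers with the same question, the general shape of the conversion looks like this (a generic illustration using httpx, not @quantumqueenxox’s actual code):

```python
import asyncio

import httpx

# Synchronous original (blocks on every request):
#   def fetch(url): return httpx.get(url).text

async def fetch(url: str) -> str:
    # The async variant awaits the I/O instead of blocking the thread.
    async with httpx.AsyncClient() as client:
        resp = await client.get(url)
        return resp.text

async def main() -> None:
    # gather() runs the requests concurrently instead of one after another.
    pages = await asyncio.gather(
        fetch("https://example.com"),
        fetch("https://example.org"),
    )
    print([len(p) for p in pages])

asyncio.run(main())
```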
▷ #langchain-templates (1 messages):
- Broken Link in Release Notes: @arborealdaniel_81024 noted that a link provided in the release notes was broken. The link, https://github.com/langchain-ai/langchain/tree/master/templates/rag-chroma-dense-retrieval, supposedly led to a template mentioned in the notes, but points to a GitHub location that appears broken or non-existent.
▷ #share-your-work (5 messages):
- Security Concerns: @_johnny1984 initiated a discussion about the security of AI, speculating about the risks and implications of an AI capable of hacking systems.
- OpenBSD AI Development: @_johnny1984 also disclosed his affiliation with OpenBSD, suggesting he is working on creating a hacking-enabled AI.
- Web Scraping in AI Chatbot: @discossi shared an update about adding basic web-scraping functionality for document memory in his AI chatbot, providing a link to the project’s GitHub repository, llama-cpp-chat-memory.
- Pic-to-Dance-Video Bot: @heydianaa has developed a free-to-use Discord bot that converts pictures into dancing videos. They encouraged users to register, try it out, and provide feedback at kineticpix.ai.
▷ #tutorials (2 messages):
- Ruby Examples Offer: @_johnny1984 offered to provide Ruby examples upon request.
Alignment Lab AI Discord Summary
Only 1 channel had activity, so no need to summarize…
- Discussion on Phi-2 Training Efficiency: Members were curious about how Microsoft trained its Phi-2 model so efficiently. @autometa claimed the model is heavier on GPUs than Llama 13B and brought up the confusing efficiency claim of using 90-something A100s for 20 days.
- Speculations about Phi-2’s Efficiency: @benxh speculated that the strategies might include multiple epochs, synthetic data, and a Phi-1.5 classifier applied to the entire web.
- Concerns about Certain Restrictions: @damiondreggs and @lightningralf discussed some form of restriction being implemented, expressing surprise that it took so long for such measures to be taken.
DiscoResearch Discord Summary
Only 1 channel had activity, so no need to summarize…
- Discussion about Training and Quality of Sparsely Activated Models: @someone13574 suggested that during training, top-k perhaps isn’t limited, or there might be a small amount of leak. Later, @saunderez theorized that the quality drop in sparsely activated models could stem from most MLP (multi-layer perceptron) activations being zero in a sparse model.
- Running Code on Graphics Cards: @goldkoron stated that they used a combination of a 3090 and a 4060 Ti 16GB, equating to 40GB for their tasks. When asked about the codebase by @tcapelle, @goldkoron replied that they use Text Gen WebUI with exllamav2 on an experimental branch.
- Accuracy of Language Models: @alliebot3000, @nyxkrage, and @jiha discussed the imperfect accuracy of language models. @alliebot3000 mentioned how some models get extremely close but still miss the correct answer; @jiha pointed out that if a model isn’t correct, it’s simply wrong.
- Suggestions for Improving Language Model Responses: @nyxkrage proposed a potential improvement, suggesting that asking language models to give a final answer only at the end of their explanation might yield better results. @someone13574 responded that this tests a different aspect, not whether the model can provide the correct response immediately.
Latent Space Discord Summary
- Discussion regarding code interpretation and the unexpected behavior of Python code generating R code, noted by @slono.
- User experiences with the context handling of the chatbot Cody, with @slono expressing disappointment over its performance.
- Technical problems faced by @slono with the chatbot not loading files from the project/ directory as expected, and a humorous solution of filtering out unwanted go.sum and package.lock files through a shell-script context search.
- @ayenem engaging in a discussion around chatbot guardrails, expressing interest in presenting a paper from arxiv.org and requesting additional resources on the topic.
Latent Space Channel Summaries
▷ #ai-general-chat (5 messages):
- Code Interpreter R vs Python: @slono humorously noted that their code interpreter did not directly interpret R code; rather, it created Python code that generates R code.
- Opinion on Cody’s Context: @slono expressed dissatisfaction with the context functionality of the chatbot Cody, suggesting it performed below expectations.
- Issues with project/ Directory File Loading: @slono also reported that the chatbot did not appear to load any files from the project/ directory, despite his clear indications.
- Chatbot Listing of Unwanted Files: @slono further noted that when asked for information about files in the directory, the chatbot listed all the go.sum files, which extended over several continuation steps.
- Filtering Out Unwanted Files: To tackle the listing of unnecessary files like go.sum and package.lock, @slono jestingly mentioned a shell-script context search they created that filters out these files.
▷ #llm-paper-club (1 message):
- Guardrails Discussion and Resources: @ayenem is currently reading a paper about guardrails from arxiv.org and offered to present it if there is interest in the topic. They are also seeking additional resources regarding chatbot guardrails.
LLM Perf Enthusiasts AI Discord Summary
- Discussion regarding the use of Azure, with @rabiat expressing interest and inquiring about Azure’s rate limits, and @robotums sharing from personal experience that streaming speed is slow after the first tokens.
- Dialogue on potential community expansion, with @jeffreyw128 asking whether to grow the community or maintain its current size, noting the lengthy list of potential members in waiting.
- Proposal by @jeffreyw128 to incorporate a GPT-4-powered mechanism for daily summarization of community discussions, aiming to streamline keeping members informed about active conversations.
LLM Perf Enthusiasts AI Channel Summaries
▷ #gpt4 (2 messages):
- Azure Usage and Rate Limits: @rabiat expressed an intention to use Azure and inquired about its rate limits.
- Performance of Azure: @robotums responded that the rate limits on Azure are the same for them, and shared their experience of streaming speed being slow after the first tokens.
▷ #feedback-meta (3 messages):
- Discussion on Community Expansion: @jeffreyw128 brought up the question of whether the community should invite more members or keep its current size, noting a large backlog of prospective members awaiting admission.
- Proposal for Using GPT-4 for Daily Summarizations: @jeffreyw128 also floated the idea of employing GPT-4-powered mechanisms to generate daily summaries of the community’s activities, helping members keep up with ongoing discussions and gain insights in a more streamlined manner.
Skunkworks AI Discord Summary
Only 1 channel had activity, so no need to summarize…
- Evaluation of Chinese Chatbots: @strangeloopcanon asked for insights about benchmark evaluations of Chinese language models, specifically ones going beyond just running against MMLU.
- Qwen 72B Performance: @sentdex described their experience with Qwen 72B, noting its effectiveness on informational questions and general Q&A. While it shows initial weaknesses in instruction following, @sentdex suspects a properly constructed prompt could improve its performance. Interestingly, @sentdex stated a preference for Qwen 72B over Mixtral.
The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
The Ontocord (MDEL discord) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
The AI Engineer Foundation Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
The Perplexity AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
The YAIG (a16z Infra) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.