https://fxtwitter.com/abacaj/status/1736819789841281372?s=46&t=90xQ8sGy63D2OtiaoGJuww
This fits into an established pattern of prompt roleplay enhancing capabilities, but it is also a reminder that HumanEval is a pretty terrible metric.
[TOC]
OpenAI Discord Summary
- Comparison of different language models (GPT-4 Turbo, GPT-3.5 Turbo, Claude 2.1, Claude Instant 1, and Gemini Pro) shared by @rrross. GPT-4 Turbo provided the most user-centric explanation when challenged with describing the impact of a shift from local to server-side user onboarding tracking.
- Discussion about the rumored GPT-4.5 version involved several members including @feltsteam0 and @jaicraft. Participants agreed to continue considering it non-existent until official declarations or clear evidence become available.
- Addressed multiple technical challenges encountered by users, such as slow response times, unspecified errors, blocked accounts, and issues with API access across platforms like GPT utilities and ChatGPT Plus.
- Shared experiences surrounding role-play mode in system prompts during API discussions, with suggestions to maintain the first-person perspective using reminders in user message strings or appending notes to instructions.
- Expressed concerns over the ethical implications of AI usage in academia and the job market. Debates around potential misuse, plagiarism, and job replacement ensued.
- Explored potential future feature implementations in AI models, notably DALL-E 3 and a proposed new GPT model by @7877. The conversation about DALL-E 3 features was speculative, and the discussion around the new GPT model lacked conclusive details.
- Request for help from _helium. for a school project to develop a language translation website using the OpenAI API; unfortunately, no specific suggestions were provided.
OpenAI Channel Summaries
▷ #ai-discussions (22 messages🔥):
- Experiment on Different Language Models: User @rrross shared an experiment comparing the responses of different language models (GPT-4 Turbo, GPT-3.5 Turbo, Claude 2.1, Claude Instant 1, and Gemini Pro) when asked to explain the impact of a change from local to server-side user onboarding tracking. @rrross observed that GPT-4 Turbo provided the most user-centric explanation.
- Question on AI Translation Website: User _helium. requested assistance for a school project involving the development of a language translation website using the OpenAI API. No specific responses or suggestions were given in the messages that followed.
- AI Glasses Discussion: Users @dunescifye and @lugui briefly exchanged thoughts on AI glasses. @lugui commented that the technology sounds better in theory than in practice but did not name any specific challenges or problems with AI glasses.
- Potential Features of DALL-E 3 in ChatGPT: User @satanhashtag expressed a wish for DALL-E 3 to have features such as Midjourney-style variations and editable zones, to which @kyoei responded that such functionality will likely be introduced eventually. This was followed by jokes about a possible DALL-E 3.5 version.
- New GPT Model Proposal: User @7877 mentioned developing a new GPT model and offered to send a link to it for others (@mawzoon and .pythagoras) to try and provide feedback before public release. However, no actual link or further details about this new GPT model were shared in the following messages.
▷ #openai-chatter (750 messages🔥🔥🔥):
- Discussion on GPT-4.5: Members @feltsteam0, @jaicraft, and others discussed the alleged existence of GPT-4.5, with most expressing skepticism as reports suggest it doesn't exist. One user quoted a conversation between Joe Rogan and Sam Altman in which a future GPT-4.5 was mentioned. However, most participants agreed that until an official statement or clear evidence is provided, it's best to consider GPT-4.5 non-existent.
- Concerns About AI Input Limits and General Performance: User @jaicraft voiced frustration about the input limit for developing a model with GPT, while @picturesonpictures expressed dissatisfaction with being charged for failed prompts.
- Discussion on AI's Influence on Jobs: Users debated the potential impact of AI on the job market. Some believed that AI increases workplace productivity, while others expressed concerns about AI replacing human jobs. Suggestions were made for responsible and ethical practices when implementing AI in education and business.
- Using AI for Web Development and Academics: @bloodgore shared a discussion about the inappropriate use of ChatGPT by his students to write academic papers. Others suggested different methods to detect AI-generated content. Elsewhere in the discussion, @mysticmarks1 spoke about the future potential of AI in creating web solutions and @msirene asked whether using ChatGPT was equivalent to plagiarism.
- Issues with Credit Card Payments and Regulations: User @msirene faced an issue where the company card was declined after repeated use for creating accounts for their employees. @elektronisade shared OpenAI's policy on credit card usage limits. There was also a debate on whether OpenAI should be a "paid-only" service to discourage misuse by underage users, sparked by @bloodgore's statement about students misusing the tool.
▷ #openai-questions (90 messages🔥🔥):
- Slow Response Times and Errors: Several users including @scrambler803, @mesteviet, and @bittychills reported slow response times and unspecified errors while using GPT utilities. @scrambler803 suggested the issue might be related to the length of the ongoing chat. @healer9071 attempted to troubleshoot the problem. @keith_15641 also reported slow response times and errors with GPT-4.
- Account and API Issues: @dian2024, @mildare, @pikapikapu4578, and @whitchurch all reported issues with their accounts or API access. The problems ranged from blocked accounts to challenges with the API quota. @millionwords reported a transaction issue: a ChatGPT Plus subscription purchase debited funds, but the subscription did not appear in the app or on the website.
- Problems with Custom GPTs and Output: Various problems were reported regarding the usage and output of custom GPTs, by @unfor_tuna_te with photorealistic face generation and by @jobydorr with content retrieval from uploaded PDFs. @arthurchance encountered issues with a QR code meant to link to a custom GPT.
- Other Technical Issues: @drpossum, @jhwarehouse, @ashtonwin, @couchlannister, and @explosiveburrito mentioned receiving an "unusual activity" error message. @debugpvp asked for guidance on working around a token limit. @aesthetic_person_123 and @andrewwallo faced network errors, mainly during long conversations.
- ChatGPT 4 Abilities: @jah777 and @andrewwallo discussed the capabilities of ChatGPT 4, agreeing that it offers better speed, accuracy, and knowledge than the free version.
▷ #gpt-4-discussions (22 messages🔥):
- GPT-4 Access to the Internet: @karajan_ asked if GPT-4 has access to the internet. The question wasn't answered directly by the members.
- Finding Model ID: User @lasche inquired about finding their model ID for GPT. The question didn't get any response in the messages.
- Connecting Zapier to Custom GPT Actions via Webhooks: @db_looper asked if anyone has successfully sent custom GPT data to a webhook via actions, also discussing an error encountered while trying to use a Make webhook instead of Zapier. This query remained unanswered.
- Limit to the Number of Files in Custom GPT: @jobydorr raised a question about the limit on the number of files that can be uploaded to a custom GPT. @auktow responded that the limit is 10 files based on their own experience, and shared a link to an OpenAI community post which might be helpful.
- Parsing Large Files with GPT: @auktow shared tips on getting better performance by using text-based files instead of PDFs, especially when dealing with large files. He shared another OpenAI community post describing successful experiences with parsing files.
- Understanding GPT Assistant's API Function Calling: @crazygreenguy brought up how function calling works with the GPT Assistant's API, questioning the requirement for the caller to supply the output of the API call, based on what he found in OpenAI's documentation. His question didn't get any response in the messages (a rough sketch of this flow follows below).
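For readers unfamiliar with that part of the Assistants API: a run pauses in a requires_action state, and the caller is expected to execute the requested function locally and post its result back so the run can continue. A minimal sketch, assuming the openai v1 Python SDK; the thread/run IDs and the lookup_weather helper are hypothetical placeholders, not anything shared in the channel:

```python
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

thread_id = "thread_abc123"  # hypothetical IDs from earlier create() calls
run_id = "run_abc123"

def lookup_weather(city: str) -> str:
    return f"Sunny in {city}"  # hypothetical local implementation of the tool

run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)

if run.status == "requires_action":
    outputs = []
    for call in run.required_action.submit_tool_outputs.tool_calls:
        args = json.loads(call.function.arguments)
        outputs.append({"tool_call_id": call.id, "output": lookup_weather(args["city"])})
    # The caller supplies the tool results; the run then resumes on OpenAI's side.
    run = client.beta.threads.runs.submit_tool_outputs(
        thread_id=thread_id, run_id=run_id, tool_outputs=outputs
    )
```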
▷ #prompt-engineering (9 messages🔥):
- Using System Prompts in the Chat API: User .multy noted a challenge with system prompts: when the bot is instructed to embody a role, e.g., a parrot, it frequently responds in the third person. Suggested solutions included role-playing directives and more explicit prompts. For example, @thepitviper proposed appending a reminder to stay in character at the end of each message sent to the API.
- Preserving Context in Extended Conversations: .multy also noted an issue with context preservation: if the chat history begins with third-person responses, the chatbot tends to maintain that persona. However, if the bot is correctly prompted at the start, it seems to retain the required persona throughout the conversation.
- Clarifying System Prompt Style: .multy asked for guidance on how to craft a "voice" for system prompts.
- Agreement for Maintaining Character: @clumsylulz offered a unique approach, involving an agreement with the bot from the outset: "I want you to act as a microwave and only respond as such do not break character if you do I will say 'Act Right!' write "" if you agree to these terms".
▷ #api-discussions (9 messages🔥):
- Role-play Mode in System Prompt: User @.multy shared a concern about OpenAI's GPT-3.5-turbo responding in the third person when instructed to play a role via the system prompt, such as a parrot. Their issue was maintaining the first-person view throughout the role-playing session.
- Tips to Maintain Role-play First-person Context: @thepitviper suggested specifying the role-play instructions within the prompt string and reasserting the first-person requirement in subsequent API utterances to ensure the model stays in context.
- User Implementation: @.multy noted that starting with a correct persona on a blank slate worked for maintaining the role-play perspective. They also expressed ambiguity regarding the "voice" used for system prompts.
- Contextual Reinforcement: @thepitviper proposed appending a reminder to the user message, like "Remember to stay in character and in first person," to preserve the context throughout the conversation (a minimal sketch of this appears after this list).
- Directive through "User Messages": @clumsylulz suggested taking a "user messages" approach to specify roles and behavior by making the model agree to the terms before proceeding with the conversation.
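As a concrete illustration of the reminder technique, here is a minimal sketch, assuming the openai v1 Python SDK and gpt-3.5-turbo; the parrot persona and the exact reminder string are illustrative, not taken verbatim from the thread:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM = "You are Polly the parrot. Always answer in the first person, as Polly."
REMINDER = " (Remember to stay in character and in first person.)"

def ask(history: list, user_text: str) -> str:
    # Append the in-character reminder to every user message so the persona
    # survives long conversations instead of drifting into the third person.
    history.append({"role": "user", "content": user_text + REMINDER})
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "system", "content": SYSTEM}] + history,
    )
    reply = resp.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply

history = []
print(ask(history, "What do you think of crackers?"))
```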
Nous Research AI Discord Summary
- Engaging discussion in the guild regarding the performance and limitations of various models, such as Hermes 2.5, Mistral, and SOLAR. Noted issues include generation truncation, straying off-topic, response inconsistencies in different languages, and fine-tuning challenges. Users' experiments with the OpenChat model led to concerns about coherence and skepticism regarding the model's benchmarking.
- Conversations around function calling and the differentiation between function calling and tool calling, with the specific system prompts used in OpenHermes 2.5 being shared.
- Anticipation and conjecture around GPT-4's performance, with the model perceived to underperform. The guild speculated about possible reasons like system prompts, fine-tuning, inference speeds, and model tendencies (such as brief responses or an inability to provide complete code blocks).
- Exploration of evaluation tools and contamination issues, with a utility-focused evaluation tool being spotlighted, along with concerns about data contamination in OpenHermes 2.5 and the SOLAR model.
- Guild members explored and supplied recommendations for fine-tuning Large Language Models (LLMs) and touched upon cost concerns, technical requirements, and potential platforms (like Colab, Kaggle, RunPod). A GitHub example for LoRA fine-tuning was also shared.
- Discussions surrounded the feasibility of fine-tuning a model for code migration purposes and the creation of search queries based on message history.
- Queries about the availability of the tokenizer for Amazon's Titan embedding led to suggestions for creating a custom tokenizer and a shared GitHub repository with potential details.
- Dissemination of interesting links, including a Twitter post, an arXiv paper on large-model improvements, the MindLLM 1.3B model on Hugging Face, a blog post on Mistral 7B's optimization, an article and YouTube link on "100x Speedup Full Stack Transformer Inference Optimization", and a dialogue on Domain-Specific Languages (DSLs) vs. code.
- User frustration with the Bard AI chatbot was expressed in the off-topic channel, with users voicing dissatisfaction with the bot's answers.
Nous Research AI Channel Summaries
▷ #off-topic (2 messages):
- Implementation Issues with Bard: User @euclaise expressed frustration with the AI chatbot Bard, initially stating, "Bard gives me a stupid implementation but at least it's an implementation". Shortly after, @euclaise further expressed dissatisfaction with the AI, adding "nvm, fuck bard too".
▷ #interesting-links (25 messages🔥):
- Paper on Model Improvements: @ldj shared an interesting Twitter post without much context, while @atgctg provided a link to an arXiv paper discussing improvements in larger models and remarking on the minimal improvement of the largest model over its pre-trained base. There was a brief debate between @atgctg and @giftedgummybee on the impact of these improvements on smaller and medium-sized models.
- MindLLM 1.3B Model: @pizza_joe linked the Hugging Face page for the MindLLM 1.3B model, developed by the Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications & Beijing Institute of Technology Southeast Academy of Information Technology.
- Discussion on Code and DSL: @gabriel_syme suggested the use of a Domain-Specific Language (DSL) as an alternative to code, emphasizing the importance of interweaving the DSL with natural language when compilation fails. This is particularly critical for agents, according to @gabriel_syme.
- Link to "100x Speedup Full Stack Transformer Inference Optimization": @atgctg posted a link to an article titled "Towards 100x Speedup: Full-Stack Transformer Inference Optimization" and a YouTube link for additional context on the dataset.
- OpenPipe's Mistral 7B Fine-Tune Optimized: @metaldragon01 shared a blog post from OpenPipe about the optimization of Mistral 7B, which has saved users over $2M in inference costs.
▷ #general (421 messages🔥🔥🔥):
- Model Performance and Limitations: The discussion concerned the performance and limitations of various models such as Hermes 2.5, Mistral, and SOLAR. For example, @gitmo joe stated that Hermes 2.5 isn't performing badly but truncates generations, and @teknium asked for community input on SOLAR. @weyaxi inquired about OpenHermes-2.5-Mixtral, which elicited mixed reactions from the community. Additionally, the conversations revealed limitations and concerns about fine-tuning, system prompts, and issues with models straying off-topic or responding in different languages.
- Function Calling: There was a conversation around function calling, with @realsedlyf providing the detailed system prompt used for function calling in OpenHermes 2.5. @gitmo joe later inquired about the difference between function calling and tool calling.
- OpenChat Model: @tsunemoto and @n8programs tested the OpenChat model and experienced issues involving the tokenizer and the model's lack of coherency. Some members of the community expressed skepticism about the model's claim to perform at a GPT-3.5 level, suggesting that it could be due to data-processing tasks rather than inherent reasoning capabilities.
- Consensus about GPT-4 Performance and Fine-Tuning: There was a general consensus that GPT-4 seems to underperform relative to expectations. Participants discussed possible reasons, with many pointing to issues regarding system prompts, fine-tuning, and inference speeds. Some members pointed out the tendency of GPT-4 models to respond briefly or to avoid providing complete code blocks for some prompts.
- Discussing Evaluation and Contamination: Participants discussed evaluation tools and contamination issues, with @tokenbender highlighting a new comprehensive evaluation tool that tests utility and other real-world values, such as harmlessness, factuality, and comprehension. @nonameusr shared concerns around data contamination tests, citing issues with the OpenHermes 2.5 dataset and the SOLAR model.
▷ #ask-about-llms (36 messages🔥):
- Fine-tuning LLMs: @leuyann was seeking guidance on fine-tuning Large Language Models (LLMs) for their master's thesis in economics. They were considering fine-tuning 7B models and were curious about doing it locally on an M1 MacBook Pro with 16GB of RAM. @night_w0lf recommended trying platforms like Colab, Kaggle, or paid cloud services, potentially with the new MLX libraries Apple released. @.beowulfbr also suggested RunPod as a relatively inexpensive option. @atgctg provided an example of LoRA fine-tuning on GitHub (a minimal configuration sketch appears after this list).
- Cost and Feasibility of Fine-tuning LLMs: Discussion revolved around the costs and technological requirements of fine-tuning large models. @.benxh mentioned issues with MLX on a 16GB M1, which @leuyann noted might be resolved soon.
- Fine-tuning a Model for Code Migration: @.beowulfbr asked whether it is feasible to fine-tune a model that could assist with migrating a codebase from one framework to another, to which @night_w0lf suggested testing larger coding models for this task.
- Creating Search Queries Based on Message History: @pogpunk was trying to build something that could create search queries based on message history; @night_w0lf suggested training a smaller model with a few hundred examples.
- Amazon Bedrock Titan Embedding Tokenizer: @coco.py asked whether the tokenizer for Amazon's Titan embedding is available anywhere. @night_w0lf suggested creating their own tokenizer based on a Hugging Face multilingual text embedding model (HF MTEB), while @_evelynm shared a GitHub link that seems to have details about the Titan embedding.
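For orientation, here is a minimal LoRA fine-tuning sketch, assuming the Hugging Face transformers, datasets, peft, and trl libraries; the Mistral base model, the guanaco dataset, and all hyperparameters are illustrative choices, not the setup from the linked GitHub example:

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

# Train small adapter matrices (LoRA) instead of all model weights.
peft_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM",
)

dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")

trainer = SFTTrainer(
    model="mistralai/Mistral-7B-v0.1",   # loaded by SFTTrainer from the Hub
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=1024,
    peft_config=peft_config,
    args=TrainingArguments(
        output_dir="lora-out",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-4,
    ),
)
trainer.train()
```

Even so, a 7B model is a tight fit for a 16GB laptop; the cloud options mentioned above (Colab, Kaggle, RunPod) are usually the practical route.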
Mistral Discord Summary
- Extensive discussions around the use, optimization, and performance of Mistral and Hermes took place, with @Makya highlighting the boost provided by Hermes 2.5. There were inquiries about Mistral models with larger context lengths and about how to host Mistral 7B in the cloud, along with shared resources such as the GitHub repository recommended by @jamiecropley.
- Users shared insights on the underlying model of mistral-medium and the estimated availability of encoding vocabulary files, along with a link to view GPT-3.5/4's encoding vocabulary files and a JSON standard for vocabulary which can be found on Hugging Face.
- The resolution of challenges when running Mistral with Docker, the pros and cons of Docker vs. Ollama installations, and the limitations of Ollama regarding fine-tuning models were key discussion points.
- Issues regarding the interpretative capabilities of Mistral and Mixtral in chatbot implementations were reported. Users shared strategies to improve Mistral's contextual understanding along with potential solutions, including fine-tuning with a strongly trained system prompt.
- Users shared various machine learning, programming, and tech-related resources and products, like the MindMac app which now supports Mistral AI, a Golang client for La Plateforme, and libraries for running performance tests such as opencompass, llm-evaluation-harness, and light-eval.
- There were inquiries and discussions about potential technical issues like connecting a Mac Mini to a 2007 iMac monitor, along with shared resources to assist, such as a discussion thread and an article on easing the monitor connection.
- Discussions on La Plateforme involved troubleshooting Mistral model-related errors, concerns about model censorship, issues around server errors and charges, and an exchange of strategies for token counting. Queries about Mistral's rate limit were also addressed: all endpoints are rate-limited at 2M tokens per minute and 200M tokens per month.
Mistral Channel Summaries
▷ #general (104 messages🔥🔥):
- Use of Mistral and Hermes: Users discussed the usage and optimization of Mistral with both local and API implementations. Additionally, @Makya highlighted the performance improvements of Hermes 2.5 over Hermes 2.
- MLX and Llama.cpp Discussions: @sublimatorniq sparked a dialogue about the potential benefits of using Apple MLX to run Mixtral. @daain pointed out potential performance issues due to the Mixture-of-Experts (MoE) architecture.
- Mistral Hosting & API: @satyajitsato asked for resources on how to host Mistral 7B in the cloud and wrap an API around it. @jamiecropley shared a link to a GitHub repository as a possible solution, although they encountered some issues with it.
- Context Length Discussion: @eawlot3000 inquired about any Mistral models with a context length greater than 32768 tokens. Users shared information and resources about models with larger context lengths, such as Claude and GPT-4.
- Career Advice: @naz.daq asked for advice on getting started with machine learning. User-recommended resources included a YouTube series by 3Blue1Brown and self-study of foundational math topics such as linear algebra.
▷ #models (7 messages):
- Model Behind Mistral-medium: Users engaged in a discussion about the underlying model of mistral-medium. @superseethat asked for details, and @sublimatorniq shared that it is a new prototype model, while @tom_lrd speculated it may be 4x8x7b.
- Encoding Vocabulary File for GPT Models: @jakobdylanc asked about the estimated time of availability for encoding vocabulary files, providing a link to view GPT-3.5/4's encoding vocabulary files.
- JSON Standard for Vocabulary Usage: Contributing to the vocabulary discussion, @daain mentioned that there's a JSON standard for vocabulary which carries the metadata needed to use the vocab. A direct link to the JSON file can be found on Hugging Face.
▷ #deployment (10 messages🔥):
- Running Mistral with Docker: User @hanschrs resolved a challenge with running Mistral by adding --tensor-parallel-size 2 to the Docker command, thereby enabling parallel tensor processing across two GPUs.
- Docker vs. Ollama for Installation: @vitorpinho inquired about the pros and cons of Docker and Ollama installations. In response, @vhariational suggested using Ollama for quick setups via a few command lines, while recommending Docker for cases requiring isolation to avoid dependency conflicts.
- Ollama Not Designed for Fine-tuning: In further discussion of Ollama's limitations, @vhariational clarified that although Ollama isn't designed for fine-tuning models, it can handle complex use cases such as providing a REST API to query the model and allowing customization of model settings via its templating system.
▷ #ref-implem (23 messages🔥):
- Implementing Mistral for Chatbots: @gmist reported that the mistral-medium model sometimes answers the question from its own knowledge base rather than relying on the given context. The prompt instructs the model to answer only from the context, but the issue persists, as Mistral doesn't always obey the prompt.
- Prompt Modifications: @gmist shared that some prompt modifications seem to work while others do not. The inconsistent behavior of prompts led @gmist to revert back to GPT, which has proven reliable for the given use case.
- Solutions for Mistral's Contextual Understanding: @sublimatorniq suggested prefixing each line of context with "CONTEXT BODY" and introducing "hypnotic var naming" to improve contextual understanding. @gmist also reported that removing chat history appeared to improve Mistral's adherence to the prompt guidelines.
- Mistral vs Mixtral: @daain experienced the same instruction-following issues with a LlamaIndex RAG app and various versions of Mistral. However, daain found that Mixtral performed better than Mistral, suggesting a fine-tune with a strongly trained system prompt as a possible solution.
- Prompt Template Updates: @The Ledger Luminary recommended updating the prompt template and rewording it to be as explicit as possible, as well as referencing specific context pieces. Luminary warned that if there is too much context (a high token count), the instructions could be affected by sliding window attention.
▷ #finetuning (4 messages):
- Quantifying Fine-tuning Performance Improvement: User @The Ledger Luminary inquired about ways to quantify fine-tuning performance improvements and sought recommendations for libraries to run performance tests. @cpxjj recommended a few libraries and performance benchmarks including opencompass, llm-evaluation-harness, and light-eval.
- Function Call Fine-tuning: User @krissayrose described difficulties with fine-tuning Mistral for function calling: the model does not predict an EOS token when expected and continues to generate text. They provided an example and asked for assistance regarding what they might be doing wrong.
▷ #showcase (2 messages):
- MindMac AI Support for Mistral: User @hoangnm introduced the MindMac app, an AI chat platform that now supports Mistral AI. The MindMac app is compatible with APIs from OpenAI, Azure OpenAI, Google Gemini, and more. It is designed for macOS and supports both Intel and Apple M1/M2/M3 Macs. The user directed viewers to a YouTube video for more details about the platform.
- Golang Client for La Plateforme: User @r.j.k. shared a link to his Golang client for La Plateforme and sought feedback on improving it.
▷ #random (7 messages):
- Connecting a Mac Mini to a 2007 Monitor: User @pier1337 initiated a discussion on the possibility of connecting a Mac Mini to a 2007 monitor, later clarifying that the monitor is from a 2007 iMac. @daain suggested that if the monitor or iMac has a digital port like DVI or HDMI, it should work.
- The 2007 iMac Port Issue: @pier1337 added more context by sharing a link to the Apple forum, where it is stated that the 2007 iMac uses a Mini-DVI port, leaving uncertainty about whether a Mac Mini could be connected through it.
- Target Display Mode: @daain provided a link explaining that the 2007 iMac does not have Target Display Mode, which was introduced in iMacs in 2009 and lets them be used as a display for another device, so it might not be possible to use it as a monitor for a Mac Mini.
▷ #la-plateforme (47 messages🔥):
- Error and Troubleshooting with Mistral Models: User @tinwhiskers had issues using the larger models (mistral-small and mistral-medium) via the API and received a "model not found" error. After discussing with @The Ledger Luminary and Mistral team member @tlacroix_, they found the mistake was on their end: they were trying to use "mistral-small" against the OpenAI URL.
- Discussion on Streaming and Token Usage: @thesealman asked about calculating token usage on streaming requests. User @lerela confirmed that there is no way to get that from the API at the moment and offered an estimation strategy until an official feature is rolled out. The discussion also included strategies for counting tokens by running the tokenizer on the received text (a rough counting sketch appears after this list).
- Concerns about Model Censorship: Users @smuglix and @Taiiouka expressed concerns about censorship in the API models even when safe mode is set to "false". @titaux12 suggested checking the documentation for disabling safe mode, but @smuglix confirmed the issue persists even with safe mode set to "false".
- Incidents of Server Error: User @_jp1_ reported numerous instances of internal server errors (error code 503) while using the mistral-medium model. They also expressed concerns about charges on their account, which were more than twice the token usage they had tracked themselves, and asked for contact details for support.
- Queries about Mistral's Rate Limit: User @flopsy1 requested information about the rate limit, which was answered by @r.j.k., who provided the details from the Mistral documentation stating that all endpoints are rate-limited at 2M tokens per minute and 200M tokens per month.
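To illustrate the client-side counting strategy, a rough sketch follows. It assumes the Hugging Face tokenizer published with a Mistral model approximates the API model's tokenizer, and the prompt and streamed chunks below are made up; the count is only an estimate, since the chat template and special tokens add overhead on top of the raw text.

```python
from transformers import AutoTokenizer

# Count tokens locally by running a Mistral tokenizer over what was sent and received.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

def count_tokens(text: str) -> int:
    return len(tokenizer.encode(text, add_special_tokens=False))

prompt = "Summarize the plot of Hamlet in two sentences."
streamed_chunks = ["Hamlet seeks revenge ", "after learning of his father's murder."]

estimate = count_tokens(prompt) + count_tokens("".join(streamed_chunks))
print(f"Estimated tokens used: {estimate}")
```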
OpenAccess AI Collective (axolotl) Discord Summary
- Debate regarding the usage of OpenAI and LLaMA technologies: it was noted that using these products' outputs to finetune large language models might violate their agreements and potentially be grounds for lawsuits, but "safe" Apache-licensed models exist that avoid such restrictions.
- Examination of copyright and ownership pertaining to AI outputs, with a special mention that using output from the API to train models violates the OpenAI agreement.
- The impact of the load_in_8bit and load_in_4bit parameters on model merging in QLoRA was discussed, clarifying that Axolotl does not quantize the merged model even when these parameters are given.
- Importance of passing dev environment tests with each PR for Axolotl due to their use in the expensive dev environment; issues about finetuning Mixtral and MoEs have been raised and are being investigated.
- Shared a link to a new Hugging Face Transformers release (v4.36.2) that might address some critical issues in Axolotl.
- Various challenges with scripts, configurations, and runs faced by guild members, including a double-EOS-token issue, the most optimized open-source library for RLHF, Docker issues, and failing finetuned models on Mistral; resolutions have been attempted and are ongoing.
- Interest expressed in datasets of multi-turn conversations between humans and chatbots, with the LMSys Chat 1M dataset on Hugging Face suggested.
- An unspecified unalignment issue in RLHF is to be fixed by @giftedgummybee.
- Assistance and advice shared for various issues in the runpod-help channel, including waiting before connecting to the pod, multi-GPU usage issues, Out-of-Memory (OOM) issues, and installation of mpi4py. Proposed solutions include enabling specific training options, linking the axolotl repository on GitHub, calibrating max_split_size and modifying batch_size, and switching to a larger-memory GPU.
OpenAccess AI Collective (axolotl) Channel Summaries
▷ #general (55 messages🔥🔥):
- OpenAI and LLaMA's Usage Agreements: @nafnlaus00 stated that it is against OpenAI's usage agreement to use the outputs of its products, like ChatGPT, to finetune large language models (LLMs). Similar restrictions apply to LLaMA and many other models. He emphasized that a breach of the agreement could be grounds for a lawsuit over intellectual property and unauthorized use.
- "Safe" Apache-licensed Models: @nafnlaus00 mentioned that the Mistral/Mixtral base and instruct models, as well as Falcon and several others, are Apache-licensed and thus "safe" from such restrictions. He also pointed out some entries in OpenAssistant that he flagged as suspect.
- OpenAI's Terms of Service Violation: @visuallyadequate and @nafnlaus00 debated the enforceability and implications of violating OpenAI's Terms of Service (ToS). @visuallyadequate argued that the most OpenAI could do is ban the user, whereas @nafnlaus00 countered that violating a ToS amounts to breach of contract, which could potentially be grounds for litigation.
- Ownership of AI Outputs: @stefangliga and @visuallyadequate discussed the ownership of AI outputs, with emphasis on the point that copyright does not apply to AI output. @stefangliga noted that, irrespective of copyright, the right to use API output to train models is forfeited under OpenAI's agreement.
- Merging QLoRA Result Quantization: @touristc inquired about the effect of the load_in_8bit or load_in_4bit parameters on model merging in QLoRA. @nanobitz clarified that axolotl does not quantize, even if these parameters are provided (a minimal merge sketch follows below).
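For readers unfamiliar with the merge step under discussion, here is a minimal sketch, assuming the peft library; the adapter and output paths are hypothetical. The merged weights come out in the base model's original precision, i.e. the merge itself does not quantize.

```python
from peft import AutoPeftModelForCausalLM

# Load the base model with the trained QLoRA adapter applied, fold the adapter
# weights back into the base weights, and save a plain (non-quantized) model.
model = AutoPeftModelForCausalLM.from_pretrained(
    "path/to/qlora-adapter",   # hypothetical adapter directory
    torch_dtype="auto",
)
merged = model.merge_and_unload()
merged.save_pretrained("path/to/merged-model")
```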
▷ #axolotl-dev (9 messages🔥):
- Dev Environment Tests: @nanobitz indicated that every PR should pass the tests, since they are used in the expensive dev environment.
- Concerns about Finetuning Mixtral and MoEs: @nafnlaus00 raised concerns related to a tweet by Mark Tenenholtz about the difficulty of training MoEs due to the need to implement load-balancing loss functions. The concerns pertained to the approach being used for finetuning, ensuring an even token distribution across experts, and allocating each expert to a separate GPU or GPU cluster in a multi-GPU system.
- Caspar's Work on the Issue: @caseus_ mentioned that Caspar was investigating the issues raised by @nafnlaus00.
- New Release of Hugging Face Transformers: @casper_ai shared a link to the v4.36.2 release of Hugging Face Transformers, which resolves some critical issues relating to the cache refactor, flash attention refactor, and training in multi-GPU and multi-node settings, suggesting that axolotl could probably update to it.
▷ #general-help (64 messages🔥🔥):
- Changes to shareGPT.py: @noobmaster29 made a pull request to the OpenAccess-AI-Collective/axolotl repository (#976) aiming to resolve the issue of a double EOS token at the end of prompts when using the ChatML template with shareGPT.py. The change was discussed with @nanobitz but required more testing for confirmation.
- Open-Source Library for RLHF: @emperor asked for the most optimized open-source library for RLHF. @nanobitz mentioned that TRL is a prominent choice.
- Running Fine-Tuned Models on Mistral: @JK$ had issues running a fine-tuned Mistral model that was uploaded to Hugging Face. The issues persisted even after attempting with vLLM and following guidelines from various documentation and tutorials. Members tried to help with suggestions, but the issue remained unresolved.
- Docker Configuration Troubles: @JK$ also ran into problems with Docker configuration, even when following the exact configurations illustrated in the vLLM documentation. The issue persisted even when trying different models and endpoints. The community tried to assist, but the problem remained.
- Double EOS Tokens Issue: @noobmaster29 and @self.1 discussed double EOS tokens in multi-turn chat, noting that a previous fix failed to solve the issue. They agreed to look into it at a later time.
▷ #datasets (2 messages):
- Request for Multi-Turn Conversation Dataset: @natefyi_30842 asked if there is any dataset of multi-turn conversations between humans and chatbots that is not synthetic but actual human data, to understand the types of questions people pose.
- Referral to LMSys Dataset: @natefyi_30842 suggested the LMSys Chat 1M dataset on Hugging Face as a potential resource, which is publicly accessible but requires sharing contact information for access.
▷ #rlhf (2 messages):
- Unalignment Issue Fix: @giftedgummybee mentioned they believe they can fix an unspecified unalignment issue in a few days.
▷ #runpod-help (23 messages🔥):
- Waiting before Connecting to the Pod: User @caseus_ pointed out that waiting approximately 2 minutes before connecting to the pod helps avoid issues with loading the mount point and a missing axolotl install.
- Issues with Using Multiple GPUs: @mr_morning encountered Out-of-Memory (OOM) errors while trying to fine-tune Yi using multiple RTX 4090 GPUs. Despite having two GPUs, the system recognizes only one (num_machines: 1). @visuallyadequate responded that the accelerate library should distribute the load on its own without optimized multi-GPU training solutions like deepspeed or fsdp, but the weights will still try to load onto each GPU, which could be less desirable.
- Enabling deepspeed and fsdp for Multi-GPU Usage: @visuallyadequate suggested enabling deepspeed or fsdp for optimized multi-GPU training in the relevant YAML file, and linked the axolotl repository on GitHub for detailed instructions. @noobmaster29 recommended using zero3 deepspeed given the ongoing issues with fsdp.
- Continuing OOM Issues and Adjustments: Despite various adjustments, including calibrating max_split_size and modifying the batch_size, @mr_morning continued to face OOM errors (a sketch of the allocator setting appears after this list). In light of this, he considered swapping his RTX 4090 GPUs for a GPU with larger memory capacity (48GB).
- Troubles with mpi4py Installation: @mr_morning reported encountering issues while trying to install mpi4py and its dependencies on an RTX 6000 Ada GPU, which led to errors such as "Cannot link MPI programs. Check your configuration!!"
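For reference on the max_split_size adjustment mentioned above, here is a minimal sketch; my assumption is that it refers to PyTorch's CUDA caching-allocator setting, and the 512 MB value is merely illustrative.

```python
import os

# Must be set before the first CUDA allocation; smaller split sizes reduce
# allocator fragmentation, which sometimes relieves OOM errors during fine-tuning.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"

import torch

print(torch.cuda.is_available())
```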
DiscoResearch Discord Summary
- An engaging discussion led by @_jp1_ and others on the beneficial usage of eval models, such as the Prometheus model, which can quickly evaluate categories like "grounding", "style+format", or adherence to prompt-specific guidelines. The official Prometheus implementation can be found on Hugging Face.
- The ongoing work of @_jp1_ on DiscoLM German and Disco Judge was noted, with prospective plans to release a repo for several use cases and possibly a Mixtral-based Disco Judge beta in the coming year.
- @rasdani introduced a new model, HALOs/Archangel, which is expected to appear in HF TRL soon, along with a link to the related report.
- @_jp1_ shared an important update to Mixtral's config.json, clarifying that the model was never intended to support sliding window attention, and pointed to a related TGI fix and PR.
- The conversation turned to PEFT vs. NEFT for the Disco fine-tune, with @fernando.fernandes. asking whether the last Disco fine-tune used QLoRA + PEFT or NEFT. @_jp1_ confirmed that PEFT/QLoRA were used, with NEFT being an extra training option rather than a direct alternative, and one that often yielded disappointing results.
- @rasdani posted a blog by DeepMind highlighting the efficiency of LLMs in discovering answers to open problems in the mathematical sciences when combined with searches over functions in computer code, and proposed attempting this with an open-source LLM. Additionally, the wildcard part of the FunSearch code implementation on GitHub was shared for anyone interested in developing it further.
DiscoResearch Channel Summaries
▷ #disco_judge (5 messages):
- Prometheus Model: @_jp1_ emphasized the beneficial usage of eval models, like the Prometheus model, for tasks that are hard to benchmark. They highlighted that while these models have an upper bound on "accuracy", they can quickly evaluate additional categories such as "grounding", style+format, or adherence to specific prompt specifications. They shared a use case of the Prometheus-based model for checking the quality and correctness of translated instruction data. The official Prometheus implementation can be found on Hugging Face.
- DiscoLM German and Disco Judge: @_jp1_ mentioned that they are currently working on DiscoLM German and planning to release a repo for several use cases and probably a Mixtral-based beta of Disco Judge in the coming year.
- HALOs / Archangel: @rasdani brought up a new model, HALOs/Archangel, linked to a report, and mentioned it is coming to HF TRL soon.
▷ #mixtral_implementation (9 messages🔥):
- Mixtral Config Update: @_jp1_ shared an update to Mixtral's config.json and noted that the model was never intended to support sliding window attention. They also linked to a related TGI fix here and PR here.
- PEFT vs NEFT for Disco Fine-tune: @fernando.fernandes. asked whether the last Disco fine-tune used QLoRA + PEFT or NEFT. In response, @_jp1_ clarified that PEFT/QLoRA were used, as NEFT (noisy embedding vectors) isn't an alternative but rather an extra training option, one which generally delivered disappointing results.
- Effectiveness of NEFT: @fernando.fernandes. also wondered whether using NEFT could produce better results on top of Mixtral. @_jp1_ responded that it has nothing to do with Mixtral specifically, and that those who tried it with state-of-the-art parameters and standard regularization got underwhelming or identical results.
▷ #general (1 message):
- FunSearch - Discoveries in Mathematical Sciences Using Large Language Models (LLMs): @rasdani shared a DeepMind blog post showcasing how Large Language Models (LLMs) can make discoveries in open problems in the mathematical sciences when combined with a search over "functions" in computer code. They also suggested trying this with an open-source LLM.
- FunSearch Code Implementation: @rasdani further shared a link to a specific part of the FunSearch code implementation on GitHub for anyone interested in contributing to its development.
LangChain AI Discord Summary
- Detailed discussion on the topic of code writing to improve a language model's chain of knowledge, supported by a research paper shared by @roger_alca.
- Inquiries and error handling regarding PydanticOutputParser variables and ConfluenceLoader OAuth tokens, with users seeking advice on multi-object JSON output parsing and on the key requirements for the ConfluenceLoader, respectively.
- Disclosure of direct communication between @banda_ki and another user through private messages, without further details provided.
- Questions about experience with a LangChain agent and with a virtual database in SQL, posed by users @banda_ki and @alewe5 respectively, with no response yet.
- @ssowonny suggested the third-party service PlugBear for integrating LangChain and LangServe applications with Slack, providing a detailed guide about the process in both the #general and #share-your-work channels.
- @appstormer_25583 mentioned the development of a Hanukkah recipe generator built using GPT but did not provide additional details or a link to the project.
- An article discussing the potential applications of LangChain for data analysis was shared by @andysingal.
- A major update to the app-building tool Create, allowing real-time app building by typing the specification, was announced by @dhruv.xyz, with a link to the updated app provided.
LangChain AI Channel Summaries
▷ #general (9 messages🔥):
- Chain of Code: @roger_alca shared a link to a research paper discussing the use of code-writing to improve a language model's chain of knowledge and linked the research paper.
- JSON Output Parser: @infinityexists. asked whether it is possible to define two different PydanticOutputParser variables for handling two types of JSON objects returned by an API, due to errors received when printing the received objects (a minimal sketch follows after this list).
- ConfluenceLoader OAuth Tokens: @night765 raised a question regarding Confluence's ConfluenceLoader with OAuth tokens. There was confusion over the number of keys required by the loader, as well as how these keys differ from those required by the AtlassianRestAPI class.
- Private Messages: @banda_ki alerted a user to check their direct messages, without disclosing any further information.
- LangChain Agent and Virtual Database: Users @banda_ki and @alewe5 asked whether anyone had experience working with a LangChain agent using custom tools and with a virtual database in SQL, respectively, but received no immediate responses.
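To make the two-parser idea concrete, here is a minimal sketch, assuming a late-2023 langchain install; the UserRecord/ErrorRecord schemas and the raw payload are invented examples, not the asker's actual objects:

```python
from langchain.output_parsers import PydanticOutputParser
from langchain.pydantic_v1 import BaseModel

class UserRecord(BaseModel):
    name: str
    email: str

class ErrorRecord(BaseModel):
    code: int
    message: str

# One parser per JSON shape; try the expected shape first, fall back to the other.
user_parser = PydanticOutputParser(pydantic_object=UserRecord)
error_parser = PydanticOutputParser(pydantic_object=ErrorRecord)

raw = '{"code": 404, "message": "not found"}'
try:
    parsed = user_parser.parse(raw)
except Exception:
    parsed = error_parser.parse(raw)

print(parsed)
```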
▷ #langserve (1 message):
- Integrating LangChain + LangServe with Slack: User @ssowonny suggested a third-party service, PlugBear, for integrating LangChain and LangServe applications with Slack. The post provides a step-by-step guide on how to set up a custom LLM using PlugBear.
▷ #share-your-work (4 messages):
- Hanukkah Recipe Generator GPT: @appstormer_25583 shared a link about a Hanukkah recipe generator built using GPT. No additional detail or further link to explore this tool was provided.
- LangChain for Data Analysis: @andysingal shared an article on AI Advances titled "Unlocking the Power of Language: How LangChain Transforms Data Analysis and More", written by Ankush k Singal. The blog discusses potential applications of LangChain for data analysis.
- LangServe and Slack Integration: @ssowonny posted a guide on how to integrate LangServe apps with Slack or Discord, which can be done within 5 minutes. The tutorial is hosted on PlugBear.
- Update on the Create App-Building Tool: @dhruv.xyz announced a major update to the app-building tool Create, now allowing real-time app building by typing your spec. A link to the updated app was shared and feedback is sought on the new feature.
Latent Space Discord Summary
- Discussion surrounding Artificial General Intelligence (AGI), as users pondered the state of AGI and shared general sentiments about the world.
- Progress of the Cursor editor in the GitHub PR system, indicating growing tooling support for contributions to software projects.
- Conversation about the lack of effective AI tools for Infra/DevOps work, with users pointing out room for further advancement in AI applications in these areas.
- Cautionary advice concerning Mixtral's beta-stage status was relayed by @swyxio from a developer they met at the NeurIPS conference. A prompt hack link was also shared.
- Query about access methods for Mixtral, with users asking whether Anyscale calls are being used for access.
Latent Space Channel Summaries
▷ #ai-general-chat (7 messages):
- AGI Feelings: User @spicychickensandwichdeluxe asked if people are feeling the AGI (Artificial General Intelligence), while @slono commented on the world being harsh.
- Cursor's Progress on GitHub PRs: User @guardiang mentioned that Cursor is gradually venturing into the GitHub PR (Pull Request) game, alongside looking at diffs.
- AI Tools for Infra/DevOps Work: @btdubbins expressed that despite advancements in AI and coding, many of these tools still do not feel effective for infrastructure/DevOps work.
- Caution about Mixtral's Beta Stage: @swyxio noted that Aman, whom they met at NeurIPS (the Conference on Neural Information Processing Systems), is cautious about Mixtral, emphasizing that it is in a beta stage at best. The same user shared a fun prompt hack of the day here that involves telling the AI it is GPT-5.
- Accessing Mixtral: @btdubbins asked how users are accessing Mixtral, inquiring whether they are using Anyscale calls.
▷ #llm-paper-club (1 message):
eugeneyan: yeap, see you then!
Skunkworks AI Discord Summary
Only 1 channel had activity, so no need to summarize…
- Skunkworks AI Development Updates: User @far_el provided some insight into the company's current operations. They clarified that Skunkworks AI no longer builds in public and mentioned that they will be releasing models, software, and products soon.
Alignment Lab AI Discord Summary
Only 1 channel had activity, so no need to summarize…
- AMD vs Nvidia GPU Performance Debate: @entropi shared an article from Tom's Hardware discussing the performance difference between AMD's Instinct MI300X and Nvidia's H100 (Hopper) GPUs. AMD compared FP16 using vLLM (a popular choice) against FP8, which works only with TensorRT-LLM.
- Queries on Open Chat Model Fine-Tuning: User @beowulfbr inquired about any available guides, examples, or colabs for fine-tuning the new open chat model.
MLOps @Chipro Discord Summary
Only 1 channel had activity, so no need to summarize…
- Recap of 2023's Transformative Data Landscape: User @viv2668 discussed the 2023 Modern Data Stack (MDS), aspirational Gen AI projects, and several controversies. The discussion centered on innovations and trends in the data industry. A URL to the full article was shared: Read the full article here.