The Notion page has research directions to explore in more detail, with many good papers.
[TOC]
OpenAI Discord Summary
- Introduction of the new OpenAI Developers handle on Twitter under the username @OpenAIDevs. The online community members were encouraged to follow this account for updates. Twitter link.
- Discussions around the utilization differences between Gemini vs Bard chatbots from OpenAI, with affirmation that both bots can read each otherās instructions and provide newer, unique perspectives on solving coding problems.
- Various issues with OpenAIās services were highlighted, from problems with OpenAIās website, to features like archiving and phishing captcha. Also included were ongoing conversations around GPT customization and dissatisfaction about the business relationship between OpenAI and Microsoft.
- A heated debate existed regarding the performance and function concerns of GPT (OpenAI). Users reported various problems with GPT-4, such as unfriendly responses and inaccessibility, and issues with its capacity. Furthermore, there were speculations on the potential release of an updated version, GPT-4.5.
- Users shared their experiences with custom GPT-4 difficulties, their struggles with PDF file attachments, and potentials of using Dalle 3 in GPT for style imitation. An announcement revealed that a bug with image recognition was resolved.
- People discussed issues around prompt engineering, like struggles with getting the AI to adhere to detailed guidelines for modifying code and difficulties in getting the AI to rewrite original text into a format suitable for reading aloud. Introduction of an enhancement in Convo-lang with JSON mode support was also noted.
- In the API discussions, there were further elaborations on newly introduced JSON Mode support in Convo-lang, necessary rules for producing Venn diagrams, issues with GPT-4 in modifying large scripts without stripping off the comments and introduction of placeholder comments, and challenges in making human-written text sound more natural for TTS in informative ātalking headā videos, especially in the medical field.
OpenAI Channel Summaries
ā· #annnouncements (1 messages):
- Announcement of New OpenAI Developers Twitter Handle:
@abdubs
introduced the new OpenAI Developers handle on Twitter and encouraged users to follow for updates. The handle can be accessed via this link.
ā· #ai-discussions (91 messagesš„š„):
-
Gemini vs Bard: User
@offline
discusses their experience with using the Gemini and Bard chatbots from OpenAI. They mention that the bots can read each otherās instructions, providing unique perspectives on coding issues, and giving newer information than what GPT provides. User@muyfashionista
shares they have not noticed significant differences between these bots. -
Use of AIs for Illegal activities: User
@sylviajade
raises a concern about chatGPT being used to generate malware, wondering if any tracking or monitoring of such uses is in place. Respondent@lugui
clarifies that violating the Terms of Service (ToS), which includes illegal activities, can result in account bans on OpenAI. -
Customizing GPTs: Users
@rudish
,@elektronisade
, and@aznironman
discuss on how to customize GPT for various purposes, including use in Discord bots and for creating an offline chatbot.@webhead
recommends using Llama2 based models for most cases due to its good licensing conditions and availability of various models. -
Concerns with GPTs and Alternatives: Users
@stevenramenby
and@jerrybobs
share their dissatisfaction with GPTs, citing issues with rambling words and plugin failures.@jerrybobs
suggests looking into AI Assistants, which are harder to set up but purportedly more reliable. -
Perplexity chatbot: User
@chief_executive
shares their positive experience with the Perplexity chatbot, commending its real-time browsing capability for information gathering. Comparatively, they find Perplexity superior to both ChatGPTās browsing tool and Bing AI. They also mention that Perplexity provides 600 uses a day under its Plus version for $20 a month.
ā· #openai-chatter (544 messagesš„š„š„):
- Performance issues and concerns with GPT-4: Users
@koll0212
,@stevenramenby
and@lindeberg02
reported experiencing issues with GPT-4, mentioning it as āacting up,ā giving random words and emojis as response, inability to use it, and even cancelling subscription due to these problems.@stevenramenby
shared several instances of the issues in chat transcripts, which were discussed with users@offline
,@rjkmelb
, and@solbus
. The issue seemed to persist across custom agents for@stevenramenby
, prompting further help requests in the specified channels. - ChatGPT Down Experience: Users
@zappymonster
,@7877
,@andrew.lol
,@gardener_windson
,@acaba1806
,@bricky___
,@smokzz
and@thepitviper
reported experiencing a downtime with the ChatGPT platform. They described it as returning 502 errors, timing out, or being unable to open the site. - Usage Cap and User Interface Concerns: Users
@lindeberg02
,@openheroes
,@solbus
and@kyoei
discussed the usage cap on GPT-4 for ChatGPT Plus.@solbus
clarified that the cap is currently 40/3hrs for the base GPT-4 usage and custom GPTs are limited to 25/3hrs.@lindeberg02
complained about restrictions after payment and sought more information about the cap. - Expectation and Speculation on GPT-4.5 Release: Users
@realmunknown
,@DawidM
,@realgavok
,@mrcrack_
, and@lugui
engaged in discussions and speculation about the potential release of an update, specifically GPT-4.5. These were based on rumors and leaks, with@DawidM
noting that a user on Reddit was a source of information about the potential model update. - GPT Performance and Function Concerns: Users
@skrrt8227
,@you.wish
,@afterst0rm
,@kyper
, and@bad3r
voiced criticisms about GPTās perceived limitations, including an inability to effectively format text, a lack of consistency, and the impact of a usage cap.@bad3r
specifically criticized the server capacity of the platform, arguing that ChatGPT Plus users should be allocated more bandwidth.
ā· #openai-questions (251 messagesš„š„):
- Issues with OpenAI Services: Several users reported issues with OpenAIās services, from problems with OpenAIās website
@jcrabtree410
,@mohammad_soqar101
,@l3xp3rt
,@kamikolart
,@dsssssssss
and@cosmosfractal
, to specific features and functions such as archiving@nickiee
and@xhls
and phishing captcha pop up despite being logged in from a familiar device@vassiou
. - Chatbot Discussions: Multiple users sought assistance or had conversations around GPT usage and customization. Users such as
@stevenramenby
,@bionicle1337
, and.cybersenpai
discussed issues and questions around custom GPTs, plugin behavior, and fine-tuning. Got several clarifications from@elektronisade
and@satanhashtag
amongst others. - Disputes Around OpenAI and Microsoft: User
@bionicle1337
expressed dissatisfaction about the business relationship between OpenAI and Microsoft, expressing beliefs of unfair business practices and antitrust violations. Some community members like@elektronisade
and@Rock
provided different perspectives but the discussions remained unresolved. - Product Features and Improvements Discussions:
@stevenramenby
reported anomalies with custom GPTās output,@jtensetti
discussed the name toggle issue with Plus subscription. In line with this,.staycurious
inquired about setting up a feature for automatic motivation emails, and@jah777
asked about integrating a ChatGPT bot on Discord. Relevant suggestions were provided by other members. - Multiple Language Discussions: Various users were engaging in conversations in multiple languages, being reminded by
@satanhashtag
to use English for broader communication.
ā· #gpt-4-discussions (16 messagesš„):
- Chat Limit Issues: User
@zeriouszhit
expressed concerns about facing a chat limit even when encountering network errors or bugs on OpenAIās end, describing a situation where they were unable to process images and were penalized with chat limit reductions. - Image Recognition Bug Fix:
@solbus
mentioned that an image recognition bug has been resolved according to reports on a specific Discord thread. - Custom GPT Disappearances:
@ps5671
and@pietman
talked about issues with custom GPTs disappearing or not being editable, which seem to have resolved eventually. - GPT Ability to Generate Detailed Technical Analysis Charts:
@consciousness3783
inquired about GPT-4ās capabilities in generating technical analysis charts with details like price action indicators through images and text within a single prompt. - Issues with PDF Files:
@jazy_87295
encountered difficulties when trying to attach a PDF file in GPT, wondering if a conversion to the TXT format is necessary. - Style Imitation with Dalle 3:
@xchronos
shared an idea of using Dalle 3 in GPT to create images that imitate the style of other provided images.
ā· #prompt-engineering (22 messagesš„):
- Convo-Lang Update: User
@iyioio
reported an enhancement in Convo-lang, with JSON mode support now available. They provided detailed examples of how to define structs and utilize JSON for various data formats, including a discussion of using it with Vision and function calling. Find more details in the NPM docs for Convo-lang. - Instructions for Chatbots: User
@offline
suggested the strategy of prefixing instructions with numbers for easy referencing, along with emphasizing that the bot should only produce VENN diagrams with 3 circles unless otherwise stated by the user. - Script Modification Guidelines:
@joeyjoejoeshabadoo42
is experiencing difficulties in getting the AI to adhere to their detailed guidelines for modifying code, specifically when the code size reaches or exceeds 110 lines. The rules include preserving all code comments, modifying as minimal code as possible, clear explanation of modifications, honoring the codeās history, and always providing a full modified script without placeholder comments. - Natural Language Processing:
@.ursium
expressed difficulty in getting the AI to rewrite original text into a format suitable for reading aloud, in a specific context of creating medical videos. They highlighted issues where the AI changes nuances in the text, sometimes affecting the meaning of important information. - NLP responses:
@world_designer
suggested informing the AI about the context and nature of the script for a more accurate and nuanced understanding, which@.ursium
agreed to try.
ā· #api-discussions (22 messagesš„):
-
JSON Mode support in Convo-lang:
@iyioio
introduced JSON Mode support in Convo-lang with examples demonstrating application in both simple and complex scenarios, such as listing planets in the solar system and describing people in an image. More details can be found in the NPM documentation for convo-lang. -
Rule for Producing Venn Diagrams:
@offline
stated a clear rule that Venn diagrams produced should only have 3 circles unless the user specifies otherwise. -
Custom Instructions for Code Modifications:
@joeyjoejoeshabadoo42
has been facing issues with large python scripts when using custom instructions in the context of GPT-4 via the web chat UI. The AI fails to adhere to instructions, especially when the script size reaches around 110 lines. It gives placeholder comments and strips code comments. -
Making Text Sound Natural for āTalking Headā Videos:
@.ursium
is trying to make human-written text sound more natural for TTS in informative ātalking headā videos, specifically within the medical field. The issue is that when asked to simplify the text or remove difficult to pronounce words, the text is rewritten and critical words such as āmayā or āsometimesā are replaced or removed, changing the meaning. -
TTS AI Model Vs ChatGPT: In the conversation between
@.ursium
and@world_designer
,@world_designer
initially proposed using a separate TTS AI model for@.ursium
ās problem. However, after@.ursium
clarified that the issue involves adjusting written text for spoken presentation and not generating TTS,@world_designer
suggested explaining the script context to ChatGPT.
OpenAccess AI Collective (axolotl) Discord Summary
- Extensive dialogue on AI performance and training strategies involving models like PHI-2 and A6000s. Critical conversations included potential overalignment of AI models, strategies for securing free computational power, handling CUDA out of memory issues using techniques like reducing micro batch size, and contemplating on the risk of overtraining during fine-tuning processes.
- Deep dive into DPO training was initiated by
@faldore
where he shared a link to a RL/DPO pull request #935 on the OpenAccess AI Collective Axolotl repository. - Significant discussions around template and token configurations, for instance the thread on logic behind LLAMA2 SeparatorStyle which seemed to skip index 0 in messages dropping any instructions and the issue with unexpected space additions linked to the Hugging Face tokenizer opened by
@Aleksa GordiÄ
. - Users also shared their exploration on handling end-of-sentence (EOS) token generation issues, proper load time for Qlora with ChatML template during inference stage, and finding the right configuration format for training models.
- The suggestions and fixes for the aforementioned challenges were often accompanied with Git pull requests, such as PR #952 submitted by
@hamelh
to rectify the LLAMA templating concern. - The guild had detailed conversation on datasets, discussing topics like the token to TB ratio of models, hardware requirements, and duration for model training. There was also a query on function calling datasets with Glaive Function Calling v2 dataset being shared as a potential resource.
OpenAccess AI Collective (axolotl) Channel Summaries
ā· #general (85 messagesš„š„):
- AI Overalignment and Performance:
@noobmaster29
and@faldore
discussed how their AI models might be overaligned.@faldore
shared a humorous system prompt he used for his AI model and the group discussed the potential consequences if an AI were to realize the incentivises were false. - AI Training:
@faldore
discussed his plans to train a phi2 model.@le_mess
asked for advice on getting free compute, to which@faldore
suggested applying for Microsoftās Level 3 startup programme. - AI Model Fine-tuning and OOM issues:
@mihai4256
experienced CUDA out of memory issues while using deepspeed zero3 to fine-tune a Yi model with 8 x A100 (that is 8 * 80 GB),@nruaif
suggested reducing micro batch size and increasing accumulation steps.@dangfutures
also encountered out of memory issues while fine-tuning the full model. - DPO Training and Git Repo:The users discussed DPO training with
@faldore
sharing a link to a RL/DPO pull request #935 on the OpenAccess AI Collective Axolotl repository. - Troubleshooting Issues:
@matts9903
and@dangfutures
reported issues when saving checkpoints and after one epoch of model training. They received some troubleshooting suggestions from@caseus_
and@dangfutures
.
ā· #axolotl-dev (111 messagesš„š„):
-
Fine-tuning and Overfitting: Users
@caseus_
,@faldore
,@noobmaster29
, and@nruaif
were discussing about how much fine-tuning and Quick Language-learning Objective with Reinforcement Actions (Qlora) to apply on A6000s and PHI-2 models, with the topic raising concerns about overtraining. Users suggested both full fine-tuning and Qlora first, highlighting that adjustments can be made if the model begins to overfit. -
Optimization Concerns on Mixtral: User
@casper_ai
shared his unsuccessful experience attempting to work with optimization suggestions from unsloth.ai. He is looking forward to their potential pull request (PR) to check if improvements can indeed be achieved, especially with Mixtral models where memory is crucial. -
Llama2 Templating Issue: User
@Aleksa GordiÄ
brought up an issue regarding the logic behindairoboros
,llama-2
, and others relying onSeparatorStyle.LLAMA2
, as it appears to skip index 0 in the messages thus dropping any instructions. He proposed a solution where index 0 should be yielded together with the system prompt. This discussion further evolved into a discussion on more robust solutions, test cases, and other Lama2 related issues like prompt assembly. -
Fix for Llama Templating Issue: User
@caseus_
highlighted a PR opened by@hamelh
to fix the Llama templating issue. The PR #952 addresses errors in the EOS/BOS application amongst other issues. -
Token Space and Decoding Issues: User
@Aleksa GordiÄ
identified an issue regarding unanticipated space additions linked to the Hugging Face tokenizer. It sparked a discussion about token manipulation complexities, with@hamelh
suggesting using Hugging Face chat templates as the āsource of truthā to resolve the issues.
ā· #general-help (79 messagesš„š„):
-
Discussion on Special Tokens and Configuration: User
@noobmaster29
discussed a challenge they were having with their models not generating an end-of-sentence (EOS) token, causing their model to make continuous outputs up to the token limit. They shared various configuration settings they had tried, including different methods for setting the EOS token, and asked other users for advice. -
Running Qlora with ChatML Template: Regarding the use of Qlora with a ChatML template,
@noobmaster29
expressed confusion about when to load the ChatML template during the inference stage.@caseus_
confirmed that the template should be loaded in that format. -
Configuration Format for Training Models: There were also discussions on what kind of configuration format should be used when training models.
@noobmaster29
and@self.1
were not sure about the correct token configurations for running prompts with the ChatML format. -
Inference Issues: In addition,
@noobmaster29
reported that when they used a particular conversational setting during inference, the model didnāt adhere to the ChatML format.@nanobitz
suggested checking the full output using thepreprocess
CLI with--debug
. -
Links of Interest:
@noobmaster29
shared a link to a Hugging Face model with a configuration that worked well with ChatML, but upon trying to replicate it, they didnāt achieve the same results.
ā· #datasets (33 messagesš„):
- Dataset sizes and training metrics discussions: A detailed discussion happened between users
@natefyi_30842
,@noobmaster29
and@nanobitz
. They discussed token to TB ratio of models, the duration and hardware required for training models, with references to PHI-2 and OPT models. User@natefyi_30842
clarified misunderstanding about the time and GPUs required for training different models (PHI-2 card, OPT-125m card). - Training smaller models: User
@noobmaster29
suggested that models in the hundred million token range might be feasible for individual efforts, linking to the OPT model on huggingface. - Enquiry about Function Calling Dataset: User
@le_mess
asked for recommendations on function calling datasets, and shared the Glaive Function Calling v2 dataset and a leaderboard for function calling.@bjoernp
showed interest in this as well.
Nous Research AI Discord Summary
- In-depth discussions and speculation focusing on AI model training strategies, and handling of datasets within the guild. Specifically, extensive conversation centered around potential contamination of the MetaMath dataset, with proposals for an automatic contamination check, despite potential cost concerns. The focus also extended to strategies such as Neural Architecture Search (NAS) over merge parameters, in place of an excessive concentration on model training for benchmarks
- Chatter around upcoming models, including speculative user anticipation over the capabilities of Llama 3 and Galactica 2.0. A particular focus on coding-specialized models like Dolphin 2.5 Mixtral 8x7b was evident, with acknowledges for its proficiency in coding.
- Discussion on creating a decoupled and continuously improving world model, as voiced by
@maxwellandrews
, and a contemplation of its high-level approach. - A shared interest in high-performance workstation setups for purposes beyond gaming, such as 3D rendering and machine learning. Specific link shares included a high-performance workstation named āBig Bossā and a discussion regarding NixOS and Python packages management with Nix on the NixOS forums.
- Probing dialogues on AI models and tokenization, including Based model architecture, MorphPiece as a potential BPE replacement, and the Phi-2 performance. A clear disappointment with Phi-2 was expressed for not patching holes left by the previous Phi-1/1.5 models. Links shared involved a tweet on Based, a Paper on MorphPiece, a research paper by OpenAI, and a blog post on FunSearch.
- Model performance updates from users, with reports on the average score of the Solar Model from
@gabriel_syme
, who later suggested fine-tuning and slurping the Solar model, and the workings of the RedPajama 3B model albeit at a slower pace from@bevvy
. - Critical comments on the pricing of GPT4-Turbo, with the general consensus categorizing the costs as overly high even for high-value tasks.
- Queries and pointers raised in the Ask-about-llms channel relating to Solar 10ās preset, alternatives to MLC with a suggestion for Android/iOS setup, and a reference to LLM Farm.
- Lastly, a professional tone of discussion was flavored with personal updates from
@gezegen
on recovering from flu and missing some conversations.
Nous Research AI Channel Summaries
ā· #ctx-length-research (1 messages):
- Decoupled and Continuously Improving World Model: User
@maxwellandrews
is contemplating on a high-level approach to create a decoupled and continuously improving world model.
ā· #off-topic (27 messagesš„):
- Graphics Card Discussion:
@everyoneisgross
shared some issues with their graphics card, specifically mentioning black elements and checker renders. They have considered the limitations of their Colorful 3060 card. - High-performance Workstation:
@skadeskoten
shared a link to ExtremeTech about a high-performance workstation from Germany named āBig Bossā, which has 7 RTX 4090 GPUs and a 64-core Threadripper Pro 5995WX. The workstation is designed for 3D rendering and machine learning rather than gaming. - Model Finetuning:
@euclaise
stated their GPT-7b models can be finetuned on a single GeForce 3090 using their custom optimizer. They speculated that the Mixtral model would potentially fit on the āBig Bossā workstation setup too. - NixOS and Nix Package Management:
@euclaise
asked if anyone in the channel uses NixOS and manages Python packages with Nix. They also shared a link to a discussion on the NixOS forums. - Training Models for Specific Language:
@skadeskoten
expressed a desire to train language models for Norwegian, specifically medical Norwegian.@euclaise
responded that it would take a long time to complete such a task.
ā· #benchmarks-log (2 messages):
- Solar Model Performance:
@gabriel_syme
reported that the Solar model holds an average of 74, though no specification was provided about what this metric measures. - Finetuning and Slurping Solar Model: In a subsequent message,
@gabriel_syme
suggested the need to finetune and slerp the Solar model.
ā· #interesting-links (36 messagesš„):
- Discussion on Phi-2 Performance: User
@bevvy
and@georgejrjrjr
discussed the performance of Phi-2.@georgejrjrjr
expressed disappointment that Phi-2 did not patch holes left by the previous Phi-1/1.5 models and speculates it might actually be a case of Phi-CTNL. - Based Model Architecture & MorphPiece Tokenization: There was a discussion on new model architectures including Based, with a link shared by
@nods
to a tweet by Simran Arora detailing its simplicity and efficiency.@georgejrjrjr
noted the rapid advent of new model architectures, also bringing up MorphPiece as a potential BPE (Byte Pair Encoding) replacement. He shared a Paper Link on the subject. - Fine-tuning Dataset Creation: User
@atgctg
brought up the topic of creating instruction datasets for fine-tuning as an important parallel task to pre-training data curation. - Pricing of GPT4-Turbo: A discussion led by
@gabriel_syme
and@giftedgummybee
critiqued the high pricing of GPT4-Turbo. The prices were deemed improper even for high-value tasks. - Weak to Strong Generalization Paper & FunSearch:
@giftedgummybee
shared a link to a research paper by OpenAI, and@coffeebean6887
shared the GitHub repository with the code. A blog post on FunSearch was shared by@nods
, and sparked a discussion among the users regarding its application and novelty.
ā· #general (139 messagesš„š„):
-
Discussion on MetaMath Dataset: There was significant debate regarding the potential contamination of the MetaMath dataset.
@bjoernp
mentioned the importance of avoiding contaminated data in model training, and@mihai4256
confirmed potential contamination after running a test, scoring 0.96 where anything above 0.85 is deemed as a high likelihood of contamination. There was also discussion on automatically checking for contamination when models are submitted, but potential costs were raised as a concern. -
AI Model Training: Various users discussed model training strategies, with
@nonameusr
expressing concern over too much focus being placed on model training for benchmarks.@euclaise
suggested employing Neural Architecture Search (NAS) over merge parameters, prompting further discussion on model merging. -
Upcoming Models: Users such as
@Error.PDF
and@gabriel_syme
speculated about the potential release and capabilities of future models like Llama 3 and Galactica 2.0. However, no specific release dates or features were confirmed. -
Focus on Coding: Models specialized in coding were discussed by users
@nonameusr
and@metaldragon01
, focusing on the recently released Dolphin 2.5 Mixtral 8x7b model which claims to be very good at coding. -
Model Execution: Discussion around model execution included mention of QuIP models by
.beowulfbr
and mentions of how to run inference on these models effectively.@decruz
mentioned running Openhermes on a phone.
Relevant resources and links included in discussion:
ā· #ask-about-llms (6 messages):
- RedPajama 3B Performance: User
@bevvy
reported that the RedPajama 3B was working, albeit at a slow pace. The prefill was slow and the system ran at 3 tokens/s on a pixel 7. - Question about Solar 10 Preset: User
@agcobra1
queried about the preset used by Solar 10. No responses were provided within the given message history. - Alternative to MLC: User
@gezegen
asked if there was a possible alternative to using MLC and suggested setting up on Android/iOS. - Reference to LLM Farm: User
@orabazes
mentioned LLM Farm, though the context wasnāt clear from the given message history. - User Illness: User
@gezegen
noted that they were recovering from the flu and may have missed some previous conversations.
HuggingFace Discord Discord Summary
-
NeurIPS Meetup Outreach & LSTM Discussion: User
@.yosun
extending an invite for the last-minute attendees to a NeurIPS meetup and an AI3D breakfast. Additionally,@vipitis
elucidated on working with LSTM in the conversation and shared link to PyTorch LSTM documentation. -
Chatbots & VSCode Plugins Talk: A brief discussion on customizing the AI name in LLAMA chatbots carried out by users
@criticaldevx
,@jeffry4754
,@shrap42
,@ahmad3794
and@funapple
. Tips included fine-tuning and replacing the name post-generation using a script. -
Deep Dive into RNNs, LSTMs & GRUs: User
@nerdimo
provided a thorough explanation of recurrent neural networks concepts, alongwith components like LSTMs and GRUs. They also discussed the resource theyāre learning from - the Andrew Ng Machine Learning specialization. User.memoshi
contributed by sharing a paper on the SwitchHead method on arXiv. -
Space-Themed Twitter Account & Inclusive AI:
@boggdanu_
sharing their twitter account @madpapayas focused on astronomy, and@sk21_
discussing about inclusive AI design stating a post by Dr Maria Panagiotidi. -
AI Related Projects & Tutorials: Users sharing their AI projects and asking for feedback.
@marielandryceo
presented methodologies of AI,@rwitz_
introduced the merge of two AI models and@appstormer_25583
unveiled an AI that analyzes logos. A query regarding fine-tuning was resolved. -
MagVIT2 Presentation & New Reading Material Suggestions:
@chad_in_the_house
introducing MagVIT2 and sharing associated resources,@memehunter7209
suggesting Mathematical Machine Learning as a potential reading group book and lastly@netskink.1
inviting members to participate in their dataset project. -
Image-Conditioned Diffusion Model & Other Discussions:
@mr.frog2014
sharing insights from their experiments with image conditioned Diffusion Models.@nixon_88316
seeking information about stablecode and@sebys7
asking about parameters tweaks in SD-X4-Upscaler. -
PyTorch Model Training & Hierarchical Text Classification: User
@merve3234
providing instructions on transformer models training with PyTorch and@stroggoz
discussing Hierarchical Text Classification using separate models for each level. Also,@ppros666
soliciting advice on fine-tuning an LLM for the first time and highlighting a guide on the same.
HuggingFace Discord Channel Summaries
ā· #general (82 messagesš„š„):
- NeurIPS Meetup: User
@.yosun
is calling for last-minute attendees to a NeurIPS meetup and an AI3D breakfast the next day. Shared a link to twitter post. - Working with LSTM: User
@vipitis
explained the functionality of LSTM and how multi-layered LSTM works. Suggested to check out the PyTorch LSTM documentation for understanding its implementation. - Customizing AI Name in LLAMA: Users
@criticaldevx
,@jeffry4754
,@shrap42
,@ahmad3794
had a conversation on how to change the AI name in LLAMA chatbots. Suggestions include fine-tuning and replacing the name post-generation using a script. - Training Trouble:
@vishyouluck
raised an issue with fine-tuning a model usingautotrain_advanced
, suspecting a problem with thetrain.csv
file. - VSCode Plugins:
@funapple
asked for recommendations on VSCode plugins that work well with local Language Models (LLM) for real-time code suggestions.
ā· #today-im-learning (6 messages):
- Understanding Recurrent Neural Networks: User
@nerdimo
delved into the concepts of recurrent neural networks (RNNs), including Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). They explained that LSTMs and GRUs employ gates to update their internal memory and process relevant information, which aids in making accurate predictions and avoiding the vanishing gradient problem. - In additional discussions,
@nerdimo
mentioned the concept of bidirectional RNNs, where information flow can proceed from left to right and vice versa. Moreover, they discussed the power (and computational cost) of stacked RNNs that form a deep RNN. - User
@merve3234
questioned if there is a significant difference in performance between GRUs and LSTMs, given that GRUs are generally more efficient to train. They also asked about the resources@nerdimo
is using for learning. - In response,
@nerdimo
indicated that they are learning from the Andrew Ng Machine Learning specialization. They also expressed their intuition that LSTM would be the best choice due to its increased filtering and parameters. - SwitchHead Method for Transformers: User
.memoshi
shared a link to an arXiv paper about the SwitchHead method. It claims to reduce both compute and memory requirements of the self-attention layers in Transformers, achieving practical speedups without sacrificing language modeling performance.
ā· #cool-finds (2 messages):
- Astronomy Twitter Account:
@boggdanu_
shared a link to his Twitter account @madpapayas which is focused on astronomy. - Creating Inclusive AI:
@sk21_
brought to attention a Substack post about creating inclusive AI. Subtitled āA strengths-based approach to Artificial Intelligenceā, the post is by Dr Maria Panagiotidi and it discusses the issues of inclusivity in AI design. The link is as follows: Creating Inclusive AI.
ā· #i-made-this (5 messages):
-
Navigating the Depths: The Critical Thinking ToT and Scientific Method CoT in AI: User
@marielandryceo
shared an extensive overview of two essential methodologies guiding the world of Artificial Intelligence, namely the Critical Thinking Tree of Thoughts (ToT) and the Scientific Method Chain of Thoughts (CoT). The post detailed the step-by-step process of how these methodologies function in AI, emphasizing the significance of persistently refining our understanding. The discussion hashtagged#CriticalThinkingToT
,#ScientificMethodCoT
, and#AIExploration
. -
Merge of AI Models:
@rwitz_
shared a link to a merge of two AI models -viethq188/LeoScorpius-7B-Chat-DPO
andGreenNode/GreenNodeLM-7B-v1olet
- using slerp as a merge method. He detailed the slice sources and parameters involved in the merge process. The base model for the merge isviethq188/LeoScorpius-7B-Chat-DPO
. -
Brand Logo Analyzer GPT: User
@appstormer_25583
introduced an AI, which gives design feedback and improvement tips for logos based on the uploaded image. A link to their project was provided. -
@merve3234
asked@rwitz_
if he had submitted his merged model for the leaderboard, to which he affirmed with a āyesā.
ā· #reading-group (15 messagesš„):
- Presentation of MagVIT2: User
@chad_in_the_house
initiated a presentation on MagVIT2, a model introduced in the paper āMagVIT2: Language Model Beats Diffusion: Tokenizer is key to visual generationā by Carnegie Mellon and Google. The model can generate images as tokens/words and is seen to beat diffusion models. For more in-depth discussion, link to the blog was shared and code available here. - Suggestion for Reading Group Material:
@memehunter7209
suggested studying the mml-book in the reading group for a better understanding of the math stuff required for machine learning. - Discussion about mml-book:
@Pantera
and@charlieparker8035
sought clarification about the level and content of the book respectively, and@memehunter7209
advised that itās more of a review book and referred them to Gilbert strang lectures and course on EDx. - Project Invitation:
@netskink.1
invites members to participate in their project working on a dataset of images and weather conditions aimed at detecting icy bridges. - Suggestion on Unresolved Math Problem Solve:
@caleb_sol
proposed discussing the paper about AIās capacity to solve previously unsolved math problems, indicating it could be a good topic for the reading group.
ā· #diffusion-discussions (3 messages):
- Image Conditioned Diffusion Model: User
@mr.frog2014
shared their experiment with image conditioned Diffusion model, suggesting the idea of injecting noise directly to x and subtracting after denoise model. They also raised a question about the potential benefit of incorporating an attention module. - Query about StableCode: User
@nixon_88316
posed a query seeking information about stablecode. - Parameter Query for SD-X4-Upscaler: User
@sebys7
is using the SD-X4-Upscaler with diffusers, and asked about tweaks to the parameters that might yield a specific type of result.
ā· #NLP (6 messages):
- Model Training Tips: User
@merve3234
gave detailed instructions on using transformer models with PyTorch and decoder input tokens. They suggested manually feeding decoder_input_ids during the forward phase of the training loop. - Hierarchical Text Classification Discussion:
@stroggoz
proposed an approach to classifying academic texts using a network of smaller Bert models. The discussion raised the idea of classifying texts by primary topic (e.g., math, chemistry), then further classifying by subtopic, using separate models for each level. - LLM Context Length Clarification: User
@ppros666
asked for clarification on the context length of Llama 2, specifically asking if the context length is 4000 or 4096.@Cubie | Tom
clarified that the context length is 4096. Relevant information was linked to from the HuggingFace modelās config.json file. - Request for LLM Fine-Tuning Guidance:
@ppros666
expressed an interest in fine-tuning an LLM for the first time and requested an up-to-date tutorial or example code. They highlighted a guide titled āLlama-2 4bit fine-tune with dolly-15k on Colabā, seeking advice on its reliability.
ā· #diffusion-discussions (3 messages):
- Image Conditioned Diffusion Model: User
@mr.frog2014
discussed about a potential alteration to the standard image-conditioned diffusion model, presenting a method where the condition signal is added directly to the injected noise and subtracted subsequently after the denoise model. They also asked whether implementing an attention module to inject the condition would yield better results. - Inquiry about Stablecode:
@nixon_88316
posted asking if anyone has knowledge regarding stablecode. However, no responses or follow-ups were captured in the supplied message history. - SD-X4-Upscaler usage: User
@sebys7
asked if there are specific parameters using thesd-x4-upscaler
with diffusers that could cause certain outcome in the resulting image. However, no particular image or result was provided for reference within the given context.
LangChain AI Discord Summary
- Detailed discussions on LangChain integration and usage: official documentation, YouTube tutorials, creation of Language Learning Models (LLMs), output streaming, and implementation of metadata were extensively discussed.
- Queries raised on the progress of Plan-and-Execute, difficulties in copying text from LangServeās āDOCUMENTSā section, and accessing LangServe endpoints and fields led to collective troubleshooting and information exchange.
- Contributions shared included
@andysingal
ās Medium blog post on LangChain Expression Language (LCEL),@appstormer_25583
ās GPT-based brand logo analyzer,@pagerize_admin
ās Pagerize video summarizer,@gokusan
ās TinyLLM library for production applications, and@joshuasundance
ās pre-commit hook to detect API keys. - Guild members participated in a LLM user survey hosted by Catena Labs to provide insights on market preferences, accessible here.
LangChain AI Channel Summaries
ā· #general (48 messagesš„):
- Integration and Learning Resources for LangChain:
@chasemcdo
suggested utilising resources from the official LangChain site for documentation and its YouTube channel for further information and tutorials. - Plan-and-Execute Progress Inquiry:
@casper_mars
raised a question about the progress on Plan-and-Execute, to which@manlylubbin
replied they are still planning. - Implementation of Metadata on LLM:
@reddiamond69
highlighted that on generating output, LangChain allows for printing source documents along with their metadata, which can be implemented on the usersā application. - Difficulties with Streaming Output Through API:
@menny9762
shared struggles streaming the output through Next.js. A comprehensive discussion entailed, involving@seththunder
and others, on creation and use of Language Learning Models (LLMs), streaming and callback methods. - LLM Survey from Catena Labs: A community survey was promoted by
@jay_wooow
to gather data on LLM usage and preferences to inform product development and provide useful market data. The survey is hosted here.
ā· #langserve (3 messages):
- Copying Text from Output Section:
@fritz4374
reported difficulties in copying text from the āDOCUMENTSā section of a run. The user can copy from the input and output sections but not from the āDOCUMENTSā section. - LangServe Endpoint & Accessing Fields:
@khophi.co
is seeking assistance with accessing the<payload>.history
field in the langchain context when a frontend request, like{ 'mymessage': 'message' }
, is sent to thepath="/myendpoint"
in LangServe. The user is curious about how to retrieve other fields beyond the āinputā field, which LangServe does automatically.
ā· #share-your-work (7 messages):
- LangChain Expression Language (LCEL) Instructions:
@andysingal
shared a Medium blog post on mastering the LangChain Expression Language (LCEL), providing examples of its application in scraping Wikipedia. - GPT-based Brand Logo Analyzer:
@appstormer_25583
showcased a brand logo analyzer GPT that offers design feedback and suggestions for improvement based on an uploaded logo image. - Pagerize - AI Video Summarizer:
@pagerize_admin
presented Pagerize, an AI summarizer for YouTube videos. Included was an example summary of a Theory of Mind LangChain and Plastic Labs webinar, viewable here. - TinyLLM - LLM Library for Production Applications:
@gokusan
developed TinyLLM, a library for running LLM applications at scale. An example of creating and evaluating an agent can be viewed here. - Pre-Commit Hook to Detect LLM API Keys:
@joshuasundance
created a pre-commit hook to prevent developers from putting their API keys into source control.
ā· #tutorials (1 messages):
potatooff: https://www.youtube.com/watch?v=mrjq3lFz23s
DiscoResearch Discord Summary
- LLM-Assisted Scoring Evaluation Experiment: User
@_jp1_
initiated a discussion regarding an early-stage AI startupās LLM-assisted batch evaluation tool for highlight extraction. This sparked debates on reducing score and using special rating tokens for accuracy and discrimination āLink to the original postā. The conversation later branched out to potential collaboration and understanding the evaluation method featuring@bjoernp
and@_jp1_
. - Mixtral Implementation: The conversation in mixtral_implementation channel highlighted different elements of mixtral implementation including the best practices, access to high-end hardware, and expert selection strategies along with benchmark results.
- The main points here include a custom build of llama.cpp shared by
@kalomaze
and a tweet shared by@someone13574
suggesting that using 3 experts yields the best perplexity results.
- The main points here include a custom build of llama.cpp shared by
- OpenAIās Alignment Plan and Phi-2 Discussion: OpenAIās new paperās alignment plan and effectiveness of Phi-2 were the topics of discussion here. Also, there was a suggestion by
@flozi00
to implement scoring models to enhance data quality in Disco Research. - Progress on Simplified Language Understanding and Modeling (Llama) Integration and FastEval Evaluation: Conversation revolved around llama.cpp integration into projects, the completion of a substantial part of FastEval evaluation and problems related to llama-cpp-python. More specifically, new developments were noted such as the FastEval project fork on GitHub by Disco Research. Suggestions were provided by
@bjoernp
to@rtyax
for debugging simpler models and using a single thread for the local model.
DiscoResearch Channel Summaries
ā· #disco_judge (11 messagesš„):
-
LLM-Assisted Scoring Evaluation Experiment: User
@_jp1_
shared a post from Reddit about an early-stage AI startupās results using their LLM-assisted batch evaluation tool. The tool is used for highlight extraction from news articles and the MMLU benchmark using a scoring model. They referenced three 1, 2, 3 different research papers as the basis of their evaluation method. This led to a discussion about whether reducing score categories and using special rating tokens could influence accuracy and the ability to discriminate. Link to the original post -
Potential Collaboration: User
@bjoernp
expressed interest in the evaluation approach and suggested collaboration with the startup working on similar projects. -
Evaluation Method Understanding: A discussion between
@bjoernp
and@_jp1_
ensued concerning the circular nature of using benchmarks for evaluation models. They talked about how a good benchmark could negate the need for an eval model and how the use of evaluation models is often resorted to when there is no easy way to benchmark or it would be too much effort to create benchmarks for everything to measure. -
Confusion on Benchmarking and Evaluation Models:
@_jp1_
expressed confusion over the evaluation method and questioned if measuring MMLU of the eval model is similar to direct model evaluation.@bjoernp
acknowledged the similarity but noted a slight difference, stating that rating a response is an easier task. -
Other potential benchmark: In their conclusion,
@bjoernp
suggested that a held-out test set could potentially be a good benchmark for the rating model.
ā· #mixtral_implementation (29 messagesš„):
- Mixtral Implementation Best Practices:
@nul3
asked which is better between the instruct and non-instruct versions for chat.@goldkoron
recommended the instruct version. - Custom MoE Routing:
@kalomaze
shared a custom build of llama.cpp which is a modification to the Mixtral PR that lets users customize the amount of experts that are routed per token. - Access to High-End Hardware:
@tarikoctapm
offered access to a machine with 4 x RTX 4090, asking in return for the sharing of results from models. - Expert Selection Strategy Discussion:
@chrismcmaster
and@kalomaze
exchanged ideas on various expert selection strategies. They discussed hardcoding a set number of experts, āmin_pā strategy, and top-k experts.@kalomaze
elaborated on āmin_pā, which acts as a minimum probability threshold based on the maximum probability. - Benchmark Results:
@bjoernp
shared preliminary benchmarks which showed sub-optimal performance when hardcoding either 1 or 4 top-k experts.@someone13574
shared a tweet suggesting that using 3 experts yields the best perplexity results. These results were critiqued by@kenjiqq
who observed inconsistencies in the Q6 quant.
ā· #general (5 messages):
- OpenAIās Alignment Plan:
@_jp1_
brought attention to the interesting content in OpenAIās new paper regarding their alignment plan. No link was provided. - Discussion on Phi-2:
@bjoernp
and@flozi00
engaged in conversation about Phi-2, questioning its effectiveness and the legitimacy of its benchmarks. While@bjoernp
showed skepticism,@flozi00
shared some optimistic observations from Twitter examples, especially for smaller scale uses cases on edge. - Data Quality Scoring Models in Disco Research:
@flozi00
suggested the implementation of scoring models to enhance data quality in Disco Research. The aim is to not only deduplicate datasets, but also cleanse them of low-quality data like raw lists from Wikipedia or translation errors.
ā· #benchmark_dev (13 messagesš„):
- Collaboration on Simplified Language Understanding and Modeling (Llama) Integration:
@bjoernp
requested@rtyax
for collaboration on incorporating llama.cpp into their project and asked for clarity on the problems related to tokenizer that@rtyax
had mentioned earlier. - FastEval Evaluation Completion and Future Steps:
@bjoernp
outlined the next steps including adding the capacity for n repetitions of benchmarks and calculating the mean and standard deviation, thereby completing a substantial part of the evaluations. He also pointed out the continued need for the integration with llama.cpp to examine the effects of min_p and enabling grammar-based inference to fix the output format. - FastEval Repository on DiscoResearch GitHub:
@bjoernp
created a fork of FastEval with the DiscoResearch GitHub account for coordination, directing users and@rtyax
to submit their changes there. The repository is found at https://github.com/DiscoResearch/FastEval. - Issues with Llama-cpp-python:
@rtyax
reported that llama-cpp-python was failing silently when running generate/chat_completion. They noted that the difficulty may not be with the tokenizer, but could not identify the specific problem. They had recently rebuilt llama-cpp-python for mixtral, and were uncertain if the problem was local or more widespread. - Debugging Suggestions:
@bjoernp
suggested debugging simpler models first, like mistral-7b, and changing fasteval evaluation constants to use a single thread for the local model.@rtyax
said that only one GPU was being used in his box, but the generation was continuing as if it was never called due to a try/catch block.
Latent Space Discord Summary
- Announcement of Anthropicās new Google Sheets tool, Claude for Sheetsā¢, praised by
@swyxio
as āspreadsheets are the best prompt engineering toolā. Claude for Sheets⢠- In-depth coverage of Mamba model provided through a YouTube video shared by
@swyxio
. - Introduction to the Mistral-kit project which utilizes mistral-7b and ollama; repository shared by
@kevmodrome
on GitHub. Github link - Discussion around rumored release of GPT 4.5 initiated by
@swyxio
with a linked tweet. - Access to Mistral API provided to
@fanahova
and@coffeebean6887
; latter also noted the addition of Mistral as an endpoint on Anyscale. Anyscale URL - Announcement of the new podcast release by
@fanahova
on Twitter and Hacker News. (Twitter Link) - Mention of Qtransformers by
@swyxio
in the #llm-paper-club channel, without further elaboration or context given.
Latent Space Channel Summaries
ā· #ai-general-chat (9 messagesš„):
- Anthropicās New Google Sheets Tool:
@swyxio
mentioned a new product called Claude for Sheetsā¢, which can bring the helpful AI assistant Claude from Anthropic to Google Sheetsā¢.@swyxio
stated that āspreadsheets are the best prompt engineering toolā. - Discussion on Mamba Model:
@swyxio
shared a YouTube video that provides a thorough explanation of the Mamba model. - Mistral-kit Project:
@kevmodrome
shared a Github link to the mistral-kit project, which uses mistral-7b and ollama. - Rumors About GPT 4.5:
@swyxio
mentioned some rumors about GPT 4.5 coming soon. - Access to Mistral API: Both
@fanahova
and@coffeebean6887
indicated that they received access to the Mistral API.@coffeebean6887
also pointed out that Anyscale added Mistral as an endpoint and posted the URL for it.
ā· #ai-event-announcements (1 messages):
fanahova: New pod is out! https://twitter.com/FanaHOVA/status/1735371425836568905
Also live on HN
ā· #llm-paper-club (1 messages):
swyxio: Qtransformers
LLM Perf Enthusiasts AI Discord Summary
- In the general channel, two key discussions were highlighted:
- Issue with Model 3.5:
@0xmmo
reported unusual behavior from AI Model 3.5 involving an excessive production of new lines in responses. - The workload of ChatGPT during Finals Week was humorously speculated by
@res6969
, who suggested it could be overloaded due to increased usage by students.
- Issue with Model 3.5:
- The finetuning channel saw
@robertchung
offer advice for a text extraction task, recommending an initial 30-50 samples for fine-tuning, with the potential to add more examples if results are unsatisfactory. - In the opensource channel,
@robhaisfield
discussed a strategy to overcome rate limits by rotating among various fine-tuning providers but mentioned the challenge of accounting for the subtle behavioral differences typical of each providerās unique fine-tunes. - Useful resources were shared in the resources channel, such as an insightful Twitter link posted by
@nosa_
. - The openai channel showcased two interesting references:
@pantsforbirds
pointed out an OpenAI paper on supervising larger models with smaller ones.@firefox8975
highlighted Google AIās competitive capability in supporting function calls, referencing the official document.
LLM Perf Enthusiasts AI Channel Summaries
ā· #general (3 messages):
- Possible Issue with Model 3.5:
@0xmmo
reported that theyāve been getting strange responses from AI model 3.5, which has started to respond to their function calls with an excessive number of new lines. - AI Workload during Finals Week: In a lighter note,
@res6969
humorously speculated that ChatGPT might be overloaded due to āfinals week,ā a reference to a period of intense academic work. This was backed up by@potrock
, who mentioned that many people in a graduate-level Natural Language Processing (NLP) course are utilizing GPT-generated synthetic data for their final projects.
ā· #finetuning (2 messages):
- Text Extraction and Fine-Tuning:
@robertchung
suggested for a text extraction task, 30-50 samples may suffice for initial fine-tuning. They also suggested that if the results are not satisfactory, one can add more examples and proceed to fine-tune the already fine-tuned model.
ā· #opensource (1 messages):
- Discussion on Rotating Among Providers:
@robhaisfield
shared his thoughts on a strategy to power through rate limits by programmatically rotating among various fine-tune providers. He, however, expressed concerns that the subtle differences in behavior produced by each providerās unique fine-tunes might make it tricky to treat all providers as a commodity.
ā· #resources (2 messages):
- Sharing Useful Resources: User
@nosa_
shared a useful Twitter link with potentially valuable information for the community. User@ayenem
responded positively to the shared link.
ā· #openai (2 messages):
- Supervising Larger Models with Small Models: User
@pantsforbirds
shared an OpenAI paper about using small models (GPT-2) to supervise larger models (GPT-4). - Google AIās Function Calling:
@firefox8975
mentioned that Google AI supports function calling, which they found to be competitive with OpenAI during their exploration of it. They also provided a reference to Google AIās official document.
Alignment Lab AI Discord Summary
- [oo] channel featured updates on Depth Upscaling utilized by Upstage for the franken llama-mistral, boasting 10.7B parameters with the model SOLAR-10.7B-Instruct-v1.0. āUpstage claims to have surpassed Mixtral with their approach.ā
- New Open Source AI Grants were discussed in the same channel, with users congratulated on their grants via an announcement from a16z.
- [oo-priority] held details about a specific event from
@entropi
, although the details of the event are not provided. - In [phi-tuning],
@entropi
shared an update on the Phi-2 transformer model with 2.7 billion parameters and a link to more information. The modelās data was the same as Phi-1.5 but with additional new sources encompassing NLP synthetic texts and safety and educationally-filtered websites. Updates on the modelās weights were also mentioned.
Alignment Lab AI Channel Summaries
ā· #oo (2 messages):
- Depth Upscaling on Franken Llama-Mistral: User
@entropi
shared a link to an article about Upstageās use of āDepth Upscalingā on a franken llama-mistral to get to 10.7B params with continued pretraining. Upstage claims to have surpassed Mixtral with their approach. The end result is their model SOLAR-10.7B-Instruct-v1.0. - New Open Source AI Grants:
@entropi
also congratulated users <@257999024458563585> and <@503006108928180245> for their new AI grants, sharing the announcement from a16z.
ā· #oo-priority (1 messages):
entropi: @here https://discord.com/events/
ā· #phi-tuning (2 messages):
- Phi-2 Model Summary:
@entropi
shared a link about Phi-2, a Transformer with 2.7 billion parameters. The data used for its training were the same as Phi-1.5. Additionally, the Phi-2 model was also trained with a new data source consisting of various NLP synthetic texts and filtered websites for safety and educational value. - Phi-2 Weights Update:
@entropi
reported that the weights for Phi-2 were updated recently.
YAIG (a16z Infra) Discord Summary
Only 1 channel had activity, so no need to summarizeā¦
- AI Economics Discussion:
@stevekamman
shared insights on the economics of AI, specifically focusing on the comparison between training and inference cost and revenues. He said: āInference must pay the bills - GPU, S&M, overheadsā¦ā and pointed out that the revenue generated by inference must be noticeably higher than the cost associated with training to maintain profitability in the long term. - Consideration on Training Costs:
@stevekamman
noted that the cost incurred in training can be offset by selling the foundational models but he also pointed out that itās inevitable for the buyers of these models to fund the purchase either through revenue or efficiency gains. - Future of FM Companies:
@spillai
speculated on whether foundation model (FM) companies need to develop their own clouds to continue capturing enough long-term value from their customers and therefore attain profitability. They drew attention to the potential need for vertical AI infrastructure providers for large language models (LLMs). - Utilization of GPU by FM companies:
@stevekamman
questioned the utilization efficiencies of GPUs by FM companies, given their binary nature (100% or 0% utilization). This, considering how potential scale-based utilization efficiencies might be limited.
Skunkworks AI Discord Summary
- Open-source survey initiated by
@jay_wooow
to understand motivations and challenges in building with LLMs (Language Learner Models). The survey data will be used internally and to assist the wider AI community in development tool creation. - Raw survey data and key insights will be open-sourced in a report and published online upon reaching the target participant number.
- Link to the survey was shared. Participation keeps developer identities anonymous.
- Uncontextualized link to a YouTube video was posted by pradeep1148 under off-topic. Further discussion or context was not provided, thus relevance remains unclear.
Skunkworks AI Channel Summaries
ā· #general (1 messages):
- LLM Tools Usage Survey:
@jay_wooow
has initiated an open-source survey aimed at understanding the motivations and challenges involved in building with LLMs (Language Learner Models). The data gathered through this survey will be used for internal productivity enhancement and to assist the wider community with market data to optimize the creation of development tools. The survey should take approximately 10 minutes to complete. - Results Publication: Once the target number of participants has been reached, all raw data collected as well as the key insights will be open-sourced in a report and published on a blog platform. This can be a valuable resource for AI developers to understand goals, challenges, and tool usage in the community.
- Participation Link: The link to participate in this survey was shared. All participating developersā identities will remain anonymous.
ā· #off-topic (1 messages):
pradeep1148: https://www.youtube.com/watch?v=MOimHasrCKk
MLOps @Chipro Discord Summary
Only 1 channel had activity, so no need to summarizeā¦
- Mastering MLOps and ML Engineering: Key Strategies for 2024:
@amitqwak
announced a live session titled āMastering MLOps and ML Engineering: Key Strategies for 2024ā scheduled for January 17th 2024 at 11:30 AM EST. The session aims to provide organizations with advanced insights and strategies for effectively incorporating and managing AI and ML in their business frameworks, with a focus on MLOps and ML engineering trends. The event, primarily for ML Engineers, Data Scientists, Data leaders, and Software engineering managers is free of charge. The registration link is here.
- Arize Holiday Special:
@sarahwelsh
announced the Arize Holiday Special scheduled for December 15, 2023. The event comprises a series of live, virtual hands-on workshop sessions focused on prompt engineering, search and retrieval workflows, and LLM system evaluations. Speakers from Hugging Face, PromptLayer, Shopify, and Arize are participating. The registration link is here.
AI Engineer Foundation Discord Summary
Only 1 channel had activity, so no need to summarizeā¦
- Upcoming Event Notification:
._z
posted a notification for an upcoming event with the associated Discord link provided.
The Ontocord (MDEL discord) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
The Perplexity AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.