Frozen AI News archive

1/11/2024: Mixing Experts vs Merging Models

**18 guilds**, **277 channels**, and **1342 messages** were analyzed with an estimated reading time saved of **187 minutes**. The community switched to **GPT-4 turbo** and discussed the rise of **Mixture of Experts (MoE) models** like **Mixtral**, **DeepSeekMOE**, and **Phixtral**. Model merging techniques, including naive linear interpolation and "frankenmerges" by **SOLAR** and **Goliath**, are driving new performance gains on open leaderboards. Discussions in the **Nous Research AI Discord** covered topics such as AI playgrounds supporting prompt and RAG parameters, security concerns about third-party cloud usage, debates on Discord bots and TOS, skepticism about **Teenage Engineering's** cloud LLM, and performance differences between **GPT-4 0613** and **GPT-4 turbo**. The community also explored fine-tuning strategies involving **DPO**, **LoRA**, and safetensors, integration of RAG with API calls, semantic differences between MoE and dense LLMs, and data frameworks like **llama index** and **SciPhi-AI's synthesizer**. Issues with anomalous characters in fine-tuning were also raised.


A wave of MoE models has appeared since the Mixtral architecture was published, including DeepSeekMOE and Phixtral. Equally interesting is the practice of "model merging", which ranges from naive (spherical) linear interpolation to the "frankenmerges" used by SOLAR and Goliath. These techniques appear to have set off a new growth spurt on the open leaderboards, as even relatively naive implementations are handily beating vanilla incumbents from the big labs.

https://huggingface.co/blog/mlabonne/merge-models
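To make the interpolation-style merges mentioned above concrete, here is a minimal sketch in Python with NumPy. It assumes two checkpoints with identical architectures, represented as plain dicts of arrays; the function names (`lerp`, `slerp`, `merge_state_dicts`) are hypothetical and not taken from any particular merging tool. Frankenmerges, which stack layers from different models, are a different technique and are not covered here.

```python
import numpy as np

def lerp(w_a, w_b, t):
    """Naive linear interpolation between two weight tensors."""
    return (1.0 - t) * w_a + t * w_b

def slerp(w_a, w_b, t):
    """Spherical linear interpolation, treating each tensor as one flat vector.

    Falls back to lerp when either vector is zero or the two are (nearly)
    parallel, because the spherical formula divides by sin(theta).
    """
    a = w_a.ravel().astype(np.float64)
    b = w_b.ravel().astype(np.float64)
    norm_a, norm_b = np.linalg.norm(a), np.linalg.norm(b)
    if norm_a == 0.0 or norm_b == 0.0:
        return lerp(w_a, w_b, t)
    # Angle between the two weight vectors.
    cos_theta = np.clip(np.dot(a, b) / (norm_a * norm_b), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    sin_theta = np.sin(theta)
    if sin_theta < 1e-6:
        return lerp(w_a, w_b, t)
    coeff_a = np.sin((1.0 - t) * theta) / sin_theta
    coeff_b = np.sin(t * theta) / sin_theta
    return (coeff_a * a + coeff_b * b).reshape(w_a.shape)

def merge_state_dicts(sd_a, sd_b, t=0.5, fn=slerp):
    """Merge two checkpoints tensor by tensor (assumes matching keys and shapes)."""
    return {name: fn(sd_a[name], sd_b[name], t) for name in sd_a}

if __name__ == "__main__":
    # Toy "checkpoints" standing in for real model weights.
    model_a = {"layer.weight": np.array([[1.0, 0.0]])}
    model_b = {"layer.weight": np.array([[0.0, 1.0]])}
    print(merge_state_dicts(model_a, model_b, t=0.5, fn=lerp))
    print(merge_state_dicts(model_a, model_b, t=0.5, fn=slerp))
```

On the toy orthogonal vectors, linear interpolation shrinks the norm of the merged weights while the spherical variant preserves it, which is one commonly cited reason for preferring SLERP when merging two models.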


--


Nous Research AI Discord Summary

Nous Research AI Channel Summaries

▷ #off-topic (32 messages🔥):

Links mentioned:

Tweet from Rajesh Karmani -- acting fast and slow (@rkarmani): @Teknium1 @amasad Found the answer here. They use RPA on their cloud in virtual environments... similar to Mighty.

▷ #interesting-links (18 messages🔥):

Links mentioned:

▷ #general (204 messages🔥🔥):

Links mentioned:

▷ #ask-about-llms (30 messages🔥):

Links mentioned:

GitHub - SciPhi-AI/synthesizer: A multi-purpose LLM framework for RAG and data creation.: A multi-purpose LLM framework for RAG and data creation. - GitHub - SciPhi-AI/synthesizer: A multi-purpose LLM framework for RAG and data creation.


OpenAI Discord Summary

Additional Points & Community Inquiries:

OpenAI Channel Summaries

▷ #ai-discussions (80 messages🔥🔥):

Links mentioned:

Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.

▷ #gpt-4-discussions (128 messages🔥🔥):

Links mentioned:

▷ #prompt-engineering (25 messages🔥):

▷ #api-discussions (25 messages🔥):


LM Studio Discord Summary

LM Studio Channel Summaries

▷ #💬-general (123 messages🔥🔥):


Links mentioned:

▷ #🤖-models-discussion-chat (54 messages🔥):

Links mentioned:

▷ #🧠-feedback (1 messages):

▷ #🧪-beta-releases-chat (9 messages🔥):


HuggingFace Discord Summary

HuggingFace Discord Channel Summaries

▷ #announcements (1 messages):


Links mentioned:

▷ #general (54 messages🔥):

▷ #today-im-learning (6 messages):

▷ #cool-finds (22 messages🔥):

Links mentioned:

▷ #i-made-this (7 messages):

Links mentioned:

▷ #reading-group (10 messages🔥):

Links mentioned:

Join the Hugging Face Discord Server!: We're working to democratize good machine learning 🤗Join us! hf.co/jobs | 66758 members

▷ #diffusion-discussions (5 messages):

▷ #computer-vision (2 messages):

Links mentioned:

▷ #NLP (8 messages🔥):

Links mentioned:

T5

▷ #diffusion-discussions (5 messages):


OpenAccess AI Collective (axolotl) Discord Summary

OpenAccess AI Collective (axolotl) Channel Summaries

▷ #general (16 messages🔥):

Links mentioned:

▷ #axolotl-dev (30 messages🔥):

Links mentioned:

▷ #general-help (16 messages🔥):

Links mentioned:

▷ #datasets (12 messages🔥):

Links mentioned:

▷ #rlhf (3 messages):


Eleuther Discord Summary

Eleuther Channel Summaries

▷ #general (28 messages🔥):

Links mentioned:

GitHub - wzzheng/OccWorld: 3D World Model for Autonomous Driving: 3D World Model for Autonomous Driving. Contribute to wzzheng/OccWorld development by creating an account on GitHub.

▷ #research (15 messages🔥):

Links mentioned:

▷ #scaling-laws (6 messages):

Links mentioned:

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism: The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling law described in previous literature presents varying conclusions, which casts a dark ...

▷ #interpretability-general (12 messages🔥):

Links mentioned:

Interpreting Deep Neural Networks through Model Transformation: Literature Review: Machine learning especially deep learning models have achieved state-of-the-art performances in many fields such as automatic driving, speech recognition, facial expression recognition and so on. Howe...

▷ #lm-thunderdome (9 messages🔥):

▷ #multimodal-general (1 messages):


LAION Discord Summary

LAION Channel Summaries

▷ #general (46 messages🔥):

Links mentioned:

Artists upset after Wacom uses AI art to market artist gear: Who needs a Wacom Intuos or Cintiq when you can have Midjourney crank it out? Well, you can use them to edit out the AI's hallucinations, mistakes and do compositing…

▷ #research (23 messages🔥):

Links mentioned:


Mistral Discord Summary


Mistral Channel Summaries

▷ #general (23 messages🔥):

Links mentioned:

▷ #models (9 messages🔥):

Links mentioned:

Kquant03/Hippolyta-7B-bf16 · Hugging Face

▷ #finetuning (1 messages):

▷ #random (3 messages):

▷ #la-plateforme (19 messages🔥):

Links mentioned:

Guardrailing | Mistral AI Large Language Models: System prompt to enforce guardrails


Latent Space Discord Summary

Latent Space Channel Summaries

▷ #ai-general-chat (52 messages🔥):

Links mentioned:

▷ #llm-paper-club (1 messages):

Links mentioned:

Tweet from Ivan Leo (@ivanleomk): MOE models seem to overfit more heavily than their dense counterparts but train significantly faster. MOE-Mamba for instance trained ~2.2x faster. This means that training is fast but fine-tuning is ...


LlamaIndex Discord Summary

LlamaIndex Discord Channel Summaries

▷ #blog (4 messages):

Links mentioned:

Building your own RAG application using Together AI and LlamaIndex

▷ #general (48 messages🔥):

Links mentioned:


DiscoResearch Discord Summary

DiscoResearch Channel Summaries

▷ #mixtral_implementation (8 messages🔥):

Links mentioned:

▷ #general (5 messages):

Links mentioned:

Tweet from Bo (@bo_wangbo): Chinese-English bilingual model available on API, German-English model coming next week, and we are syncing with HF team to make both models seamless integrated into the upcoming long waited sbert rel...

▷ #benchmark_dev (15 messages🔥):

Links mentioned:

Reddit - Dive into anything

▷ #embedding_dev (5 messages):

Links mentioned:


LangChain AI Discord Summary

LangChain AI Channel Summaries

▷ #general (17 messages🔥):

▷ #langserve (10 messages🔥):

Links mentioned:

How to make the new variables input available via query method? · langchain-ai/langserve · Discussion #394: Question: If I create the new varialbels: input_variables=["history", "input","lession", "affection"], and setting like the below code. I cant make the right qu...

▷ #share-your-work (1 messages):


LLM Perf Enthusiasts AI Discord Summary

LLM Perf Enthusiasts AI Channel Summaries

▷ #rag (1 messages):

robhaisfield: Anyone have great resources on query expansion?

▷ #openai (24 messages🔥):

Links mentioned:

Tweet from Aviv Ovadya 🥦 (@metaviv): Uh oh. This looks bad. OpenAI will pay those who create the most engaging GPT's. This makes their incentives very close to those of social media—capturing attention. This could get dystopian very ...


Alignment Lab AI Discord Summary

Alignment Lab AI Channel Summaries

▷ #general-chat (1 messages):

▷ #open-orca-community-chat (1 messages):


YAIG (a16z Infra) Discord Summary

Only 1 channel had activity, so no need to summarize...

Links mentioned:

GroqChat


The Skunkworks AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Datasette - LLM (@SimonW) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.