All tags
Person: "kenakafrosty"
1/11/2024: Mixing Experts vs Merging Models
gpt-4-turbo gpt-4-0613 mixtral deepseekmoe phixtral deepseek-ai hugging-face nous-research teenage-engineering discord mixture-of-experts model-merging fine-tuning rag security discord-tos model-performance prompt-engineering function-calling semantic-analysis data-frameworks ash_prabaker shacrw teknium 0xevil everyoneisgross ldj pramod8481 mgreg_42266 georgejrjrjr kenakafrosty
18 guilds, 277 channels, and 1342 messages were analyzed, with an estimated reading time saved of 187 minutes. The community switched to GPT-4 Turbo and discussed the rise of Mixture of Experts (MoE) models such as Mixtral, DeepSeekMoE, and Phixtral. Model merging techniques, including naive linear interpolation and "frankenmerges" such as SOLAR and Goliath, are driving new performance gains on open leaderboards. Discussions in the Nous Research AI Discord covered AI playgrounds supporting prompt and RAG parameters, security concerns about third-party cloud usage, debates over Discord bots and the Discord TOS, skepticism about Teenage Engineering's cloud LLM, and performance differences between GPT-4-0613 and GPT-4 Turbo. The community also explored fine-tuning strategies involving DPO, LoRA, and safetensors, integration of RAG with API calls, semantic differences between MoE and dense LLMs, and data frameworks such as LlamaIndex and SciPhi-AI's synthesizer. Issues with anomalous characters during fine-tuning were also raised.
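The "naive linear interpolation" merge mentioned above can be sketched in a few lines: for each shared parameter, take a weighted average of the two models' values. This is a minimal illustrative sketch, not any particular library's implementation; plain Python dicts of floats stand in for real model state dicts (with PyTorch you would interpolate tensor values the same way).

```python
# Naive linear interpolation ("lerp") model merge: for each shared
# parameter, take a weighted average of the two models' values.
# Plain dicts of float lists stand in for real state_dicts here.

def lerp_merge(state_a, state_b, alpha=0.5):
    """Merge two 'state dicts' by elementwise linear interpolation.

    alpha=0.0 returns model A's weights, alpha=1.0 returns model B's.
    Assumes both models share the same architecture (same keys/shapes).
    """
    if state_a.keys() != state_b.keys():
        raise ValueError("models must share the same parameter names")
    return {
        name: [(1 - alpha) * a + alpha * b
               for a, b in zip(state_a[name], state_b[name])]
        for name in state_a
    }

# Toy example: two "models" with a single 3-element weight vector.
model_a = {"layer.weight": [1.0, 2.0, 3.0]}
model_b = {"layer.weight": [3.0, 4.0, 5.0]}
merged = lerp_merge(model_a, model_b, alpha=0.5)
print(merged)  # {'layer.weight': [2.0, 3.0, 4.0]}
```

"Frankenmerges" like SOLAR and Goliath go further, stacking or interleaving whole layers from different checkpoints rather than averaging matching parameters.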
1/9/2024: Nous Research lands $5m for Open Source AI
qlora phi-3 mixtral ollama nous-research openai rabbit-tech context-window fine-tuning synthetic-data activation-beacon transformer-architecture seed-financing real-time-voice-agents trillion-parameter-models kenakafrosty _stilic_ teknium
Nous Research announced a $5.2 million seed financing round focused on Nous-Forge, which aims to embed transformer architecture into chips for powerful servers supporting real-time voice agents and trillion-parameter models. Rabbit demoed the R1 at CES to mixed reactions. OpenAI shipped the GPT Store and briefly leaked an upcoming personalization feature. A new paper on Activation Beacon proposes a way to significantly extend LLMs' context window, with code to be released on GitHub. Discussions also covered QLoRA, fine-tuning, synthetic data, and custom architectures for LLMs.