Person: "teknium"
Trust in GPTs at all-time low
llama-3 mistral-medium llava-1.6 miquella-120b-gguf tinymodels miqumaid harmony-4x7b-bf16 smaug-34b-v0.1 openai hugging-face mistral-ai nous-research bittensor context-management fine-tuning model-merging quantization gpu-servers visual-reasoning ocr dataset-release incentive-structures nick-dobos manojbh teknium arthurmensch
21 guilds, 312 channels, and 8530 messages were analyzed, saving an estimated 628 minutes of reading time. Discussions highlighted challenges with GPTs and the GPT store, including critiques of the knowledge-files capability and context-management issues. The CUDA MODE Discord was introduced for CUDA coding support. Key conversations in the TheBloke Discord covered the cost-effectiveness of Xeon GPU servers, comparisons between Llama 3 and Mistral Medium, LLaVA-1.6's visual reasoning and OCR capabilities, and the leaked Miqu 70B model. Technical topics included fine-tuning TinyLlama and MiquMaid+Euryale models, as well as model merging with examples like Harmony-4x7B-bf16 and Smaug-34B-v0.1. The Nous Research AI Discord discussed style influence in LLMs, quantization issues, Bittensor incentives for AI model improvements, and the identification of MIQU as Mistral Medium. The release of the OpenHermes 2.5 dataset on Hugging Face was also announced. "Discussions pointed towards the need for better context management in GPTs, contrasting with OpenAI's no-code approach."
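For readers new to the GGUF-quantized checkpoints circulating in these discussions (the Miqu leak and merges like Miquella were distributed this way), here is a minimal sketch of loading and prompting one with llama-cpp-python; the file name, context size, and GPU layer count are illustrative assumptions, not settings from the conversations.

```python
# Minimal sketch: running a GGUF-quantized model locally with llama-cpp-python.
# The model file name and the n_ctx / n_gpu_layers values are placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="miqu-1-70b.q4_k_m.gguf",  # hypothetical local GGUF file
    n_ctx=4096,                           # context window to allocate
    n_gpu_layers=40,                      # layers offloaded to GPU, if available
)

out = llm(
    "Summarize the trade-offs of 4-bit quantization in two sentences.",
    max_tokens=128,
    temperature=0.7,
)
print(out["choices"][0]["text"])
```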
1/11/2024: Mixing Experts vs Merging Models
gpt-4-turbo gpt-4-0613 mixtral deepseekmoe phixtral deepseek-ai hugging-face nous-research teenage-engineering discord mixture-of-experts model-merging fine-tuning rag security discord-tos model-performance prompt-engineering function-calling semantic-analysis data-frameworks ash_prabaker shacrw teknium 0xevil everyoneisgross ldj pramod8481 mgreg_42266 georgejrjrjr kenakafrosty
18 guilds, 277 channels, and 1342 messages were analyzed, saving an estimated 187 minutes of reading time. The community switched to GPT-4 Turbo and discussed the rise of Mixture of Experts (MoE) models like Mixtral, DeepSeekMoE, and Phixtral. Model merging techniques, including naive linear interpolation and "frankenmerges" such as SOLAR and Goliath, are driving new performance gains on open leaderboards. Discussions in the Nous Research AI Discord covered AI playgrounds supporting prompt and RAG parameters, security concerns about third-party cloud usage, debates on Discord bots and the TOS, skepticism about Teenage Engineering's cloud LLM, and performance differences between GPT-4 0613 and GPT-4 Turbo. The community also explored fine-tuning strategies involving DPO, LoRA, and safetensors, integration of RAG with API calls, semantic differences between MoE and dense LLMs, and data frameworks like LlamaIndex and SciPhi-AI's synthesizer. Issues with anomalous characters in fine-tuning were also raised.
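A minimal sketch of the naive linear interpolation merge mentioned above, assuming two checkpoints that share an architecture and parameter names; the model IDs, the 0.5 blend weight, and the output path are placeholders rather than the exact models discussed.

```python
# Minimal sketch: naive linear interpolation of two same-architecture checkpoints.
# Model names and the blend weight are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM

alpha = 0.5  # weight given to model A; (1 - alpha) goes to model B

model_a = AutoModelForCausalLM.from_pretrained("org/model-a", torch_dtype=torch.bfloat16)
model_b = AutoModelForCausalLM.from_pretrained("org/model-b", torch_dtype=torch.bfloat16)

state_b = model_b.state_dict()
merged_state = {
    name: alpha * tensor_a + (1 - alpha) * state_b[name]
    for name, tensor_a in model_a.state_dict().items()
}

model_a.load_state_dict(merged_state)  # reuse model A's config and architecture
model_a.save_pretrained("merged-model", safe_serialization=True)  # write safetensors
```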
1/9/2024: Nous Research lands $5m for Open Source AI
qlora phi-3 mixtral ollama nous-research openai rabbit-tech context-window fine-tuning synthetic-data activation-beacon transformer-architecture seed-financing real-time-voice-agents trillion-parameter-models kenakafrosty _stilic_ teknium
Nous Research announced $5.2 million in seed financing focused on Nous-Forge, aiming to embed transformer architecture into chips for powerful servers supporting real-time voice agents and trillion-parameter models. Rabbit R1 launched a demo at CES with mixed reactions. OpenAI shipped the GPT store and briefly leaked an upcoming personalization feature. A new paper on Activation Beacon proposes a solution to significantly extend LLMs' context window, with code to be released on GitHub. Discussions also covered QLoRA, fine-tuning, synthetic data, and custom architectures for LLMs.
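As a rough illustration of the QLoRA setup referenced above, here is a minimal sketch combining bitsandbytes 4-bit quantization with a PEFT LoRA adapter; the base model name, LoRA rank, and target modules are assumptions chosen for the example.

```python
# Minimal sketch: QLoRA-style setup, i.e. a 4-bit base model with a LoRA adapter.
# The base model, rank, and target modules are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",            # NF4 quantization as in the QLoRA paper
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",          # placeholder base model
    quantization_config=bnb_config,
    device_map="auto",
)

lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections, a common choice
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora)
model.print_trainable_parameters()        # only the LoRA weights are trainable
```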
12/25/2023: Nous Hermes 2 Yi 34B for Christmas
nous-hermes-2 yi-34b nucleusx yayi-2 ferret nous-research apple mixtral deepseek qwen hugging-face wenge-technology quantization model-optimization throughput-metrics batch-processing parallel-decoding tensor-parallelization multimodality language-model-pretraining model-benchmarking teknium carsonpoole casper_ai pradeep1148 osanseviero metaldragon01
Teknium released Nous Hermes 2 on Yi 34B, positioning it as a top open model compared to Mixtral, DeepSeek, and Qwen. Apple introduced Ferret, a new open-source multimodal LLM. Discussions in the Nous Research AI Discord focused on AI model optimization and quantization techniques like AWQ, GPTQ, and AutoAWQ, with insights on proprietary optimization and throughput metrics. Additional highlights include the addition of the NucleusX model to Transformers, a 30B model with 80 MMLU, and the YAYI 2 language model by Wenge Technology, trained on 2.65 trillion tokens. "AutoAWQ outperforms vLLM up to batch size 8" was noted, and proprietary parallel decoding and tensor parallelization across GPUs were discussed as routes to speed improvements.
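To make the AWQ/AutoAWQ mention concrete, here is a minimal quantization sketch following the AutoAWQ library's workflow; the source model, output path, and group size are assumptions rather than settings from the discussion.

```python
# Minimal sketch: quantizing a causal LM to 4-bit AWQ with the AutoAWQ library.
# Model path, output path, and quant_config values are illustrative assumptions.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "mistralai/Mistral-7B-Instruct-v0.2"  # placeholder source model
quant_path = "mistral-7b-instruct-awq"             # placeholder output directory
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Calibrate and quantize the weights, then save the 4-bit checkpoint.
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```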